Releases: deepset-ai/haystack
v2.30.2-rc1
🐛 Bug Fixes
- Fixed the Agent exiting prematurely under the default exit_conditions=["text"]. The agent now only stops when the last message is an assistant message with non-empty text (or when no tool invoker is configured). Previously, if the LLM produced an invalid tool call that was discarded, the resulting assistant message with empty text and no tool calls would trigger an exit, preventing the agent from recovering. The agent now continues looping so the model can recover on the next iteration.
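The exit rule described in this fix can be sketched in a few lines. This is only an illustration under stated assumptions, not Haystack's implementation: the Message class and the should_exit_on_text function are hypothetical stand-ins introduced here.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Message:
    """Hypothetical stand-in for a chat message (not Haystack's ChatMessage)."""

    role: str
    text: Optional[str] = None
    tool_calls: List[str] = field(default_factory=list)


def should_exit_on_text(last: Message, has_tool_invoker: bool = True) -> bool:
    """Return True when the loop may stop under exit_conditions=["text"]."""
    if not has_tool_invoker:
        # Without a tool invoker there is nothing left to loop over.
        return True
    # Only an assistant message that actually carries text ends the run.
    return last.role == "assistant" and bool(last.text)


# A discarded invalid tool call leaves empty text and no tool calls: keep looping.
print(should_exit_on_text(Message(role="assistant", text="")))       # False
print(should_exit_on_text(Message(role="assistant", text="Done.")))  # True
```

The point of the check is that an empty assistant message is treated as "not finished yet" rather than as a final answer.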
v2.30.1
⚡️ Enhancement Notes
- AzureOpenAIChatGenerator now accepts a Secret for the azure_endpoint and api_version parameters in addition to a plain string. This makes it possible to resolve these values from environment variables at runtime, for example with Secret.from_env_var("AZURE_OPENAI_ENDPOINT"), so the same serialized pipeline can switch between environments (e.g. dev and prod) by changing environment variables instead of the pipeline definition.
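Why deferring resolution matters can be shown with the standard library alone. The sketch below is an assumption-laden illustration: EnvVarRef is a hypothetical class that mimics the idea of storing only the variable name, and it is not Haystack's Secret. The example URLs are placeholders.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class EnvVarRef:
    """Hypothetical stand-in: stores the variable name, never the value."""

    name: str

    def resolve(self) -> str:
        # Look the value up at call time, not when the object is created.
        value = os.environ.get(self.name)
        if value is None:
            raise ValueError(f"Environment variable {self.name!r} is not set")
        return value


endpoint = EnvVarRef("AZURE_OPENAI_ENDPOINT")

os.environ["AZURE_OPENAI_ENDPOINT"] = "https://dev.example.invalid"
dev_value = endpoint.resolve()

os.environ["AZURE_OPENAI_ENDPOINT"] = "https://prod.example.invalid"
prod_value = endpoint.resolve()

print(dev_value)
print(prod_value)
```

Because only the variable name would be serialized, the same definition yields a different endpoint in each environment.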
v2.30.1-rc1
v2.30.0
⭐️ Highlights
🐍 Syntax-aware Python code splitting with PythonCodeSplitter
The new PythonCodeSplitter is a syntax-aware splitter for Python source files, built for code-RAG and code-search pipelines where naive line-based splitting tends to cut through functions and lose structural context. It parses sources with the ast module and greedily merges units, such as module docstring, import blocks, top-level functions, class headers, methods, and nested classes, into chunks of roughly max_effective_lines, keeping whole functions and methods together. For functions that exceed oversized_factor * max_effective_lines, it falls back to a line-based secondary split with overlap.
Two options make the resulting chunks more useful downstream: strip_docstrings=True moves docstrings into chunk metadata, and preserve_class_definition=True prepends the enclosing class signature to chunks whose members live in a later chunk. Each chunk also carries rich metadata including start_line, end_line, unit_kinds, include_classes, decorators, docstrings, source_id, and split_id.
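The general idea of parsing with ast and greedily merging units can be illustrated with the standard library. This is a rough sketch, not the component's code: top_level_units and greedy_merge are hypothetical helpers, only top-level statements are considered, and raw line spans (including blank lines) are counted instead of effective lines.

```python
import ast

SOURCE = '''"""Module docstring."""
import os


def first():
    return 1


def second():
    return 2


class Greeter:
    def hello(self):
        return "hi"
'''


def top_level_units(source):
    """Return (start_line, end_line) for each top-level statement."""
    tree = ast.parse(source)
    return [(node.lineno, node.end_lineno) for node in tree.body]


def greedy_merge(units, max_lines):
    """Merge consecutive units into chunks spanning at most max_lines lines.

    A single unit longer than max_lines is kept whole in its own chunk.
    """
    chunks = []
    for start, end in units:
        if chunks and end - chunks[-1][0] + 1 <= max_lines:
            # The unit still fits: extend the current chunk.
            chunks[-1] = (chunks[-1][0], end)
        else:
            chunks.append((start, end))
    return chunks


units = top_level_units(SOURCE)
chunks = greedy_merge(units, max_lines=6)
print(chunks)  # [(1, 6), (9, 10), (13, 15)]
```

Chunk boundaries always fall between statements, so no function or class body is cut in the middle.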
from haystack.components.preprocessors import PythonCodeSplitter

splitter = PythonCodeSplitter(
    max_effective_lines=80,
    strip_docstrings=True,
    preserve_class_definition=True,
)
# doc is a Document whose content is Python source code
result = splitter.run(documents=[doc])

💬 Pass a plain string to any ChatGenerator
All Haystack ChatGenerator components now accept a plain string for the messages parameter in addition to a list of ChatMessage objects. The string is automatically wrapped in a ChatMessage with the user role. This makes switching from a Generator to a ChatGenerator a one-line change. The change applies to AzureOpenAIChatGenerator, AzureOpenAIResponsesChatGenerator, FallbackChatGenerator, HuggingFaceAPIChatGenerator, HuggingFaceLocalChatGenerator, OpenAIChatGenerator, and OpenAIResponsesChatGenerator, and will soon be rolled out to the ChatGenerators in Haystack Core Integrations.
from haystack.components.generators.chat import OpenAIChatGenerator

generator = OpenAIChatGenerator()

# passing a string is equivalent to passing [ChatMessage.from_user("...")]
response = generator.run("What's Natural Language Processing?")
print(response["replies"][0].text)

⬆️ Upgrade Notes
- DALLEImageGenerator has been updated to account for OpenAI's retirement of the DALL-E models. The default model is now gpt-image-2 (previously dall-e-3). To migrate:
  - Update the model value: besides gpt-image-2, gpt-image-1 and gpt-image-1-mini are also supported.
  - Update the quality value: the new accepted values are auto, high, medium, or low (previously standard or hd).
  - Update the size value: the new accepted values are 1024x1024, 1024x1536, 1536x1024, or auto. gpt-image-2 also supports arbitrary sizes.
  - The response_format parameter is now ignored. The component always returns base64-encoded JSON.
🚀 New Features
- Introduced the PythonCodeSplitter component, a syntax-aware splitter for Python source files:
  - Parses sources with the ast module and merges units (module docstring, import blocks, top-level functions, class headers, methods, nested classes, and remaining statements) greedily into chunks of roughly max_effective_lines.
  - Keeps whole functions and methods together; falls back to a line-based secondary split (using DocumentSplitter) with overlap only for functions whose effective length exceeds oversized_factor * max_effective_lines.
  - Optionally strips docstrings into chunk metadata via strip_docstrings=True, and prepends the enclosing class signature to chunks whose members live in a later chunk via preserve_class_definition=True.
  - Emits per-chunk metadata including start_line, end_line, unit_kinds, include_classes, decorators, docstrings, source_id, and split_id.
- All Haystack ChatGenerator components now also accept a plain string for the messages parameter in addition to a list of ChatMessage objects. The string is automatically converted into a list containing a ChatMessage with the user role. This simplifies switching from Generators to ChatGenerators; Generators might be removed in Haystack 3.0.

  This applies to AzureOpenAIChatGenerator, AzureOpenAIResponsesChatGenerator, FallbackChatGenerator, HuggingFaceAPIChatGenerator, HuggingFaceLocalChatGenerator, OpenAIChatGenerator, and OpenAIResponsesChatGenerator. The same change will soon be applied to the ChatGenerators available in Haystack Core Integrations.

  Example:

      from haystack.components.generators.chat import OpenAIChatGenerator

      generator = OpenAIChatGenerator()

      # passing a string is equivalent to passing [ChatMessage.from_user("...")]
      response = generator.run("What's Natural Language Processing?")
      print(response["replies"][0].text)
⚡️ Enhancement Notes
- Added run_async to TextEmbeddingRetriever, MultiQueryEmbeddingRetriever, and MultiQueryTextRetriever. These components now execute natively as coroutines in AsyncPipeline, delegating to each wrapped component's run_async when available and falling back to a thread executor otherwise.
- Fix grammar in the AzureOpenAIGenerator and AzureOpenAIChatGenerator docstring code examples ("<this a model name..." → "<this is a model name...") so that copy-pasted snippets read correctly.
- Update ToolsType to improve type checking for the tools parameter. Any class that inherits from either Tool or Toolset is now accepted in any sequence (list, tuple, etc).
- Pipeline.draw() and Pipeline.show() now validate the Mermaid server response before writing it to disk. The response body is checked against the expected output format (PNG, JPEG, WebP, SVG, or PDF) via its magic-byte signature, and the Content-Type header is checked as well. If the response is empty or does not match the requested format, a PipelineDrawingError is raised and no file is written. This prevents a misconfigured or untrusted server_url from causing arbitrary content (for example an HTML error page) to be saved verbatim to the output path.
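A magic-byte check of the kind described for Pipeline.draw() can be sketched as follows. This is an illustrative approximation, not Haystack's code: matches_format and SIGNATURES are hypothetical names, and the SVG branch is a simple heuristic assumed here because SVG is text and has no fixed binary signature.

```python
SIGNATURES = {
    "png": b"\x89PNG\r\n\x1a\n",
    "jpeg": b"\xff\xd8\xff",
    "pdf": b"%PDF-",
}


def matches_format(body: bytes, fmt: str) -> bool:
    """Check whether body starts with the signature expected for fmt."""
    if not body:
        return False
    if fmt == "webp":
        # WebP is a RIFF container: "RIFF", 4 size bytes, then "WEBP".
        return body[:4] == b"RIFF" and body[8:12] == b"WEBP"
    if fmt == "svg":
        # Heuristic (assumption): look for an <svg tag near the start.
        return b"<svg" in body[:1024].lower()
    return body.startswith(SIGNATURES[fmt])


html_error_page = b"<!DOCTYPE html><html><body>502 Bad Gateway</body></html>"
print(matches_format(b"\x89PNG\r\n\x1a\n" + b"\x00" * 16, "png"))  # True
print(matches_format(html_error_page, "png"))                      # False
```

A caller would raise an error and skip writing the file whenever the check returns False.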
🐛 Bug Fixes
- Prevent Document.from_dict() from mutating the input dictionary during deserialization.
- Prevent DocumentLanguageClassifier from crashing when Document.content=None by marking such documents as unmatched and logging a warning.
- Fixed a bug where Agent would not exit when the model emitted multiple tool calls in a single turn and the configured exit-condition tool was not the first one in the list. Previously, only the first tool call in each assistant message was checked against exit_conditions, so a reply like [search, finish] (with exit_conditions=["finish"]) would silently fail to stop the loop and keep iterating until max_agent_steps was reached. Since parallel tool calls are now the norm for frontier models, this could quietly turn a single successful turn into dozens of wasted LLM calls. The Agent now inspects every tool call in the message, so the exit condition is honored regardless of ordering.
- Fix AnswerBuilder.run() mutating the meta dict of input Document objects. source_index (and referenced when reference_pattern is set) are now only added to the document copies inside GeneratedAnswer.documents, not to the originals.
- Fixed DocumentJoiner in concatenate mode so that documents with a score of exactly 0.0 are no longer treated as unscored during deduplication. Previously a truthiness check coerced score=0.0 to -inf, which could cause a worse, negatively-scored duplicate to be kept instead of the 0.0-scored document. The merge mode was updated to the same explicit is not None check for consistency; its observable behavior is unchanged.
- Fixed in-place mutation of ExtractedAnswer.meta in ExtractiveReader._add_answer_page_number when the answer's meta was None. Now uses dataclasses.replace to avoid triggering the dataclass mutation warning.
- Fixed ExtractiveReader raising ValueError when the number of valid answer spans for a sequence was smaller than answers_per_seq (for example with short documents or when answers_per_seq exceeded the number of upper-triangular, non-masked (start, end) token pairs). _postprocess now filters the per-sequence probabilities by the same validity mask it already applied to the start/end token indices, so the three structures always have matching lengths.
- HierarchicalDocumentSplitter no longer mutates the metadata of the input Document. _add_meta_data now returns a new Document with a copied meta dict via dataclasses.replace instead of writing __block_size, __parent_id, __children_ids and __level onto the caller's Document.
- Fixed a bug in LLMMetadataExtractor.run_async where the asyncio.Semaphore intended to bound concurrent LLM calls to max_workers was acquired once around the outer gather(...) call instead of inside each task. As a result, max_workers had no effect in run_async and all LLM requests for a batch were issued simultaneously. The semaphore is now acquired per task, so max_workers correctly caps in-flight requests.
- expand_page_range() previously raised an unhelpful ValueError: too many values to unpack when a page range string contained more than one hyphen (e.g. "10-20-30"). The parser now validates the format and raises a clear ValueError with an explanatory message for invalid inputs.
- LLMMetadataExtractor now raises a clear ValueError when the prompt contains no template variables. Previously this case raised an unhelpful IndexError: list index out of range. The error message now consistently expl...
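The semaphore fix in LLMMetadataExtractor.run_async comes down to where the semaphore is acquired. The sketch below uses only asyncio and is not the component's code: run_bounded and fake_llm_call are hypothetical names, and the sleep stands in for a network request.

```python
import asyncio


async def run_bounded(n_tasks: int, max_workers: int) -> int:
    """Run n_tasks fake LLM calls and return the peak number in flight."""
    semaphore = asyncio.Semaphore(max_workers)
    in_flight = 0
    peak = 0

    async def fake_llm_call() -> None:
        nonlocal in_flight, peak
        # Acquire inside each task so the limit applies per request.
        async with semaphore:
            in_flight += 1
            peak = max(peak, in_flight)
            await asyncio.sleep(0.01)  # stands in for the network round trip
            in_flight -= 1

    await asyncio.gather(*(fake_llm_call() for _ in range(n_tasks)))
    return peak


peak = asyncio.run(run_bounded(n_tasks=10, max_workers=3))
print(peak)  # 3
```

Wrapping the whole gather(...) call in a single `async with semaphore:` would take one slot for the entire batch and let all ten calls start at once, which is the behavior the fix removes.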
v2.30.0-rc1
v2.29.0
⭐️ Highlights
🔍 Combine Retrievers with MultiRetriever and TextEmbeddingRetriever
Two new retriever components make it easier to build hybrid search pipelines. MultiRetriever runs multiple text retrievers in parallel and merges their results into a single deduplicated list, ranked by reciprocal rank fusion by default. You can selectively enable or disable individual retrievers at runtime using the active_retrievers parameter. This is useful when you want to skip the embedding retriever for short or keyword-only queries, for example.
TextEmbeddingRetriever wraps an embedding-based retriever together with a text embedder into a single component, making it compatible with MultiRetriever by implementing the TextRetriever protocol. Here's how to combine BM25 and embedding retrieval in a single component:
from haystack.components.retrievers import MultiRetriever, TextEmbeddingRetriever
from haystack.components.retrievers.in_memory import InMemoryBM25Retriever, InMemoryEmbeddingRetriever
from haystack.components.embedders import SentenceTransformersTextEmbedder
from haystack.document_stores.in_memory import InMemoryDocumentStore

# An empty in-memory store, added here only so the snippet is self-contained
doc_store = InMemoryDocumentStore()

retriever = MultiRetriever(
    retrievers={
        "bm25": InMemoryBM25Retriever(document_store=doc_store),
        "embedding": TextEmbeddingRetriever(
            retriever=InMemoryEmbeddingRetriever(document_store=doc_store),
            text_embedder=SentenceTransformersTextEmbedder(model="sentence-transformers/all-MiniLM-L6-v2"),
        ),
    },
    top_k=3,
)

# Run all retrievers
result = retriever.run(query="green energy sources")

# Run only the BM25 retriever
result = retriever.run(query="green energy sources", active_retrievers=["bm25"])

⬆️ Upgrade Notes
- LLM.run and LLM.run_async no longer accept messages and streaming_callback as positional arguments; they must now be passed as keyword arguments. Update any direct calls accordingly:

      # Before
      llm.run([message], my_callback)

      # After
      llm.run(messages=[message], streaming_callback=my_callback)
🚀 New Features
- Add run_async to CacheChecker, enabling it to be used in AsyncPipeline without blocking the event loop.
⚡️ Enhancement Notes
- Document the input ordering behavior of auto-promoted lazy variadic sockets in Pipeline.connect(). When multiple senders are connected to the same list-typed receiver socket, ordering depends on the pipeline class. With Pipeline, items are ordered alphabetically by sender component name (because Pipeline.run() schedules components in alphabetical order for deterministic execution), not by the order of connect() calls. With AsyncPipeline, no ordering is guaranteed, since components in different branches may run in parallel. The docstrings now point users to a dedicated joiner component when they need explicit ordering.
- Add join_mode parameter to the experimental MultiRetriever component, supporting "reciprocal_rank_fusion" (default) and "concatenate". Reciprocal Rank Fusion merges the ranked result lists from all retrievers into a single deduplicated list ordered by RRF score. The underlying RRF logic is extracted into a shared utility _reciprocal_rank_fusion in haystack.utils.misc, which is now also used by DocumentJoiner.
- LLM now supports two usage modes:
  - Template-variable mode: provide a user_prompt with Jinja2 variables (e.g. {{ query }}). Those variables become pipeline inputs and messages is optional. The rendered user_prompt is always appended after any messages provided at runtime.
  - Pass-through mode: omit user_prompt or provide one with no template variables. messages becomes a required input, allowing a fully-constructed list of ChatMessages to be passed from upstream.
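For readers unfamiliar with the reciprocal rank fusion mentioned for join_mode, here is a textbook-style sketch. It is not Haystack's _reciprocal_rank_fusion utility: the function below is a hypothetical illustration working on plain document ids, and the constant k=60 is a commonly used value assumed here; the library's constants and any weighting may differ.

```python
from collections import defaultdict


def reciprocal_rank_fusion(ranked_lists, k=60):
    """Fuse ranked lists of document ids into one deduplicated ranking.

    Each id scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1.
    """
    scores = defaultdict(float)
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by score (descending); break ties by id for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


bm25 = ["a", "b", "c"]
embedding = ["b", "d", "a"]
fused = reciprocal_rank_fusion([bm25, embedding])
print(fused)  # ['b', 'a', 'd', 'c']
```

Documents returned by both retrievers ("a" and "b") rise to the top, while each id appears only once in the fused list.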
🐛 Bug Fixes
- Fixed a bug in NamedEntityExtractor where the spaCy/Thinc device state was not correctly restored after execution, potentially affecting the device configuration of other spaCy components in the same process.
- Preserve resumable snapshots when some inputs or outputs are non-serializable. Haystack now omits only the failing top-level fields (for example non-serializable callbacks or runtime objects) instead of replacing the whole payload with an empty dictionary. This applies both to agent sub-component inputs (chat_generator and tool_invoker) and to pipeline-level inputs, original_input_data, and pipeline_outputs captured by _create_pipeline_snapshot. When every field fails to serialize, the snapshot still stores a structurally valid empty payload ({"serialization_schema": {"type": "object", "properties": {}}, "serialized_data": {}}) so that resuming the snapshot does not raise DeserializationError, for example when resuming from a ToolBreakpoint where the sub-component's inputs are not strictly required.
- Fixed tools_strict=True in OpenAIChatGenerator to recursively apply additionalProperties: false and required to all nested objects in tool parameter schemas. Previously only the top-level object was transformed, causing OpenAI's strict mode to reject tools with nested parameters.
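The recursive transformation described in the tools_strict fix can be pictured with a small schema walker. This is a sketch under assumptions, not the OpenAIChatGenerator code: make_strict is a hypothetical helper, and it treats every dict with type "object" and a properties key as an object schema.

```python
import copy


def make_strict(schema: dict) -> dict:
    """Return a copy of schema with strict-mode keys on every object level."""
    schema = copy.deepcopy(schema)

    def visit(node):
        if isinstance(node, dict):
            if node.get("type") == "object" and "properties" in node:
                node["additionalProperties"] = False
                node["required"] = list(node["properties"])
            # Descend into properties, array items, and any other nesting.
            for value in node.values():
                visit(value)
        elif isinstance(node, list):
            for item in node:
                visit(item)

    visit(schema)
    return schema


params = {
    "type": "object",
    "properties": {
        "filters": {"type": "object", "properties": {"year": {"type": "integer"}}},
        "tags": {
            "type": "array",
            "items": {"type": "object", "properties": {"name": {"type": "string"}}},
        },
    },
}
strict = make_strict(params)
print(strict["properties"]["filters"]["required"])  # ['year']
```

Transforming only the top level would leave "filters" and the array items without additionalProperties and required, which is the situation the fix addresses.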
💙 Big thank you to everyone who contributed to this release!
@Aftabbs, @albertodiazdurana, @anakin87, @ArkaD171717, @bilgeyucel, @bogdankostic, @davidsbatista, @FuturMix, @julian-risch, @kacperlukawski, @ritikraj2425, @saivedant169, @shaun0927, @sjrl, @SyedShahmeerAli12
v2.29.0-rc2
v2.29.0-rc1
v2.28.0
Upgrade Notes
- As part of the migration from requests to httpx, request_with_retry and async_request_with_retry (in haystack.utils.requests_utils) no longer raise requests.exceptions.RequestException on failure; they now raise httpx.HTTPError instead. This also affects HuggingFaceTEIRanker, which relies on these utilities. Users catching requests.exceptions.RequestException should update their code to catch httpx.HTTPError.
- The LLM component now requires user_prompt to be provided at initialization and it must contain at least one Jinja2 template variable (e.g. {{ variable_name }}). This ensures the component always exposes at least one required input socket, which is necessary for correct pipeline scheduling. required_variables now defaults to "*" (all variables in user_prompt are required), and passing an empty list raises a ValueError.

  If you are affected: update any code that instantiates LLM without a user_prompt, or with a user_prompt that has no template variables, to include at least one variable.

  Before:

      llm = LLM(chat_generator=OpenAIChatGenerator(), system_prompt="You are helpful.")

  After:

      llm = LLM(
          chat_generator=OpenAIChatGenerator(),
          system_prompt="You are helpful.",
          user_prompt='{% message role="user" %}{{ query }}{% endmessage %}',
      )

- Agent.run() and Agent.run_async() now require messages as an explicit argument (no longer optional). If you were relying on the default None value in Haystack version 2.26 or 2.27, pass an empty list instead:

      agent.run(messages=[], ...)

  LLM.run() and LLM.run_async() are unaffected; they still accept None and default to an empty list internally.
New Features
- Tools and components can now declare a State (or State | None) parameter in their signature to receive the live agent State object at invocation time, with no extra wiring needed.

  For function-based tools created with @tool or create_tool_from_function, add a state parameter annotated as State:

      from haystack.components.agents import State
      from haystack.tools import tool

      @tool
      def my_tool(query: str, state: State) -> str:
          """Search using context from agent state."""
          history = state.get("history")
          ...

  For component-based tools created with ComponentTool, declare a State input socket on the component's run method:

      from haystack import component
      from haystack.components.agents import State
      from haystack.tools import ComponentTool

      @component
      class MyComponent:
          @component.output_types(result=str)
          def run(self, query: str, state: State) -> dict:
              history = state.get("history")
              ...

      tool = ComponentTool(component=MyComponent())

  In both cases ToolInvoker automatically injects the runtime State object before calling the tool, and State/Optional[State] parameters are excluded from the LLM-facing schema so the model is not asked to supply them.

  This is an alternative to the existing inputs_from_state and outputs_to_state options on Tool and ComponentTool, which map individual state keys to specific tool parameters and outputs declaratively. Injecting the full State object is more flexible and useful when a tool needs to read from or write to multiple keys, but it couples the tool implementation directly to State.
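One way to picture signature-based injection is with inspect from the standard library. Everything in this sketch is hypothetical and only mimics the described behavior: the State class is a minimal stand-in, the helper functions are invented for illustration, and only a plain State annotation is detected (not State | None). It makes no claim about how ToolInvoker is actually implemented.

```python
import inspect


class State:
    """Hypothetical stand-in for the agent state (not Haystack's State)."""

    def __init__(self, data=None):
        self._data = dict(data or {})

    def get(self, key, default=None):
        return self._data.get(key, default)


def state_param_names(func):
    """Names of parameters annotated exactly as State."""
    params = inspect.signature(func).parameters
    return [name for name, p in params.items() if p.annotation is State]


def llm_facing_params(func):
    """Parameter names the model would be asked to fill in."""
    hidden = set(state_param_names(func))
    return [n for n in inspect.signature(func).parameters if n not in hidden]


def invoke(func, llm_args, state):
    """Call func with the model's arguments plus the injected state."""
    injected = {name: state for name in state_param_names(func)}
    return func(**llm_args, **injected)


def my_tool(query: str, state: State) -> str:
    history = state.get("history", [])
    return f"{query} ({len(history)} previous messages)"


result = invoke(my_tool, {"query": "weather"}, State({"history": ["hi", "hello"]}))
print(llm_facing_params(my_tool))  # ['query']
print(result)                      # weather (2 previous messages)
```

The model only ever sees the query parameter, while the tool body still gets the full state object.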
Enhancement Notes
- Clarify in the Markdown-producing converter documentation that DocumentCleaner with its default settings can flatten Markdown output, and update the example pipelines for PaddleOCRVLDocumentConverter, MistralOCRDocumentConverter, AzureDocumentIntelligenceConverter, and MarkItDownConverter to avoid routing Markdown content through the default cleaner configuration.
- Made _create_agent_snapshot robust towards serialization errors. If serializing agent component inputs fails, a warning is logged and an empty dictionary is used as a fallback, preventing the serialization error from masking the real pipeline runtime error.
- Standardize HTTP request handling in Haystack by adopting httpx for both synchronous and asynchronous requests, replacing requests. Error reporting for failed requests has also been improved: exceptions now include additional details alongside the reason field.
- Add run_async method to LLMMetadataExtractor. ChatGenerator requests now run concurrently using the existing max_workers init parameter.
- MarkdownHeaderSplitter now accepts a header_split_levels parameter (list of integers 1–6, default all levels) to control which header depths create split boundaries. For example, header_split_levels=[1, 2] splits only on # and ## headers, merging content under deeper headers into the preceding chunk.
- MarkdownHeaderSplitter now ignores # lines that appear inside fenced code blocks (triple-backtick or triple-tilde), preventing Python comments and other hash-prefixed lines in code from being misidentified as Markdown headers.
- Expand the PaddleOCRVLDocumentConverter documentation with more detailed guidance on advanced parameters, common usage scenarios, and a more realistic configuration example for layout-heavy documents.
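The two MarkdownHeaderSplitter behaviors noted in this list (level filtering and skipping fenced code) can be approximated with a short scanner. This is a simplified sketch, not the component's code: find_headers is a hypothetical function that only locates split boundaries and does no chunking, and its fence handling is deliberately minimal.

```python
import re

HEADER_RE = re.compile(r"^(#{1,6})\s+(.*\S)\s*$")
# Three backticks or three tildes at the start of a line open or close a fence.
FENCE_RE = re.compile(r"^\s*(`{3}|~{3})")


def find_headers(text, header_split_levels=(1, 2, 3, 4, 5, 6)):
    """Return (line_number, level, title) for headers outside code fences."""
    headers = []
    open_fence = None  # the marker that opened the current code block
    for number, line in enumerate(text.splitlines(), start=1):
        fence = FENCE_RE.match(line)
        if fence:
            marker = fence.group(1)
            if open_fence is None:
                open_fence = marker
            elif marker == open_fence:
                open_fence = None
            continue
        if open_fence is not None:
            continue  # inside a code block: "#" is a comment, not a header
        match = HEADER_RE.match(line)
        if match and len(match.group(1)) in header_split_levels:
            headers.append((number, len(match.group(1)), match.group(2)))
    return headers


SAMPLE = "\n".join(
    ["# Title", "~~~python", "# not a header", "~~~", "## Section", "### Detail"]
)
print(find_headers(SAMPLE, header_split_levels=[1, 2]))
# [(1, 1, 'Title'), (5, 2, 'Section')]
```

The comment on line 3 is skipped because it sits inside a fence, and the level-3 header is skipped because 3 is not in header_split_levels.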
Bug Fixes
- Fix ToolInvoker._merge_tool_outputs silently appending None to list-typed state when a tool's outputs_to_state source key is absent from the tool result. This is a common scenario with PipelineTool wrapping a pipeline that has conditional branches where not all outputs are always produced even if defined in outputs_to_state. The mapping is now skipped entirely when the source key is not present in the result dict.
- When using the MarkdownHeaderSplitter, a child header in the split chunks could lose its direct parent header in the metadata. Previously, if one executed the code below:

      from haystack.components.preprocessors import MarkdownHeaderSplitter
      from haystack import Document

      text = """
      # header 1
      intro text

      ## header 1.1
      text 1

      ## header 1.2
      text 2

      ### header 1.2.1
      text 3

      ### header 1.2.2
      text 4
      """

      document = Document(content=text)

      splitter = MarkdownHeaderSplitter(
          keep_headers=True,
          secondary_split="word"
      )

      result = splitter.run(documents=[document])["documents"]

      for doc in result:
          print(f"Header: {doc.meta['header']}, parent headers: {doc.meta['parent_headers']}")

  We would have expected this output:

      Header: header 1, parent headers: []
      Header: header 1.1, parent headers: ['header 1']
      Header: header 1.2, parent headers: ['header 1']
      Header: header 1.2.1, parent headers: ['header 1', 'header 1.2']
      Header: header 1.2.2, parent headers: ['header 1', 'header 1.2']

  But instead we actually got:

      Header: header 1, parent headers: []
      Header: header 1.1, parent headers: []
      Header: header 1.2, parent headers: ['header 1']
      Header: header 1.2.1, parent headers: ['header 1']
      Header: header 1.2.2, parent headers: ['header 1', 'header 1.2']

  The error happened when a parent header had its own content chunk before the first child header. This has been fixed so that, even when a parent header has its own content chunk before the first child header, all content is preserved.
- Reverts the change that made Agent messages optional as it caused issues with pipeline execution. As a consequence, the LLM component now defaults to an empty messages list unless provided at runtime.
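The expected parent-header output shown for the MarkdownHeaderSplitter fix in this list follows from simple stack bookkeeping. The sketch below is an independent illustration, not the splitter's implementation: parent_headers is a hypothetical function operating on (level, title) pairs in document order.

```python
def parent_headers(headers):
    """Map each header title to the titles of its ancestors.

    headers is a list of (level, title) pairs in document order.
    """
    stack = []  # (level, title) of the currently open ancestors
    result = []
    for level, title in headers:
        # Close every header at the same or a deeper level.
        while stack and stack[-1][0] >= level:
            stack.pop()
        result.append((title, [t for _, t in stack]))
        stack.append((level, title))
    return result


headers = [
    (1, "header 1"),
    (2, "header 1.1"),
    (2, "header 1.2"),
    (3, "header 1.2.1"),
    (3, "header 1.2.2"),
]
for title, parents in parent_headers(headers):
    print(f"Header: {title}, parent headers: {parents}")
```

A header is pushed onto the stack as soon as it is seen, regardless of whether it has its own content before its first child, so the first child always sees its direct parent.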
💙 Big thank you to everyone who contributed to this release!
@Aftabbs, @Amanbig, @anakin87, @bilgeyucel, @bogdankostic, @davidsbatista, @dina-deifallah, @jimmyzhuu, @julian-risch, @kacperlukawski, @maxdswain, @MechaCritter, @ritikraj2425, @sarahkiener, @sjrl, @soheinze, @srini047, @tholor
v2.28.0-rc2