Haystack 2.0: RAG as a Tool, Multi-Tool Agents, and the Full Component Catalog
In the first part we covered Haystack 2.0’s core ideas: components, pipelines, SuperComponents, and how hybrid retrieval works. Now let’s look at what happens when you hand these pipelines to an AI agent, plus the full catalog of component types Haystack gives you out of the box.
From Pipeline Builder to Capability Provider
Here’s the thing Funderburk keeps coming back to: Haystack 2.0 changes the developer’s role. You stop being a “pipeline builder” who wires everything for humans. You become a “capability provider” for agents.
What does that mean in practice? You build solid, well-tested Haystack pipelines. Then you wrap them as tools with a name and a description. The agent reads the description, decides when to use the tool, and calls it. You’re not programming every decision path. You’re providing reliable building blocks and letting the LLM figure out when to use each one.
RAG as a Tool
Remember the hybrid RAG pipeline from Part 1? The one with parallel sparse and dense retrieval, fusion, reranking, and generation? You can wrap that entire thing into a single tool:
```python
from haystack import SuperComponent
from haystack.tools import ComponentTool

# One way to expose a whole pipeline as a tool: wrap it in a
# SuperComponent, then hand that to ComponentTool. `hybrid_rag_pipeline`
# is the pipeline assembled in Part 1.
tool = ComponentTool(
    component=SuperComponent(pipeline=hybrid_rag_pipeline),
    name="internal_knowledge_search",
    description=(
        "Search our internal database of company documents. "
        "Best for products, policies, and historical project data. "
        "Input: a clear, specific question."
    ),
)
```
That multi-step process of parallel retrieval, fusion, and reranking is now a single callable capability. The agent doesn’t need to know how it works inside. It just knows what it does and when to use it.
Multi-Tool Agents
The real power shows up when agents have multiple tools to choose from. Funderburk walks through a nice example. Say you have two tools:
- internal_knowledge_search for company documents
- web_search for live internet searches
Now a user asks: “Summarize our internal Q3 performance report and compare our main product’s features against our top competitor’s latest release.”
The agent’s reasoning loop goes something like:
- Think: This has two parts. Internal report first. The internal_knowledge_search tool is best for that.
- Act: Call internal_knowledge_search with “Q3 performance report”
- Observe: Get the report summary back
- Think: Now I need competitor info. That’s external and recent. The web_search tool handles that.
- Act: Call web_search with “competitor X latest product release features”
- Observe: Get web results
- Think: I have both pieces now. Time to synthesize.
- Answer: Generate a comparison combining internal and external data
Each tool call hits a validated, type-safe Haystack pipeline. The agent handles the reasoning. The pipelines handle the reliable execution.
The Trade-off
Funderburk is honest about the drawback. Haystack’s agent component encapsulates the reasoning loop internally. It’s kind of a black box. If you need fine-grained control over state, memory curation, or complex multi-step orchestration, frameworks like LangGraph give you more visibility. That’s actually the whole point of the book’s hybrid architecture: use Haystack for the tool layer, LangGraph for the orchestration layer.
The Full Component Catalog
The second half of Chapter 3 is basically a guided tour of everything Haystack gives you. Funderburk organizes them into categories.
Data Classes
Before components even run, you need to understand what flows between them. Haystack defines a few core data types:
- Document: the fundamental unit. Holds text content or binary data, plus metadata, embeddings, and relevance scores
- ByteStream: raw binary data like images or PDFs before conversion to text
- ChatMessage: structured chat messages with roles (user, assistant, system, tool) and support for images and tool calls
- StreamingChunk: for real-time streaming responses, one token at a time
- Answer: comes in two flavors: GeneratedAnswer from an LLM, and ExtractedAnswer pulled directly from a source document
Document Stores
This is where your documents live for retrieval. Haystack separates the storage layer from the access layer. You configure a DocumentStore outside the pipeline, then use retriever components inside the pipeline to fetch from it.
Options include in-memory stores for prototyping, vector databases like Pinecone, Weaviate, Qdrant, Milvus, and Chroma, traditional search engines like Elasticsearch and OpenSearch, and other databases like MongoDB Atlas, AstraDB (Cassandra), and Neo4j for graph data.
Components for Data Preprocessing
These handle the ingestion phase. File converters for PDFs, HTML, Markdown, and more. A web component (LinkContentFetcher) for grabbing content from URLs. Preprocessing components for cleaning text, normalizing whitespace, removing headers and footers, and splitting documents into chunks. Audio components for transcription.
Components for Data Embedding
Embedders turn text into vectors. Haystack provides separate components for embedding documents (indexing time) and queries (search time), because some models work differently for each task. Supported providers include OpenAI, Sentence Transformers from Hugging Face, Cohere, and others. After embedding, DocumentWriter saves everything to the document store.
Components for Data Retrieval
Retrievers fetch relevant documents based on a query. They support both embedding-based search and keyword-based BM25 search. Rankers then reorder the results for better accuracy. These are the heart of any RAG pipeline.
Components for LLM Generation
Generators handle API calls to LLM providers: OpenAI, Hugging Face TGI, Anthropic, and others. Builders help you craft prompts. PromptBuilder uses templates with variable substitution. ChatPromptBuilder does the same for chat applications. AnswerBuilder parses structured answers from raw LLM output.
Components for Routing
Routers send data down different paths based on conditions. Joiners merge data from parallel branches. These are the “plumbing” that makes non-linear pipelines possible. Without them, you can’t do things like hybrid retrieval.
Agentic Components
The Agent component is the core reasoning engine. Give it tools and a query, and it orchestrates everything. ToolInvoker executes the tool calls that the agent decides on.
Getting Started: A Practical Roadmap
Funderburk ends the chapter with practical advice for adopting Haystack. She recommends an incremental approach:
- Build the indexing pipeline first: get your data ingested, cleaned, embedded, and stored
- Build a simple semantic RAG pipeline: embed the query, retrieve by vector similarity, generate an answer. This is your baseline
- Upgrade to hybrid RAG: add a parallel BM25 branch, join results, add a reranker
- Evaluate: use a framework like Ragas to compare your baseline semantic pipeline against the hybrid one with real test queries
- Wrap as agent tools: package your pipelines as tools, validate locally with a Haystack agent, then deploy as REST microservices via Hayhooks for production
She also covers common issues: data ingestion errors from bad file paths or chunk sizes, PipelineConnectError from mismatched socket types (use .draw() to debug), performance bottlenecks (call warm_up() on heavy components before serving), and unexpected agent behavior (inspect the reasoning trace and refine tool descriptions).
My Take
This chapter gives you a solid mental model of Haystack 2.0 without drowning you in code. The hierarchy of Component to Pipeline to SuperComponent to Tool to Agent is clean and makes sense. The honest discussion of trade-offs with LangGraph is refreshing. And the practical roadmap at the end is exactly what someone new to the framework needs.
The next chapter is where we actually build these pipelines with real code. That’s where it gets fun.
This is post 8 of 24 in the Building Natural Language and LLM Pipelines series.