Building the Yelp Navigator: Multi-Agent Orchestration With LangGraph, Haystack Microservices, and Supervisor Approval
This is where everything from Chapter 8 comes together. We’ve built NER pipelines, text classification tools, and sentiment analyzers. Now Funderburk wires them into a multi-agent graph that can handle complex queries end to end.
Building the Graph: Nodes and Edges
The Yelp Navigator uses LangGraph’s StateGraph to define the workflow. You add nodes (functions that process and modify state) and connect them with edges (which determine the flow).
Four types of nodes make up the system:
Clarification node. Takes the user’s raw query, figures out what they actually want, extracts location and business type, and determines the detail level. Think of it as the receptionist who asks the right questions before sending you to the right department.
Worker nodes. Three specialists, each with exactly one tool. The search node calls the business_search Hayhooks endpoint. The details node calls business_details. The sentiment node calls business_sentiment. No worker can access tools outside its lane.
Summary node. Collects all the data from worker nodes and generates a comprehensive report with business names, phone numbers, websites, review analysis. This is the writer.
Supervisor approval node. The quality checker. It doesn’t use tools. It reads the final summary, checks whether it actually answers the user’s question, and decides: approve and finish, or send it back for revision.
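All four node types read from and write to a shared state object. The chapter doesn’t reproduce the full schema here, but a minimal sketch of what an `AgentState` for this graph could look like (the field names are assumptions, not the book’s exact definitions) helps make the node descriptions concrete:

```python
from typing import Optional, TypedDict


class AgentState(TypedDict, total=False):
    """Shared state passed between every node in the graph (hypothetical schema)."""
    user_query: str            # raw text from the user
    query: Optional[str]       # extracted business type, e.g. "coffee shops"
    location: Optional[str]    # extracted location, e.g. "Portland"
    detail_level: str          # "general" or "reviews"
    search_results: list       # output of the business_search worker
    details: list              # output of the business_details worker
    sentiment: list            # output of the business_sentiment worker
    summary: str               # final report text
    revision_count: int        # how many times the supervisor sent it back
```

Each node receives this state, adds its own output fields, and passes it along; the routing functions shown below only ever need to inspect it.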
Here’s how the graph gets built:
```python
from langgraph.graph import StateGraph, START, END

workflow = StateGraph(AgentState)

# Add all nodes
workflow.add_node("clarification", clarification_wrapper)
workflow.add_node("search", search_wrapper)
workflow.add_node("details", details_wrapper)
workflow.add_node("sentiment", sentiment_wrapper)
workflow.add_node("summary", summary_wrapper)
workflow.add_node("supervisor_approval", supervisor_approval_wrapper)
```
Conditional Edges: Where the Magic Happens
The interesting part isn’t the nodes. It’s the edges. LangGraph supports conditional edges, where the next node gets chosen dynamically based on the current state.
The graph starts at clarification. Once clarification has enough info, it moves to search:
```python
workflow.add_edge(START, "clarification")

workflow.add_conditional_edges(
    "clarification",
    route_after_clarification,
    {"clarification": "clarification", "search": "search"},
)
```
Notice that the clarification node can loop back to itself. If the query is too vague, it keeps asking until it has what it needs.
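A routing function in LangGraph is just a plain function that reads the state and returns the name of the next node. A minimal sketch of what `route_after_clarification` might look like (the exact completeness check is an assumption, not the book’s code):

```python
def route_after_clarification(state: dict) -> str:
    """Loop back to clarification until both query and location are known.

    Hypothetical routing logic: we assume the clarification node fills in
    "query" and "location" once it has extracted them from the user.
    """
    if state.get("query") and state.get("location"):
        return "search"
    return "clarification"
```

The returned string is matched against the mapping passed to `add_conditional_edges`, which is what makes the self-loop possible.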
After search, routing depends on the detail level. Need more info? Go to details. Just want a quick answer? Skip to summary:
```python
workflow.add_conditional_edges(
    "search",
    route_after_search,
    {"details": "details", "summary": "summary"},
)
```
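The decision itself can be a one-liner on the state’s detail level. A sketch of `route_after_search`, assuming the convention from the walkthroughs below that `detail_level` is either “general” or “reviews”:

```python
def route_after_search(state: dict) -> str:
    """Skip straight to summary unless the user asked for review-level detail (sketch)."""
    return "details" if state.get("detail_level") == "reviews" else "summary"
```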
After details, the same pattern: need review analysis? Go to sentiment. Otherwise, summary:
```python
workflow.add_conditional_edges(
    "details",
    route_after_details,
    {"sentiment": "sentiment", "summary": "summary"},
)
```
Sentiment always flows to summary. Summary always flows to supervisor approval:
```python
workflow.add_edge("sentiment", "summary")
workflow.add_edge("summary", "supervisor_approval")
```
And the supervisor? It can route anywhere. Back to search, details, sentiment, summary, or to END if the report passes quality review:
```python
workflow.add_conditional_edges(
    "supervisor_approval",
    route_from_supervisor_approval,
    {
        "search": "search",
        "details": "details",
        "sentiment": "sentiment",
        "summary": "summary",
        END: END,
    },
)
```
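The supervisor’s router is the most interesting one, because it enforces the revision cap mentioned later in the walkthrough (at most two revision attempts). A hedged sketch, with `approved`, `revision_count`, and `revision_target` as assumed state fields rather than the book’s exact names:

```python
END = "__end__"  # stand-in for `from langgraph.graph import END`

MAX_REVISIONS = 2  # the chapter caps revision requests at two attempts


def route_from_supervisor_approval(state: dict) -> str:
    """Approve and finish, or send the report back for revision (sketch)."""
    if state.get("approved") or state.get("revision_count", 0) >= MAX_REVISIONS:
        return END
    # The supervisor names the node that should redo its work.
    return state.get("revision_target", "summary")
```

Even after rejection, the graph terminates once the revision budget is spent, so a stubborn low-quality report can’t loop forever.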
This is why LangGraph exists. Try building this kind of dynamic routing with Haystack’s pipeline-first approach and you’ll end up with unmaintainable spaghetti. With LangGraph, every routing decision is an explicit Python function that reads the state and returns a string.
Complex Query: Coffee Shops in Portland
When you run the query “Find coffee shops in Portland and check their reviews,” the graph executes this sequence:
- Clarification identifies query=“coffee shops”, location=“Portland”, detail_level=“reviews”. Routes to search.
- Search calls Pipeline 1 (business_search) to find coffee shops. Sees detail_level is “reviews,” routes to details.
- Details calls Pipeline 2 (business_details) to fetch website info. Still needs reviews, routes to sentiment.
- Sentiment calls Pipeline 3 (business_sentiment) to analyze reviews. Routes to summary.
- Summary aggregates everything into a report. Routes to supervisor.
- Supervisor reviews the report quality. Either approves (END) or requests revision (max 2 attempts).
Simple Query: Best Sushi in California
For a simpler query like “best places for sushi in California,” the graph takes a shorter path:
- Clarification extracts query=“sushi”, location=“California”, detail_level=“general”.
- Search calls Pipeline 1. Sees detail_level is “general,” skips details and sentiment, routes to summary.
- Summary aggregates search results into recommendations. Routes to supervisor.
- Supervisor reviews and approves.
Same graph, different paths. The routing logic handles the variation. No need to build separate pipelines for different query types.
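The “same graph, different paths” point can be made concrete by tracing the node sequence with the routing logic alone. This is a simplified simulation (no supervisor revision loop, no LangGraph runtime), not the book’s code:

```python
def trace_path(detail_level: str) -> list:
    """Return the node sequence the graph would take for a given detail level (simplified)."""
    path = ["clarification", "search"]
    if detail_level == "reviews":
        path += ["details", "sentiment"]
    path += ["summary", "supervisor_approval"]
    return path
```

Running it for both queries reproduces the two walkthroughs above: the “reviews” query visits all six node types, while the “general” query takes the four-node shortcut.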
Debugging Multi-Agent Systems
Funderburk raises an important point about failures. When a multi-agent system breaks, figuring out what went wrong is hard. Was it the LLM making a bad decision? A tool timing out? An API returning an error?
She identifies three common failure modes:
LLM decision failure. The sequence stops at a reasoning node because the LLM failed to generate a valid tool call. The routing function didn’t get a clean “next_agent” value.
Node timeout. A specific component (like web search) exceeds its time limit. You can identify this by monitoring which node was active when the crash happened.
Tool execution error. The node runs correctly, but the underlying Python function or API throws an exception. The HTTP call to the Hayhooks endpoint returned a 500 error, or the Yelp API hit its rate limit.
An external monitoring layer that records the node sequence and error logs lets you distinguish between these. The book’s epilogue promises a deeper look at advanced observability and recovery strategies.
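One lightweight way to get that node-sequence record, ahead of the book’s deeper treatment, is to wrap each node function in a decorator that logs entries and exceptions. This is an illustrative sketch of the idea, not the book’s monitoring layer:

```python
import logging
import time
from functools import wraps

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("navigator")

NODE_TRACE: list = []  # records the node sequence for post-mortem debugging


def monitored(node_name: str):
    """Wrap a node so every entry and exception is recorded (illustrative sketch)."""
    def decorator(fn):
        @wraps(fn)
        def wrapper(state):
            NODE_TRACE.append(node_name)
            start = time.monotonic()
            try:
                return fn(state)
            except Exception:
                # The trace shows exactly which node was active when it crashed,
                # which separates a tool execution error from an LLM decision failure.
                log.exception("node %r failed after %.2fs", node_name, time.monotonic() - start)
                raise
        return wrapper
    return decorator
```

Applied to each wrapper before it’s registered with `add_node`, this turns “something broke somewhere in the graph” into “the sentiment node threw after 12 seconds,” which is usually enough to pick among the three failure modes above.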
The Hybrid Architecture Takeaway
The chapter ends with a clear architectural conclusion. The ideal production system is a hybrid that separates two layers:
The tool layer (Haystack + Hayhooks). Build specialized pipelines using Haystack’s modular architecture. Serialize them. Deploy them as independent, stateless microservices via Hayhooks. Scale them with Docker and Kubernetes. Your NER tool, classifier tool, and sentiment tool are NOT Python objects imported by the agent. They are independent services with their own REST endpoints.
The orchestration layer (LangGraph). A central StateGraph serves as the agentic brain. Its tool nodes are simple HTTP clients that call the Hayhooks endpoints. The graph handles complex control flow, cyclical logic, supervisor patterns, and observable state management. It doesn’t need to know how NER or sentiment analysis works internally. It just knows when to call those endpoints.
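Because the tool nodes are just HTTP clients, a worker can be as small as a POST request. A minimal sketch using only the standard library, assuming Hayhooks’ default port (1416) and its `/{pipeline_name}/run` endpoint convention; the payload fields and `search_wrapper` shape are illustrative, not the book’s exact code:

```python
import json
from urllib import request

HAYHOOKS_BASE = "http://localhost:1416"  # default Hayhooks port; adjust for your deployment


def build_request(name: str, payload: dict) -> request.Request:
    """Build the POST request for a Hayhooks pipeline endpoint."""
    return request.Request(
        f"{HAYHOOKS_BASE}/{name}/run",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )


def call_pipeline(name: str, payload: dict, timeout: float = 30.0) -> dict:
    """POST a payload to a pipeline microservice and return the JSON reply."""
    with request.urlopen(build_request(name, payload), timeout=timeout) as resp:
        return json.loads(resp.read())


def search_wrapper(state: dict) -> dict:
    """Worker node: a thin HTTP client around the business_search microservice."""
    result = call_pipeline("business_search", {"query": state["query"], "location": state["location"]})
    return {**state, "search_results": result}
```

The orchestrator never imports a Haystack pipeline; it only knows the endpoint’s name and payload shape, which is exactly the decoupling the hybrid architecture is after.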
This separation means you can develop, test, deploy, and scale the tools independently from the agent logic. Swap out a sentiment model without touching the orchestrator. Add a new pipeline as a new worker node with minimal friction.
Funderburk’s Chapter 8 takes you from “I can build a single pipeline” to “I can build a system of cooperating agents that use specialized tools.” That’s a serious jump. And the pattern she lays out (Haystack for tools, Hayhooks for deployment, LangGraph for orchestration) is a practical architecture you can actually use in production.
This is post 19 of 24 in the Building Natural Language and LLM Pipelines series.