Sentiment Analysis Pipelines and Multi-Agent Architecture Design With Haystack and LangGraph

After NER and text classification, Funderburk moves to the third building block: sentiment analysis. Then she starts putting all the pieces together into a multi-agent architecture. This is where the chapter gets really interesting.

What Is Sentiment Analysis?

Sentiment analysis figures out the emotional tone of text. Is this review positive, negative, or neutral? Simple question, surprisingly useful answer.

It works at different levels of detail:

Document-level. The whole review gets one label. “This restaurant was amazing” = positive. Good for quick overviews.

Sentence-level. Each sentence in a document gets its own label. Useful when a review says “The food was great but the service was terrible.” One sentiment doesn’t cover both sentences.

Aspect-based. The most granular. It looks at specific features. In a phone review, “battery life” might be positive while “camera quality” might be negative. This is what businesses actually want when they’re analyzing feedback at scale.

The applications are everywhere: customer feedback analysis, social media monitoring, market research, brand tracking, political analysis. If someone is writing text and you want to know how they feel, sentiment analysis is the tool.
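To make the document-level versus sentence-level distinction concrete, here is a toy illustration. The word lists and scoring are invented purely for demonstration; real systems use trained models, not keyword lookups:

```python
# Toy illustration of document-level vs. sentence-level sentiment.
# POSITIVE/NEGATIVE word sets are invented for this demo only.
POSITIVE = {"great", "amazing", "excellent"}
NEGATIVE = {"terrible", "awful", "bad"}

def classify(text: str) -> str:
    """Score text by counting positive vs. negative words."""
    words = {w.strip(".,!?").lower() for w in text.split()}
    score = len(words & POSITIVE) - len(words & NEGATIVE)
    return "positive" if score > 0 else "negative" if score < 0 else "neutral"

review = "The food was great. The service was terrible."

# Document-level: one label for the whole review.
doc_label = classify(review)  # positive and negative cancel out -> "neutral"

# Sentence-level: each sentence gets its own label.
sentence_labels = [classify(s) for s in review.split(". ") if s]
# -> ["positive", "negative"]
```

The mixed review comes out "neutral" at the document level but "positive" plus "negative" at the sentence level, which is exactly why granularity matters.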

Building the Sentiment Pipeline

Funderburk builds a sentiment analysis pipeline using Haystack’s TransformersTextRouter component with the cardiffnlp/twitter-roberta-base-sentiment model. This is a RoBERTa model fine-tuned specifically for sentiment classification.

The pipeline has two custom components:

YelpReviewFetcher calls the Yelp Business Reviews API through RapidAPI, grabs the reviews, and converts them into Haystack Document objects. Each document contains the review text plus metadata like rating and URL.

BatchSentimentProcessor wraps Haystack’s TransformersTextRouter internally. It loops through each document, classifies the text as LABEL_0 (negative), LABEL_1 (neutral), or LABEL_2 (positive), maps those raw labels to human-readable strings, and tucks the sentiment into the document’s metadata.
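The label-mapping and metadata-enrichment step is simple enough to sketch. This is not the book's exact code; the documents here are plain dicts standing in for Haystack Document objects, and the function name is a guess:

```python
# Map the model's raw output labels to human-readable sentiment strings.
# cardiffnlp/twitter-roberta-base-sentiment emits LABEL_0/1/2.
LABEL_MAP = {
    "LABEL_0": "negative",
    "LABEL_1": "neutral",
    "LABEL_2": "positive",
}

def enrich_with_sentiment(documents: list, raw_labels: list) -> list:
    """Attach a human-readable sentiment label to each document's metadata.

    `documents` stands in for Haystack Document objects; plain dicts
    with a "meta" key are used here for illustration.
    """
    for doc, label in zip(documents, raw_labels):
        doc["meta"]["sentiment"] = LABEL_MAP.get(label, "unknown")
    return documents

docs = [{"content": "Best cheese curds ever!", "meta": {"rating": 5}}]
enrich_with_sentiment(docs, ["LABEL_2"])
# docs[0]["meta"] is now {"rating": 5, "sentiment": "positive"}
```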

Here’s how you run it:

# Input keys address pipeline components by name
pipeline_input = {
    "review_fetcher": {
        "url": yelp_url,
        "headers": api_headers,
        "querystring": {"sortBy": "highestRated"}
    }
}
result = sentiment_pipeline.run(pipeline_input)

The output is a list of Haystack Document objects enriched with sentiment labels. A 1-star review about a tourist trap cheese curd place? The pipeline correctly tags it as “negative.” A glowing 5-star review? Tagged “positive.”

This pipeline, like the NER one, is designed to be serialized into YAML and deployed as a REST endpoint through Hayhooks. That’s the pattern: build it in Python, serialize it, expose it over HTTP. Then any agent can call it.
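A serialized Haystack 2.x pipeline definition is a YAML file listing components and the connections between them. Roughly like this, where the module paths and parameter names are illustrative guesses rather than the book's actual files:

```yaml
components:
  review_fetcher:
    type: pipelines.components.YelpReviewFetcher
    init_parameters: {}
  sentiment_processor:
    type: pipelines.components.BatchSentimentProcessor
    init_parameters:
      model_name: cardiffnlp/twitter-roberta-base-sentiment
connections:
  - sender: review_fetcher.documents
    receiver: sentiment_processor.documents
```

Because the whole pipeline is just declarative data, Hayhooks can load it without importing your build script.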

The Big Project: Yelp Navigator

Now Funderburk puts everything together. The final mini-project is called Yelp Navigator, a multi-agent system that handles complex queries like:

“Find me 3 Mexican restaurants in Austin, Texas, and analyze their customer reviews to tell me which one has the best service.”

No single pipeline can answer this. You need search, detail fetching, sentiment analysis, summarization, and quality assurance. All coordinated automatically.

The key idea: natural language queries vary in how much detail users want. Some people just want restaurant names. Others want reviews. Others want a full comparison report. A rigid pipeline with fixed paths can’t handle that variation. You need flexible routing.

The Architecture: Sequential Routing With Loops

Funderburk designs a sequential routing model where each agent determines the next step. The system has five types of components:

Haystack pipelines (the tools). Three pipelines exposed as REST APIs through Hayhooks:

  • business_search: Takes a query, uses NER to extract entities (business type, location), returns matching businesses
  • business_details: Takes a business list, fetches website content for deeper context
  • business_sentiment: Takes a business list, fetches reviews, runs sentiment analysis

Clarification node. Before doing any work, this node checks if the query has enough information. Does it specify a location? A business type? What level of detail does the user want? It extracts intent before passing to the search agent.

Worker nodes. Specialized agents. Each gets exactly one tool. The search node can only call the business_search endpoint. The details node only calls business_details. The sentiment node only calls business_sentiment. Clean separation.

Summarization node. Takes whatever data the worker nodes collected and generates a user-friendly report.

Supervisor approval. A quality assurance layer. An LLM reviews the final report and decides if it meets requirements. If not, it routes back to the appropriate worker for another pass. Maximum two revision attempts.
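The supervisor's revision loop boils down to a routing decision of the kind LangGraph conditional edges expect. A minimal sketch, with `state` as a plain dict and node names invented for illustration (the real system routes back to whichever worker the feedback targets):

```python
# Sketch of the supervisor's routing decision after reviewing the report.
# In a LangGraph app this function would be wired in as a conditional edge;
# `state` here is a plain dict standing in for the graph's shared state.
MAX_ATTEMPTS = 2  # the book caps revisions at two attempts

def route_after_review(state: dict) -> str:
    """Decide whether to finish or send the report back for revision."""
    if state.get("needs_revision") and state.get("approval_attempts", 0) < MAX_ATTEMPTS:
        return "revise"  # loop back to the appropriate worker
    return "end"         # approved, or out of revision attempts
```

The attempt cap is what keeps the quality loop from cycling forever when the LLM reviewer is never satisfied.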

Step 1: Expose Pipelines as REST Endpoints

Each pipeline follows a clean folder pattern: components.py (component definitions), build_pipeline.py (build and serialize to YAML), and pipeline_wrapper.py (wrap and load for Hayhooks).

Deployment is two commands:

./build_all_pipelines.sh
uv run hayhooks run --pipelines-dir pipelines

Hayhooks scans the directory, loads every YAML pipeline definition, and creates REST endpoints. business_search becomes http://localhost:1416/business_search/run. You even get automatic Swagger docs at /docs.

Step 2: Wrap Endpoints as LangGraph Tools

This is the bridge between Haystack and LangGraph. You write a regular Python function that makes an HTTP POST to the Hayhooks endpoint and returns the JSON response. Then you slap LangChain’s @tool decorator on it:

from typing import Any, Dict

import requests
from langchain_core.tools import tool

# Hayhooks serves deployed pipelines on port 1416 by default
BASE_URL = "http://localhost:1416"

@tool
def search_businesses(query: str) -> Dict[str, Any]:
    """Search for businesses using natural language query."""
    response = requests.post(
        f"{BASE_URL}/business_search/run",
        json={"query": query},
        timeout=30,
    )
    return response.json()

The decorator automatically generates a JSON schema from the function’s name, docstring, and type hints. This schema tells the LLM agent what the tool does and what inputs it needs. The agent can then “call” the external Haystack pipeline by generating the right parameters.
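You can inspect the raw material the decorator works from using nothing but the standard library. This snippet shows the name, docstring, and type hints that feed the schema; the actual JSON schema generation is LangChain's, not reproduced here:

```python
import inspect
from typing import get_type_hints

def search_businesses(query: str) -> dict:
    """Search for businesses using natural language query."""
    ...

# These three pieces are what @tool turns into a JSON schema for the LLM:
tool_name = search_businesses.__name__           # "search_businesses"
tool_description = inspect.getdoc(search_businesses)
tool_params = get_type_hints(search_businesses)  # {"query": str, "return": dict}
```

Keep docstrings and type hints accurate: to the agent, they are the tool's entire documentation.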

That’s the whole integration pattern. Haystack builds specialized pipelines. Hayhooks deploys them as REST APIs. LangChain’s @tool decorator makes them callable by a LangGraph agent. Each layer does what it’s best at.

Step 3: Define the Agent State

In LangGraph, state is the shared memory across the entire graph execution. Funderburk defines an AgentState class that extends MessagesState with fields organized into four groups:

User intent data. clarified_query, clarified_location, detail_level (general, detailed, or reviews). Without these, downstream agents don’t know what to search for.

Workflow control. Flags like clarification_complete that gate the transition from clarification to execution.

Agent outputs. A dictionary storing partial results from each worker. Search results, sentiment data, business details. The summary node needs all of this.

Quality control. approval_attempts (capped at 2), needs_revision, and revision_feedback for the supervisor’s review loop.
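In the book the state extends LangGraph's MessagesState, which also carries the running message history. As a stdlib-only sketch, the four groups of fields could be modeled with a TypedDict; field names beyond those listed above are guesses:

```python
from typing import Any, Dict, TypedDict

class AgentState(TypedDict, total=False):
    # User intent data
    clarified_query: str
    clarified_location: str
    detail_level: str          # "general", "detailed", or "reviews"
    # Workflow control
    clarification_complete: bool
    # Agent outputs: partial results keyed by worker
    agent_outputs: Dict[str, Any]
    # Quality control
    approval_attempts: int     # capped at 2
    needs_revision: bool
    revision_feedback: str

state: AgentState = {"detail_level": "reviews", "approval_attempts": 0}
```

Because `total=False`, fields can be filled in gradually as each node runs, which is how LangGraph state accumulates across the graph.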

The state achieves coordination through context preservation (agents don’t forget the original goal), flow control (the system only proceeds when the request is fully understood), and data aggregation (partial results feed into the final answer).

The actual graph construction and execution walkthrough continues in the next post, where we’ll see how the Yelp Navigator handles different queries with different levels of complexity.


This is post 18 of 24 in the Building Natural Language and LLM Pipelines series.


About BookGrill.net

BookGrill.net is a technology book review site for developers, engineers, and anyone who builds things with code. We cover books on software engineering, AI and machine learning, cybersecurity, systems design, and the culture of technology.
