Laura Funderburk's hands-on guide to building production-grade NLP and LLM pipelines with Haystack and LangGraph, covering RAG, tool contracts, context engineering, and agentic AI architecture.
Building Natural Language and LLM Pipelines is a practical engineering book that teaches you how to move from experimental LLM scripts to production-ready AI systems. The central idea is simple but powerful: separate your AI architecture into a tool layer (built with Haystack) and an orchestration layer (built with LangGraph). This separation makes systems testable, debuggable, and reliable.
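The tool-vs-orchestration separation can be sketched in plain Python. This is a library-free illustration of the idea, not Haystack or LangGraph code: the `retrieve` and `summarize` functions and the `State` class are hypothetical stand-ins for a Haystack pipeline and a LangGraph state graph.

```python
from dataclasses import dataclass, field

# Tool layer: each tool is a pure function with an explicit input/output
# contract, so it can be unit-tested without any orchestration code.
# (Hypothetical stand-ins for Haystack pipelines.)
def retrieve(query: str) -> list[str]:
    docs = {"pasta": ["Luigi's gets rave reviews for carbonara."]}
    return docs.get(query, [])

def summarize(docs: list[str]) -> str:
    return " ".join(docs) if docs else "No documents found."

# Orchestration layer: owns state and routing, calls tools only through
# their contracts, and never reaches into their internals.
# (A hypothetical stand-in for a LangGraph state graph.)
@dataclass
class State:
    query: str
    docs: list[str] = field(default_factory=list)
    answer: str = ""

def run_graph(state: State) -> State:
    state.docs = retrieve(state.query)    # node 1: retrieval tool
    state.answer = summarize(state.docs)  # node 2: generation tool
    return state

print(run_graph(State(query="pasta")).answer)
```

Because each layer can be swapped or tested in isolation, a failing answer can be traced to either a tool bug or a routing bug, which is the debuggability the book argues for.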
The book starts with NLP pipeline fundamentals and how large language models actually work, including the concept of context engineering as a formal discipline. It then walks through Haystack’s component system, showing how to build indexing pipelines, RAG systems, hybrid retrieval, and custom components with strict data contracts. The production chapters cover Docker, evaluation with RAGAS, observability with Weights & Biases, and deployment via FastAPI and Hayhooks.
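To give a feel for what a "strict data contract" means in practice, here is a library-free sketch loosely modeled on Haystack's pattern of declaring a component's output types up front. The `SentimentTagger` class and its lexicon scorer are illustrative inventions, not the book's code or Haystack's actual `@component` API.

```python
from typing import Any

# A component that declares its contract explicitly: downstream
# components (and tests) can rely on exactly these keys and types.
class SentimentTagger:
    input_types = {"text": str}
    output_types = {"label": str, "score": float}

    def run(self, text: str) -> dict[str, Any]:
        # Toy lexicon scorer standing in for a real sentiment model.
        positive = {"great", "love", "excellent"}
        words = text.lower().split()
        hits = sum(w.strip(".,!") in positive for w in words)
        score = hits / max(len(words), 1)
        out = {"label": "positive" if hits else "neutral", "score": score}
        # Enforce the contract before handing data to the next component.
        assert set(out) == set(self.output_types)
        return out

print(SentimentTagger().run("Great tacos, excellent service!"))
```

The payoff is that pipelines fail loudly at the component boundary rather than silently propagating malformed data.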
The highlight is Chapter 8’s Yelp Navigator project, a multi-agent system that combines Haystack microservices for NER, sentiment analysis, and text classification with LangGraph orchestration for routing, state management, and supervisor approval. It shows the tool-vs-orchestration pattern working end to end.
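The routing and supervisor-approval pattern behind such a multi-agent system can be sketched without either library. Everything below is an illustrative simplification, not the Yelp Navigator's actual code or the LangGraph API: `route` plays the orchestrator's routing node, the `TOOLS` entries stand in for Haystack microservices, and `supervisor_approve` is a gate node before results reach the user.

```python
# Tool layer: two toy microservices with simple heuristics standing in
# for real NER and sentiment models.
def ner_tool(text: str) -> dict:
    return {"entities": [w for w in text.split() if w.istitle()]}

def sentiment_tool(text: str) -> dict:
    return {"sentiment": "positive" if "great" in text.lower() else "neutral"}

TOOLS = {"ner": ner_tool, "sentiment": sentiment_tool}

def route(task: str) -> str:
    # Orchestration: decide which microservice handles the request.
    return "sentiment" if "opinion" in task.lower() else "ner"

def supervisor_approve(result: dict) -> bool:
    # Supervisor gate: only non-empty results are released.
    return any(bool(v) for v in result.values())

def run(task: str, text: str) -> dict:
    result = TOOLS[route(task)](text)
    return result if supervisor_approve(result) else {"status": "rejected"}

print(run("find entities", "Visited Blue Bottle in Oakland"))
```

Even at this scale the separation holds: the tools know nothing about routing, and the router knows nothing about how the tools compute their answers.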
The book closes with a look at emerging trends like NVIDIA NIMs, Model Context Protocol (MCP), and Agent-to-Agent (A2A) protocols, plus a final analysis of agentic architecture trade-offs including token economics and failure resilience. This is for Python developers, NLP engineers, and technical leads who want to build AI systems that actually work in production.