Real-Time Streaming with Apache Kafka - Part 2

Mar 19, 2026
Big Data

Architecture is great, but let’s actually run some code. In the second half of Chapter 7, Neylson Crepalde walks us through setting up a multi-node Kafka cluster right on our local machine using Docker Compose.

If you’ve ever tried to install Kafka manually, you know it can be a pain (shoutout to ZooKeeper configuration). Docker Compose makes it a breeze.

Spinning up the Cluster

The book provides a docker-compose.yaml that spins up three Kafka brokers and three ZooKeeper nodes. This mirrors the multi-broker layout of a production cluster, though every node still runs on the same machine.

To get it running, you just need one command:

docker-compose up -d

Once the containers are up, you can jump into one of the brokers to start managing your cluster:

docker exec -it multinode-kafka-1-1 bash
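The container name in that command depends on the Compose project name (by default, the directory holding the compose file), so it may differ on your machine. Here is a minimal sketch for finding the right name, assuming the broker containers have "kafka" somewhere in their names; the function name list_kafka_containers is hypothetical and not from the book:

```shell
# List the names of running containers whose name contains "kafka".
# Assumption: the compose file names its broker services with "kafka" in them.
list_kafka_containers() {
  docker ps --filter "name=kafka" --format '{{.Names}}'
}
```

Run list_kafka_containers on the host, then pass one of the printed names to docker exec -it.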

Creating Your First Topic

Inside the container, we use the Kafka CLI tools. First, we create a topic named “mytopic” with 3 partitions and a replication factor of 3 (for high availability):

kafka-topics --create \
    --bootstrap-server localhost:19092 \
    --replication-factor 3 \
    --partitions 3 \
    --topic mytopic

You can verify it by listing the topics:

kafka-topics --list --bootstrap-server localhost:19092
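Listing only confirms the topic exists. To check the partition count and replication factor as well, the same tool has a --describe flag. A small sketch, reusing the bootstrap address and topic name from above; the describe_topic wrapper is a hypothetical helper:

```shell
BOOTSTRAP="localhost:19092"   # listener address used in the commands above
TOPIC="mytopic"

# Print the partition count, replication factor, and, for each partition,
# its leader, replicas, and in-sync replicas (ISR).
describe_topic() {
  kafka-topics --describe --bootstrap-server "$BOOTSTRAP" --topic "$TOPIC"
}
```

Calling describe_topic inside the broker container should report three partitions, each with three replicas, if the topic was created as shown.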

Producing and Consuming

Now for the fun part. You can open two terminal windows to see Kafka in action:

The Producer: Use kafka-console-producer to start typing messages into the topic.
The Consumer: Use kafka-console-consumer to watch those messages appear in real time.
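
The two tools could be invoked roughly as follows. This is a sketch rather than the book's exact commands: it assumes the same bootstrap address and topic as before, and a Kafka version recent enough for the console producer to accept --bootstrap-server (older releases use --broker-list). The wrapper function names are hypothetical.

```shell
BOOTSTRAP="localhost:19092"   # same listener as in the topic commands
TOPIC="mytopic"

# Terminal 1: every line typed on stdin is sent as one message (Ctrl+C to quit).
start_producer() {
  kafka-console-producer --bootstrap-server "$BOOTSTRAP" --topic "$TOPIC"
}

# Terminal 2: print messages as they arrive.
# --from-beginning also replays messages produced before the consumer started.
start_consumer() {
  kafka-console-consumer --bootstrap-server "$BOOTSTRAP" --topic "$TOPIC" --from-beginning
}
```

Open a shell in a broker container from each terminal, run start_producer in one and start_consumer in the other, and type a few lines into the producer.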

It’s like a distributed, persistent chat room for your data.

Why this is huge for your stack

This hands-on exercise shows how Kafka becomes the “nervous system” of your data platform. You can have hundreds of different apps (producers) dumping events into Kafka, and hundreds of other apps (consumers) like Spark or custom Python scripts reading that data at their own pace.
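
Reading "at their own pace" works because each consumer group tracks its own committed offsets. A rough way to see this on the local cluster, assuming the same bootstrap address and topic; the group name demo-group and the function names are made up for illustration:

```shell
BOOTSTRAP="localhost:19092"
TOPIC="mytopic"
GROUP="demo-group"   # hypothetical consumer group name

# Consume as a member of a named group so that offsets are committed for it.
consume_as_group() {
  kafka-console-consumer --bootstrap-server "$BOOTSTRAP" --topic "$TOPIC" \
    --group "$GROUP" --from-beginning
}

# Show, per partition, the group's current offset, the log end offset, and the lag.
show_group_lag() {
  kafka-consumer-groups --bootstrap-server "$BOOTSTRAP" --describe --group "$GROUP"
}
```

Stop the consumer, produce a few more messages, and run show_group_lag: the lag column should grow until that group starts reading again, while other groups are unaffected.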

Now that we’ve mastered Spark, Airflow, and Kafka individually, it’s time for the ultimate challenge: deploying the entire stack together on Kubernetes.

Next: Deploying the Big Data Stack on Kubernetes - Part 1
Previous: Real-Time Streaming with Apache Kafka - Part 1

Book Details:

Title: Big Data on Kubernetes: A practical guide to building efficient and scalable data solutions
Author: Neylson Crepalde
ISBN: 978-1-83546-214-0

#big-data-on-kubernetes #neylson-crepalde #book-retelling #apache-kafka #docker-compose #hands-on #real-time
