Data Engineering with Python by Paul Crickard

Paul Crickard's hands-on guide to building data pipelines with Python, covering ETL, NiFi, Airflow, Kafka, Spark, and production deployment.

Data Engineering with Python walks you through the full data engineering stack using Python as the glue language. The book starts with fundamentals like reading files and working with databases, then progresses to building complete data pipelines with Apache NiFi and Apache Airflow. It covers data cleaning with pandas, monitoring with Elasticsearch and Kibana, and deployment strategies for production environments.

The second half shifts to streaming and big data. You set up a Kafka cluster, build streaming pipelines, process data with Apache Spark, and finish with a real-time edge computing project using MiNiFi. Three dedicated project chapters tie everything together with practical, end-to-end pipelines.

This book is for Python developers who want to understand data engineering from the ground up. It’s especially useful for beginners who want breadth across the tooling landscape rather than deep expertise in any single tool. The hands-on approach means you’re building real pipelines, not just reading about theory.

Packt published the book in 2020, so some tooling details have aged (manual installs instead of Docker, older Airflow patterns, ZooKeeper-based Kafka), but the core patterns of ETL, staging, validation, idempotency, and monitoring remain relevant and well taught.

NiFi Registry Version Control - Study Notes From Data Engineering With Python Ch 8

You’ve been building data pipelines for several chapters now. They work. They move data. But here’s the problem: none of them have version control. If you break something, there’s no going back. Chapter 8 of Data Engineering with Python by Paul Crickard fixes that. It introduces the NiFi Registry, a sub-project of Apache NiFi that handles version control for your data pipelines.
