Data Engineering With Python: Final Thoughts and Takeaways
That’s it. Fifteen chapters, seventeen posts, and one complete walkthrough of Paul Crickard’s Data Engineering with Python (Packt, 2020, ISBN: 978-1-83921-418-9).
So we made it through the whole book. And honestly? It was worth the ride.
The biggest thing Scavetta and Angelov got right is the framing. They didn’t write a “Python is better” or “R is better” book. They wrote a “both are useful, here’s when to use which” book. And that’s the mature take.
The appendix of “Python and R for the Modern Data Scientist” is basically a bilingual dictionary. It runs to about 40 tables and covers everything from package management to indexing. You could spend a whole afternoon reading through it.
You have NiFi running. Kafka is streaming. Spark is processing. But what about the data source? What happens when your data comes from a tiny sensor or a Raspberry Pi that can barely run a web browser?
You have streaming data. You have batch data. You have a lot of it. Now you need to actually process it. Fast. On more than one machine.
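The chapter's answer is Spark, but the core idea, split the data into partitions, compute partial results in parallel, then combine them, can be sketched in miniature with nothing but the standard library. This is an illustration of the map/reduce split, not Spark itself; `subtotal` is a made-up helper, and threads here stand in for workers that would really run on separate machines.

```python
from concurrent.futures import ThreadPoolExecutor

def subtotal(chunk):
    # The "map" step: each worker computes independently on its own partition.
    return sum(chunk)

data = list(range(1_000))
# Partition the data, much as a cluster scheduler would spread it across machines.
chunks = [data[i::4] for i in range(4)]

with ThreadPoolExecutor(max_workers=4) as pool:
    partials = list(pool.map(subtotal, chunks))

# The "reduce" step: combine the partial results into one answer.
total = sum(partials)
print(total)  # 499500
```

Spark generalizes exactly this pattern: the partitioning, the scheduling, and the shuffle between map and reduce all happen across a cluster instead of a thread pool.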
The whole book has been building to this. Six chapters of philosophy, syntax comparisons, and interoperability tricks. Now Chapter 7 drops a real project on the table. Build it with both languages. Together. Start to finish.
Up to this point in the book, data pipelines have been about moving data that already exists. Query a database, read a file, process it, store it. The data sits still and you go get it.
Chapter 6 is where the book finally delivers on its promise. All that talk about using both languages together? This is where it actually happens. Rick Scavetta walks through the nuts and bolts of making Python and R talk to each other in the same project.
Up to this point in the book, everything has been batch processing. You query a database, get a full dataset, transform it, load it somewhere. The data sits still while you work on it.
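The batch-versus-streaming distinction fits in a few lines of plain Python. In this toy sketch (my own illustration, not the book's code), `sensor_readings` stands in for an unbounded source like a Kafka topic: the batch version collects everything before computing, while the streaming version keeps a running result and never holds the full dataset.

```python
from typing import Iterator

def sensor_readings() -> Iterator[int]:
    """Stand-in for a stream: values arrive one at a time."""
    for value in [3, 7, 2, 9]:
        yield value

# Batch: collect the whole dataset, then compute.
batch = list(sensor_readings())
print(sum(batch) / len(batch))  # 5.25

# Streaming: update a running result per record, never materializing the full set.
count, total = 0, 0
for value in sensor_readings():
    count += 1
    total += value
print(total / count)  # 5.25
```

Same answer, very different memory profile, and only the streaming version still works when the source never ends.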
Chapter 5 is where Boyan Angelov gets practical about the question everyone dances around: which language should you actually use for which job?
You learned the individual tools. You learned the deployment strategies. Now Chapter 11 of Data Engineering with Python by Paul Crickard puts it all together. This is the chapter where you build a complete, production-grade data pipeline from start to finish.
Chapter 4 is where the book stops teaching you the languages and starts telling you when to use which one. This is Part III, “The Modern Context,” and Boyan Angelov takes the lead here. The question is simple: given a specific data format, which language gives you a better experience?
You built your data pipelines. They work on your laptop. Now what? Chapter 10 of Data Engineering with Python by Paul Crickard covers the part everyone eventually has to face: getting your pipelines out of development and into production.
Chapter 2 showed Pythonistas how to pick up R. Chapter 3 flips the script. Now it’s the R user’s turn to step into Python territory. Rick Scavetta writes this one, and he does a good job easing R folks into a world that feels messier at first glance.
You built a data pipeline. It is idempotent, uses atomic transactions, and has version control. It is production ready. But can you tell when it breaks?
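One of the simplest monitoring checks is comparing rows in against rows out and alerting when the gap exceeds a threshold. Here is a minimal sketch of that idea; `check_counts` and its tolerance are my own hypothetical example, not the book's NiFi-based monitoring.

```python
import logging

logging.basicConfig(level=logging.INFO, format="%(levelname)s %(message)s")
log = logging.getLogger("pipeline-monitor")

def check_counts(rows_in: int, rows_out: int, tolerance: float = 0.01) -> bool:
    """Flag runs where the pipeline silently dropped more rows than expected."""
    dropped = rows_in - rows_out
    if rows_in and dropped / rows_in > tolerance:
        log.warning("dropped %d of %d rows", dropped, rows_in)
        return False
    return True

print(check_counts(1000, 940))  # 6% loss -> False, and a warning is logged
```

The chapter builds this kind of check into the pipeline itself, so a silent failure becomes a loud one.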
In Part 1 we covered R basics: setting up your environment, installing packages, working with tibbles, and understanding R’s type system. Now we get to the good stuff. Lists, factors, finding things in your data, and the iteration patterns that make R feel so different from Python.
You’ve been building data pipelines for several chapters now. They work. They move data. But here’s the problem: none of them have version control. If you break something, there’s no going back. Chapter 8 of Data Engineering with Python by Paul Crickard fixes that. It introduces the NiFi Registry, a sub-project of Apache NiFi that handles version control for your data pipelines.
Chapter 2 is where the book gets hands-on. Rick Scavetta takes the wheel and walks Python developers through R. Not from scratch, but with the assumption you already know how to code. The chapter is big, so I split it into two posts. This is the first half.
You built a pipeline. It works on your machine. It runs on a schedule. Data goes in, data comes out. Ship it, right?
Chapter 1 is titled “In the Beginning” and it’s written by Rick Scavetta. He opens with a tongue-in-cheek Dickens reference, saying it’s just the best of times for data science. But to understand where we are, we need to look at where Python and R came from. Their origin stories explain why they feel so different today.
The previous chapters taught you the individual tools. Python, NiFi, Airflow, databases, data cleaning. Chapter 6 of Data Engineering with Python by Paul Crickard puts them all together into one real project.
The preface of “Python and R for the Modern Data Scientist” sets up the whole book in a few pages. And it does something rare for a tech book. It actually defines what it means by its own title.
I picked up “Python and R for the Modern Data Scientist” by Rick J. Scavetta and Boyan Angelov a while back. It’s an O’Reilly book from 2021, and it caught my eye because it doesn’t pick sides in the Python vs R debate. Instead, it argues you should use both.
You can build the best pipeline in the world. You can read files, write to databases, schedule everything with Airflow. But if the data going through that pipeline is messy, none of it matters.
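The book does its cleaning with pandas, and the usual suspects are all one-liners: stray whitespace, duplicate rows, missing values. A tiny sketch with made-up records:

```python
import pandas as pd

# Toy records with the usual problems: stray whitespace, duplicates, missing values.
df = pd.DataFrame({
    "name": [" Alice", "Bob ", "Bob ", None],
    "score": [90, 85, 85, 70],
})

df["name"] = df["name"].str.strip()   # normalize whitespace
df = df.drop_duplicates()             # drop exact duplicate rows
df = df.dropna(subset=["name"])       # drop rows missing a name
print(df["name"].tolist())  # ['Alice', 'Bob']
```

Four rows in, two clean rows out. The chapter's point is that steps like these belong inside the pipeline, not in a one-off notebook.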
Most data pipelines start with a database. Most of them end with one too. Chapter 4 of Paul Crickard’s book is about connecting Python to databases and moving data between them. If the previous chapter was about flat files, this one is where things get real.
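The shape of the work is the same regardless of the database: connect, execute SQL, pull rows back into Python. The chapter works against real servers; this stand-in uses the standard library's in-memory SQLite so it runs anywhere.

```python
import sqlite3

# In-memory SQLite as a stand-in for a real database server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users (name) VALUES (?)", [("Alice",), ("Bob",)])
conn.commit()

names = [row[0] for row in conn.execute("SELECT name FROM users ORDER BY id")]
print(names)  # ['Alice', 'Bob']
conn.close()
```

Swap the `connect` call for a driver like `psycopg2` and the rest of the pattern carries over nearly unchanged.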
Chapter 3 is where Crickard moves from setup to actual work. You installed all those tools in Chapter 2. Now you use them. The chapter covers one of the most fundamental tasks in data engineering: getting data out of text files and into something useful.
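The workhorse for that task is CSV parsing, which the standard library handles directly. A minimal sketch, using an in-memory string in place of a real file so it's self-contained:

```python
import csv
import io

# A small in-memory CSV standing in for a real file on disk.
raw = "name,age\nAlice,34\nBob,29\n"

# csv.DictReader turns each data row into a dict keyed by the header row.
rows = list(csv.DictReader(io.StringIO(raw)))

ages = [int(r["age"]) for r in rows]
print(rows[0]["name"], sum(ages))  # Alice 63
```

Note that everything comes back as strings; converting types (like `age` above) is on you, which is exactly the kind of detail the chapter spends time on.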
Chapter 1 was all theory. Now it’s time to actually install stuff. Chapter 2 of Data Engineering with Python by Paul Crickard is a setup chapter. You install the tools, configure them, and make sure everything talks to each other.
Chapter 1 of Data Engineering with Python by Paul Crickard starts with the basics. What is data engineering? What do data engineers actually do? And how is it different from data science?
So I picked up Data Engineering with Python by Paul Crickard (Packt, 2020, ISBN: 978-1-83921-418-9) and decided to write up my study notes as I go through it. I’ve been working in IT for over 20 years, and data engineering keeps coming up everywhere. This book seemed like a good one to work through and share what I learn.