Data Engineering with AWS: A Book Retelling Series for the Cloud-Curious
Every company today is drowning in data. Clicks, transactions, sensor readings, log files, social media posts. It just keeps coming. But raw data sitting in a pile is useless. The real magic happens when someone builds the pipes that move it, clean it, reshape it, and deliver it to the people who need it.
That someone is a data engineer. And this book teaches you how to become one using Amazon Web Services.
I am going to retell “Data Engineering with AWS” by Gareth Eagar, chapter by chapter, in a series of blog posts. If you have ever wondered how companies like Netflix or Uber handle massive amounts of data in the cloud, this series is for you.
What This Series Is About
Over the coming months, I will walk you through all 14 chapters of a very practical, hands-on book about building data pipelines on AWS. Not theory-heavy academic stuff. Real, working pipelines that take raw data and turn it into something useful.
Think of it like plumbing for data. You have data coming from all kinds of sources: databases, files, streaming events, APIs. You need to get that data somewhere safe, clean it up, transform it into the right shape, and then serve it to analysts, dashboards, and machine learning models. AWS gives you dozens of tools to do this. The book shows you which ones to pick and how to connect them together.
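To make the plumbing idea concrete, here is a minimal sketch of those four stages in plain Python. It is not from the book and does not touch AWS at all. The sample data and the function names (ingest, clean, transform, serve) are made up for illustration. On AWS, each stage would typically be handled by a managed service rather than a local function.

```python
import csv
import io
from collections import defaultdict

# Hypothetical raw export, invented for this sketch. In a real pipeline this
# would arrive from a database, a file drop, a stream, or an API.
RAW_CSV = """order_id,customer,amount
1001,alice,20.50
1002,bob,
1003,alice,5.25
1004,,12.00
1005,carol,7.00
"""


def ingest(raw_text):
    """Read raw CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(raw_text)))


def clean(rows):
    """Drop rows that are missing a customer or an amount."""
    return [r for r in rows if r["customer"] and r["amount"]]


def transform(rows):
    """Reshape individual orders into total spend per customer."""
    totals = defaultdict(float)
    for r in rows:
        totals[r["customer"]] += float(r["amount"])
    return dict(totals)


def serve(totals):
    """Return (customer, total) pairs sorted by spend, highest first."""
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)


# Chain the stages: source -> ingest -> clean -> transform -> serve.
report = serve(transform(clean(ingest(RAW_CSV))))
for customer, total in report:
    print(f"{customer}: {total:.2f}")
```

Each function does one job and hands its result to the next, which is the same shape a cloud pipeline has, just at a much smaller scale.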
I will break down every chapter into simple, digestible posts. No prior AWS experience or data engineering background needed. If you know what a database is and you are not scared of the word “cloud,” you are good to go.
Why This Book Matters
Here is why I picked this book to retell:
- Data engineering is one of the hottest careers in tech right now. Every company needs people who can build and maintain data pipelines. The demand is massive and keeps growing.
- AWS leads the cloud market. It holds the largest share of cloud infrastructure worldwide. Learning data engineering on AWS means learning on the most widely used cloud platform.
- The author knows his stuff. Gareth Eagar worked at AWS as a solution architect, then became a senior data architect. He did not just read about this. He built it, deployed it, and helped other companies do the same.
- It is practical, not theoretical. The book includes real code examples, architecture diagrams, and step-by-step walkthroughs. You build things while you learn.
Whether you are a developer looking to move into data engineering, a data analyst wanting to understand what happens before data hits your dashboard, or just someone curious about how cloud data pipelines work, this series will give you a solid foundation.
What to Expect
I plan to cover all 14 chapters across roughly 21 posts, published weekly. Here is the roadmap, broken into five sections:
Foundations (Chapters 1-2)
What data engineering actually is, who data engineers are, and how modern data architectures work. We will cover data lakes, data warehouses, lakehouses, and why the field evolved the way it did. This is your base. Everything else builds on it.
The AWS Toolkit (Chapters 3-4)
A tour of all the AWS services you need to know for data engineering. S3, Glue, Redshift, Kinesis, Athena, Lake Formation, and more. Plus the critical but often boring topic of data security and governance. Yes, we will make it not boring.
Building Pipelines (Chapters 5-7)
This is where things get hands-on. How to design a data pipeline from scratch, how to ingest data from different sources (batch and streaming), and how to transform raw data into clean, usable datasets. This section is the heart of the book.
Consuming Data (Chapters 8-10)
Who actually uses all this data and how? We will cover data consumers, building data marts with Amazon Redshift, and orchestrating complex pipelines so everything runs smoothly without someone pressing buttons at 3 AM.
Analytics and AI (Chapters 11-14)
Querying data with Amazon Athena, building visualizations with QuickSight, feeding data into machine learning models, and looking at where data engineering is heading next. This is where all the pipeline work pays off.
Some posts will include code examples where they help explain a concept. I will keep them simple and explain what every piece does.
About the Book
Title: Data Engineering with AWS
Author: Gareth Eagar
Publisher: Packt Publishing, 2021
ISBN: 978-1-80056-041-3
Gareth Eagar is a senior data architect who spent years at Amazon Web Services helping organizations design and build their data infrastructure. He wrote this book to give practical, real-world guidance to anyone who wants to learn data engineering on AWS. It is not a reference manual. It is a guided journey from zero to building real data pipelines.
Let Us Get Started
Data engineering is not just for big tech companies anymore. Small startups, government agencies, healthcare providers, retailers – everyone needs to move and transform data at scale. The cloud made it possible without buying a room full of servers, and AWS made the tools accessible.
This series will walk you through it all, one chapter at a time. No jargon walls. No assumed knowledge. Just the concepts, the tools, and how they fit together.
Let us get into it.
Next: Chapter 1 - An Introduction to Data Engineering
Book: Data Engineering with AWS by Gareth Eagar | ISBN: 978-1-80056-041-3