Building an End-to-End Big Data Pipeline - Part 2
In our last post, we checked the infrastructure. Now, let’s build the actual pipeline. Neylson Crepalde uses the IMDB dataset to demonstrate a professional batch workflow.
Chapter 3 is massive. It is basically a catalog of every AWS service a data engineer will touch, from getting data in to getting answers out. So I am splitting it into two posts. This first part covers how data gets into AWS – all the ingestion services, the streaming tools, and the physical devices AWS will literally ship to your door.
Previous: Comparing the Giants: AWS, Azure, and Google Cloud
We’ve talked about the “what” and the “why” of the cloud. Now it’s time for the “how.” Chapter 12 of Sridhar Alla’s book is a deep look at Amazon Web Services (AWS), which is essentially the playground where most big data pros spend their time.