Adi Wijaya's practical guide to building scalable data pipelines and platforms using Google Cloud Platform services like BigQuery, Dataproc, Dataflow, and Cloud Composer.
Data Engineering with Google Cloud Platform (2nd edition, 2024) walks you through the full stack of data engineering on GCP. It starts with fundamentals like ETL, data warehouses, and data lakes, then moves to hands-on building with BigQuery, Cloud Composer (Airflow), Dataproc (Spark), Pub/Sub, Dataflow (Beam), Looker Studio, and Vertex AI.
The book is split into three parts. Part one covers data engineering basics and the GCP ecosystem. Part two is the bulk of the book, where you build actual pipelines, data warehouses, data lakes, streaming systems, visualizations, and ML workflows. Part three tackles the strategic side: project management, data governance, cost control, CI/CD, and career growth, including GCP certification prep.
Written by a cloud data engineer at Google with over a decade of experience, the book targets aspiring data engineers, people preparing for the GCP Professional Data Engineer certification, and teams migrating data workloads to Google Cloud. The second edition updates coverage to include Dataform, Dataproc Serverless, BigQuery editions pricing, and Vertex AI pipelines.