Data Engineering with GCP Chapter 13 Part 2: GCP Certifications and Career Next Steps
In Part 1 we went through the quiz questions, extra GCP services, and how the book ties everything together. Now let’s talk about the stuff that matters after you close the book: getting certified, planning your career, and figuring out what comes next.
The Google Cloud Professional Data Engineer Certification
The author is pretty direct about this: take the certification exam. Not just for the piece of paper, but for the learning that happens while you prepare.
Google Cloud certifications come in three tiers. Foundational is for anyone, no hands-on experience needed. Associate is for people with 6+ months of Google Cloud experience. Professional is for 1+ years on Google Cloud with 3+ years of industry experience. For data engineering specifically, there is only one certification and it sits at the Professional level. It is called the Google Cloud Professional Data Engineer.
The author passed this exam three times (2019, 2021, and 2024), so his advice comes from actual experience, not just theory.
There are also adjacent certifications worth considering if you want to expand: Cloud Architect, Cloud DevOps Engineer, and Machine Learning Engineer. They are not core data engineering, but they overlap enough to be useful.
What the Exam Actually Covers
The Professional Data Engineer exam has five sections:
- Designing data processing systems
- Ingesting and processing data
- Storing data
- Preparing and using data for analysis
- Maintaining and automating data workloads
Here is the honest part. The book covers a lot, but the exam is broader. The book walks you through 19 GCP services hands-on: BigQuery, Cloud Storage, Cloud Composer, Dataproc, Dataflow, Pub/Sub, Cloud SQL, Dataplex, Dataform, DLP/Sensitive Data Protection, Looker Studio, Vertex AI, IAM, and several others.
But the exam also asks about services the book does not cover in depth: Looker, Cloud Data Fusion, Bigtable, Spanner, Datastore, Memorystore, Data Transfer Service, Dataprep, Cloud Logging, Cloud Monitoring, and Analytics Hub. You need to know those too, at least at a high level.
Exam Prep Tips That Actually Help
The author says to prepare even if you already have production experience. The exam tests edge cases. You might use Cloud Storage every day and still get tripped up by a question about which storage class to pick for data accessed less than once per quarter. (The answer is Coldline, by the way.)
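The storage-class rule above follows Google's published access-frequency guidance (Standard for frequent access, Nearline for less than once a month, Coldline for less than once a quarter, Archive for less than once a year). A minimal sketch of that decision rule, written as a helper function of my own (not from the book):

```python
def pick_storage_class(days_between_accesses: int) -> str:
    """Map an expected access interval (in days) to a Cloud Storage class,
    following Google's guidance: Standard for frequent access, Nearline for
    less than once a month, Coldline for less than once a quarter, and
    Archive for less than once a year."""
    if days_between_accesses >= 365:
        return "Archive"
    if days_between_accesses >= 90:
        return "Coldline"
    if days_between_accesses >= 30:
        return "Nearline"
    return "Standard"

# Data touched roughly once a quarter lands on Coldline, matching the
# exam question above.
print(pick_storage_class(120))  # -> Coldline
```

Keep in mind the exam also cares about the cost trade-off behind these tiers: colder classes have cheaper storage but higher retrieval costs and minimum storage durations.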
Here is what I’d pull out from the book’s advice:
Know the decision trees. When to use Bigtable vs. Spanner vs. Datastore vs. Cloud SQL. When to use Dataflow vs. Dataproc. When Pub/Sub fits and when it does not. The exam loves “which service should you pick” questions.
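One way to internalize the database side of those decision trees is to write them down as explicit conditions. The function below is my own simplification of the usual rules of thumb (regional relational workloads on Cloud SQL, horizontally scalable relational on Spanner, massive low-latency wide-column on Bigtable, serverless document data on Datastore/Firestore), not an official Google flowchart; the Dataflow-vs-Dataproc question follows a similar pattern (new pipelines on Dataflow, existing Hadoop/Spark workloads on Dataproc):

```python
def pick_database(relational: bool,
                  horizontal_scale: bool,
                  low_latency_wide_column: bool) -> str:
    """A rough, illustrative decision rule for GCP database choices.
    This is my own summary of common guidance, not an official flowchart."""
    if relational:
        # Spanner when a relational database must scale horizontally
        # with strong global consistency; otherwise Cloud SQL.
        return "Spanner" if horizontal_scale else "Cloud SQL"
    if low_latency_wide_column:
        # Bigtable for massive throughput, low-latency key/wide-column data.
        return "Bigtable"
    # Serverless document-style data.
    return "Datastore (Firestore)"

print(pick_database(relational=True, horizontal_scale=True,
                    low_latency_wide_column=False))  # -> Spanner
```

Real exam questions layer in more dimensions (consistency, cost, migration effort), but practicing with a skeleton like this makes the "which service" pattern easier to spot.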
Pay attention to keywords in questions. Words like “minimum effort,” “quickly and easily,” and “real-time” are clues pointing you toward specific services.
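To make this concrete, here is a small cheat sheet I keep of keyword-to-service hints. The mappings are my own reading of common question patterns, not an official Google list:

```python
# Personal cheat sheet: exam phrasing -> services it usually hints at.
# These associations are my own observations, not official guidance.
KEYWORD_HINTS = {
    "real-time / streaming": ("Pub/Sub", "Dataflow (streaming)", "Bigtable"),
    "minimum effort / quickly and easily": (
        "serverless, fully managed options such as BigQuery or Dataflow",
    ),
    "existing Hadoop/Spark jobs": ("Dataproc",),
    "globally consistent relational": ("Spanner",),
}

for keyword, services in KEYWORD_HINTS.items():
    print(f"{keyword}: {', '.join(services)}")
```

The point is not to memorize a table but to notice that exam writers embed these phrases deliberately; when you see one, shortlist the matching services before reading the answer options.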
Know the gaps. If you followed the book’s exercises, you already have good hands-on knowledge of the main services. But spend extra time on the ones the book did not cover deeply. Bigtable in particular comes up a lot. Know about row key design, hotspots, HDD vs. SSD, and that compute is separated from storage so you do not lose data if a node goes down.
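Row key design is the Bigtable topic most worth drilling. Monotonically increasing keys (like raw timestamps) send all writes to one tablet and create a hotspot; a common mitigation is to lead the key with a field that distributes well, or to prefix a deterministic salt. A small illustrative sketch of the salting technique (the bucket count and key layout here are my own choices, not from the book):

```python
import hashlib

def salted_row_key(device_id: str, timestamp_iso: str, buckets: int = 8) -> str:
    """Build a Bigtable-style row key with a deterministic salt prefix so
    time-ordered writes spread across tablets instead of hotspotting.
    The bucket count and '#' layout are illustrative choices."""
    # Hash the device id to a stable bucket number (same device -> same salt).
    salt = int(hashlib.md5(device_id.encode()).hexdigest(), 16) % buckets
    return f"{salt:02d}#{device_id}#{timestamp_iso}"

key = salted_row_key("sensor-1", "2024-01-01T00:00:00")
print(key)  # e.g. "05#sensor-1#2024-01-01T00:00:00"
```

The trade-off: salting spreads writes but means a range scan over time must now fan out across all buckets, which is exactly the kind of nuance the exam probes.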
Use official resources. The Google Cloud certification page at cloud.google.com/certification/data-engineer is your best starting point. The exam guide there lists all the subtopics.
Practice with sample questions. The book gives 12 example questions across all five sections. Get familiar with the format. Each question has four options and usually one correct answer, though the wrong answers are designed to sound reasonable.
The Past, Present, and Future of Data Engineering
This is the part of the chapter where the author steps back from GCP and talks about the profession itself. I found it genuinely useful.
In the past, data engineers were mostly ETL developers working with proprietary tools on-premises. The job title barely existed. People were called data modelers, database admins, or just ETL developers (sometimes named after the specific tool they used).
In the present, data engineering has become a mature, recognized role. “Big data” and “the cloud” are no longer future concepts. They are just how things work now. The author cites a 2020 report showing data engineering interviews increased by 40% year over year. And that trend has only continued.
Looking at the future, the author sees two things happening:
Technology-wise, cloud adoption is still growing. Traditional organizations like banks and government agencies are still migrating. Data security, governance, and multi-cloud capabilities will keep maturing. And the generative AI wave is raising awareness of data engineering the same way the 2014 machine learning hype raised awareness of big data. More attention on AI means more investment in data foundations, which means more demand for data engineers.
Role-wise, the “data engineer” title will probably fragment into more specialized roles. The analytics engineer role (popularized by dbt) is already an example. Non-engineers will become more SQL-literate, taking over much of the transformation work. Data engineers will shift toward designing foundations: architecture, governance, security, and the extract-and-load parts of ELT. The transformation part will increasingly belong to business teams who understand the domain.
Building Confidence for the Road Ahead
The last section of the book is not technical at all. It is about confidence.
The author defines confidence using the American Psychological Association's definition: "trust in one's abilities, capacities, and judgment." Then he maps each of those three words to your situation as a data engineer.
Abilities come from what you learned. If you followed the book, you have hands-on experience with 19 GCP services. That is a real skill set.
Capacities come from experience, and the author is honest here. If you are just starting, the landscape looks overwhelming. His advice: start small. Focus on one tool, one area. Many real jobs only need you to know BigQuery, or just Hadoop/Dataproc, or just one slice of the stack. Start there and grow.
Judgment comes from fundamentals. When you are stuck, go back to core principles. Least privilege. Data modeling. ETL vs. ELT. Batch vs. streaming. The fundamentals do not change as fast as the tools do.
What Now?
If you have read through this entire book (or this retelling series), you have a solid foundation. Here is what I would suggest as next steps:
Build something real outside of the book’s exercises. Pick a personal project or a work problem and apply what you learned. That is where the knowledge turns into experience.
Start preparing for the Professional Data Engineer exam if you have not already. Use the book as one resource, but supplement it with official documentation and practice exams.
Stay current. GCP adds and changes services regularly. The official Google Cloud blog and release notes are worth checking periodically.
And remember what the author said: today is the best era for data engineering. The demand is real, the tools are mature, and the career path is clear. The hardest part is starting. If you have made it this far, you have already started.
This is part of my retelling of “Data Engineering with Google Cloud Platform” by Adi Wijaya. Go back to Chapter 13 Part 1: Growing as a Data Engineer or continue to the closing post with final thoughts.