Data Engineering with GCP Chapter 7: Making Data Visual with Looker Studio

You spend weeks building pipelines, modeling data, setting up orchestration. Everything works. Data lands in BigQuery clean and on time. And then someone from the business side asks: “So… where do I see the numbers?” That is exactly where Chapter 7 picks up. All that upstream work has to end somewhere useful, and for most organizations that somewhere is a dashboard.

Why Data Visualization Matters for Data Engineers

Adi makes a point early in the chapter that I think a lot of data engineers need to hear. Visualization is not your main job. You are not the BI team. But understanding how your data gets consumed downstream is essential because the success of your entire pipeline is often measured by how happy the people looking at charts are.

There are two reasons you would visualize data. First is exploration: debugging a pipeline, checking query performance, seeing how much BigQuery is billing you. A quick chart shows patterns faster than scrolling through rows. Second is reporting: dashboards that stakeholders open every morning, interactive views where people filter and slice data themselves.

The book calls both of these “reports” to match Looker Studio terminology. Dashboards, reports, visualizations. It all means “show me the data without writing SQL.”

What Is Looker Studio

Looker Studio is Google’s free, fully cloud-based visualization tool. You open it in a browser, connect it to a data source, drag some charts around, and share a link. No software to install, no servers to manage.

The key selling point for GCP users is that the connection to BigQuery is basically seamless. You can go from a BigQuery query result to a Looker Studio chart in under 10 clicks. The book demonstrates this by querying BigQuery’s INFORMATION_SCHEMA (which stores metadata about your queries and costs), clicking the “Explore Data” button, and immediately seeing a chart in Looker Studio. That is genuinely fast.

But here is the caveat. Looker Studio is not a full BI platform. It is not a replacement for Tableau, Power BI, or even Google’s own Looker (a separate paid product). No version control for dashboards, no deep permission hierarchy. It is a lightweight tool that is really good at one thing: getting BigQuery data into charts quickly.

Looker Studio vs Looker: Do Not Confuse Them

This trips people up all the time. Looker Studio and Looker are two completely different products that happen to share a name. Looker Studio used to be called Data Studio. Google renamed it, which made the confusion worse.

Looker Studio has a free tier and a paid Pro tier. Looker is entirely paid, requires dedicated Looker developers and admins, and has significantly more features. If someone asks you to “set up Looker,” clarify which one they mean before you do anything.

Connecting to BigQuery and Building Reports

The book walks through two exercises. The first one is quick exploration: run a query in BigQuery, click “Explore Data,” and you are in Looker Studio looking at your results as a chart. This is great for ad hoc analysis. Want to see your daily BigQuery spending as a time series? Query the INFORMATION_SCHEMA jobs view, group by date, sum up total_bytes_billed, and visualize it. Done in minutes.
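To make the spending query concrete, here is a rough sketch of what it could look like, plus a pure-Python equivalent of the aggregation so you can see the shape of the result without a GCP project. The `region-us` qualifier, the `job_type` filter, and the sample rows are my assumptions, not something taken from the book.

```python
from collections import defaultdict
from datetime import datetime

# SQL you might run in the BigQuery console before clicking "Explore Data".
# The region qualifier and the job_type filter are assumptions for this sketch.
DAILY_BILLED_SQL = """
SELECT
  DATE(creation_time) AS usage_date,
  SUM(total_bytes_billed) AS bytes_billed
FROM `region-us`.INFORMATION_SCHEMA.JOBS
WHERE job_type = 'QUERY'
GROUP BY usage_date
ORDER BY usage_date
"""


def daily_bytes_billed(jobs):
    """Pure-Python equivalent of the query: sum billed bytes per calendar day."""
    totals = defaultdict(int)
    for job in jobs:
        # Treat a missing value (None) as zero bytes billed.
        totals[job["creation_time"].date()] += job["total_bytes_billed"] or 0
    return dict(sorted(totals.items()))


# Made-up job rows, only to show the shape of the result.
sample_jobs = [
    {"creation_time": datetime(2024, 1, 1, 9, 0), "total_bytes_billed": 10_485_760},
    {"creation_time": datetime(2024, 1, 1, 17, 30), "total_bytes_billed": 20_971_520},
    {"creation_time": datetime(2024, 1, 2, 8, 15), "total_bytes_billed": None},
]

for day, billed in daily_bytes_billed(sample_jobs).items():
    print(day, billed)
```

The date column becomes the time-series dimension and the summed bytes become the metric.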

The second exercise is more structured. You create a report from scratch by connecting to BigQuery tables directly (not just query results). The book uses the bike-sharing warehouse tables from Chapter 3: facts_trips_daily and dim_stations.

The interesting part is data blending. You add both tables as data sources, then use the “Blend Data” feature to join them. Pick a join key (station_id), choose dimensions and metrics, and you have a combined dataset across both tables. It works like a JOIN in SQL, but through a visual interface.
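If it helps to think of a blend in code, here is a minimal sketch of the same idea as a left outer join over two lists of rows. The column names other than station_id and all sample values are hypothetical, and it assumes the dimension table has one row per station.

```python
def blend_left(left_rows, right_rows, key):
    """Left-outer-join two lists of dicts on `key`, like a simple blend."""
    # Assumes `right_rows` has at most one row per key value.
    right_by_key = {row[key]: row for row in right_rows}
    blended = []
    for row in left_rows:
        match = right_by_key.get(row[key], {})
        # Left-table fields win on a name clash; unmatched rows keep only left fields.
        blended.append({**match, **row})
    return blended


# Hypothetical sample rows standing in for the two warehouse tables.
facts_trips_daily = [
    {"station_id": 1, "total_trips": 120},
    {"station_id": 2, "total_trips": 45},
    {"station_id": 9, "total_trips": 7},  # no matching station row
]
dim_stations = [
    {"station_id": 1, "station_name": "Harbor"},
    {"station_id": 2, "station_name": "Market"},
]

for row in blend_left(facts_trips_daily, dim_stations, "station_id"):
    print(row)
```

Every fact row survives the join, and rows without a matching station simply lack the station fields, which is worth remembering when a blended chart shows blanks.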

Chart Types and Report Layout

Looker Studio offers the usual suspects: bar charts, time series, treemaps, tables, pie charts, and more. Each chart type needs a dimension (the thing you are grouping by) and a metric (the number you are measuring). The tool tries to auto-pick these for you, which is sometimes helpful and sometimes wrong. Always double-check.

The book builds three charts in the report: a bar chart showing top 5 stations by total trip duration, a treemap showing top 10 stations by average duration, and a plain table with all the numbers. You can limit rows, change sort orders, and style everything from the right panel.
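Under the hood, each of those charts is a group-by on the dimension, an aggregation of the metric, a sort, and a row limit. Here is a small sketch of that logic; the field names and trip rows are made up for illustration.

```python
from collections import defaultdict


def top_n(rows, dimension, metric, n, agg="sum"):
    """Group rows by `dimension`, aggregate `metric`, return the n largest groups."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[dimension]].append(row[metric])
    if agg == "sum":
        scored = {name: sum(values) for name, values in groups.items()}
    elif agg == "avg":
        scored = {name: sum(values) / len(values) for name, values in groups.items()}
    else:
        raise ValueError(f"unsupported aggregation: {agg}")
    # Sort descending by the aggregated metric, then apply the row limit.
    return sorted(scored.items(), key=lambda item: item[1], reverse=True)[:n]


# Hypothetical trip rows.
trips = [
    {"station_name": "Harbor", "duration_sec": 600},
    {"station_name": "Harbor", "duration_sec": 300},
    {"station_name": "Market", "duration_sec": 800},
    {"station_name": "Depot", "duration_sec": 100},
]

print(top_n(trips, "station_name", "duration_sec", n=2))             # by total
print(top_n(trips, "station_name", "duration_sec", n=2, agg="avg"))  # by average
```

Note how the ranking flips between total and average for the same data. That is exactly the kind of thing to double-check when the tool auto-picks an aggregation for you.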

Once your charts look right, there is the layout step. Drag things around, add titles, adjust spacing. Remember that you are making this for other people, so think about what makes sense to someone who did not build the report.

Sharing Reports with Stakeholders

Sharing is simple. Click the Share button and add email addresses. The important detail here: when you share a report, the recipients do not need BigQuery access. By default the data source uses the owner’s credentials, so viewers see the data through your access. This is convenient, but it also means anyone with the report link can see whatever that data source exposes, so keep it in mind from a security perspective.

The Cost Problem Nobody Talks About

This is my favorite section of the chapter because it covers something that catches a lot of teams off guard. Looker Studio is free. BigQuery queries are not.

Every time someone opens your Looker Studio report, it fires queries against BigQuery. Every filter change, every page refresh, every chart interaction. It all costs money if you are on BigQuery on-demand pricing.

The book breaks down the math. BigQuery on-demand cost is basically: data scanned per query multiplied by number of queries multiplied by price per TB ($5 in the US region, the figure the book uses). So if you have a 1 TB table and 10 users each triggering 1,440 queries per month (that is a refresh every 30 minutes), and every query scans the full table, that is 14,400 TB scanned, or $72,000 per month. Just from people looking at dashboards.
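The formula is easy to play with in a few lines of code. This is a back-of-the-envelope sketch that assumes every query scans the whole table and a 30-day month; the function names are mine, and the default price is the book's per-TB figure.

```python
def refreshes_per_month(interval_minutes, days=30):
    """Number of refreshes in a month for a given refresh interval."""
    return (24 * 60 // interval_minutes) * days


def monthly_query_cost(table_tb, users, queries_per_user, price_per_tb=5.0):
    """On-demand cost estimate, assuming each query scans the full table."""
    return table_tb * users * queries_per_user * price_per_tb


queries = refreshes_per_month(30)  # one refresh every 30 minutes

# The scenario above: a 1 TB table and 10 users.
print(queries, monthly_query_cost(1.0, 10, queries))

# Same usage pattern against a hypothetical 1 GB pre-aggregated table.
print(round(monthly_query_cost(0.001, 10, queries), 2))
```

Changing only the table size shows why the size of whatever the report reads dominates the bill.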

The fix is about architecture. If you followed the data modeling advice from Chapter 3, your reporting layer (the “Access” layer) should contain pre-aggregated, filtered data. Small tables. Fact tables at daily granularity instead of raw transaction-level data. The big raw tables should never be directly connected to a reporting tool.

Materialized Views and BI Engine

The book introduces two caching mechanisms to bring costs down.

Materialized Views sit between a regular view and a physical table. BigQuery pre-computes and caches the results. When someone runs a matching query, BigQuery uses the cache instead of scanning the full table. In the book’s example, bytes processed dropped from 14.9 KB to 64 bytes. On production-scale tables, that reduction is the difference between a reasonable bill and a budget crisis. Your users do not need to change their queries. BigQuery automatically detects when a materialized view applies.
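For reference, here is a sketch of how the DDL for such a view could be assembled. The project, dataset, view, and column names are placeholders I made up; only the CREATE MATERIALIZED VIEW statement shape comes from BigQuery itself.

```python
def materialized_view_ddl(view, source, dimensions, metrics):
    """Build a CREATE MATERIALIZED VIEW statement for a simple aggregate.

    `metrics` maps an output alias to an aggregate expression.
    """
    select_list = list(dimensions) + [
        f"{expr} AS {alias}" for alias, expr in metrics.items()
    ]
    lines = [
        f"CREATE MATERIALIZED VIEW `{view}` AS",
        "SELECT",
        "  " + ",\n  ".join(select_list),
        f"FROM `{source}`",
        "GROUP BY " + ", ".join(dimensions),
    ]
    return "\n".join(lines)


# All identifiers below are hypothetical placeholders.
ddl = materialized_view_ddl(
    view="my_project.dwh_bikesharing.mv_station_duration",
    source="my_project.dwh_bikesharing.facts_trips_daily",
    dimensions=["station_id"],
    metrics={"sum_duration": "SUM(total_duration)"},
)
print(ddl)
```

You would run the resulting statement once in BigQuery. Reports keep pointing at the base table, and matching aggregate queries can be answered from the view.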

BI Engine is an in-memory cache designed for BI tools like Looker Studio. You enable it and set a memory capacity. As long as the data being queried fits within that capacity (up to 100 GB per project, per the book), Looker Studio queries are served from memory and are essentially free of scan charges; you pay for the reserved capacity instead. You can tell if a chart uses BI Engine by the lightning icon in the corner. A line through the icon means the data does not fit.

The key takeaway: use both when running Looker Studio in production. Materialized Views for aggregate queries, BI Engine for everything else that fits in memory.

What I Took Away

Chapter 7 is shorter than the previous ones, and honestly that is fine. The message is clear. Data visualization is the last mile of data engineering. You can have perfect pipelines and clean data models, but if nobody can see the results in a way that helps them make decisions, what was the point?

The practical lesson is that Looker Studio is an excellent tool for quick, simple dashboards on top of BigQuery. The strategic lesson is that connecting a visualization tool to your data warehouse has cost implications that you need to think about before you share that dashboard link with 200 people.


This is part of my retelling of “Data Engineering with Google Cloud Platform” by Adi Wijaya.
