Data Engineering With AWS Chapter 9 Part 2: Bridging Data Lake and Data Warehouse
This is post 15 in my Data Engineering with AWS retelling series.
In Part 1, we looked at Redshift internals – clusters, slices, distribution styles, sort keys. All the pieces that make a data warehouse fast. But a warehouse sitting in isolation is not very useful. Data needs to flow in from your data lake, and sometimes it needs to flow back out. Part 2 of Chapter 9 covers that bridge between S3 and Redshift, including Redshift Spectrum, the COPY and UNLOAD commands, and a hands-on exercise that ties it all together.
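Before diving in, here is the rough shape of those two commands, as a minimal sketch: COPY pulls files from S3 into a Redshift table, and UNLOAD writes query results back out to S3. The bucket, table, and IAM role names below are placeholders, not values from the book.

```sql
-- Load Parquet files from the data lake into a warehouse table
-- (s3 path, table name, and role ARN are illustrative placeholders)
COPY analytics.page_views
FROM 's3://my-data-lake/curated/page_views/'
IAM_ROLE 'arn:aws:iam::123456789012:role/MyRedshiftRole'
FORMAT AS PARQUET;

-- Export an aggregate back to S3 for downstream consumers
UNLOAD ('SELECT page, COUNT(*) AS views FROM analytics.page_views GROUP BY page')
TO 's3://my-data-lake/exports/page_view_counts/'
IAM_ROLE 'arn:aws:iam::123456789012:role/MyRedshiftRole'
FORMAT AS PARQUET;
```

We will unpack the options these commands take (file formats, compression, parallelism) as the chapter goes on.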