Beyond the Basics: The Kubernetes Ecosystem
We have built some incredible pipelines over the last few posts. But if you were to take what we’ve built and put it into production today, you’d quickly realize that there is a lot more to managing a platform than just getting the YAML files right.
In the final chapter of Big Data on Kubernetes, Neylson Crepalde gives us a roadmap for everything else we need to master to be “production-ready.”
Observability: Knowing what’s happening
When a Spark job fails at 3 AM, you don’t want to be manually digging through kubectl logs. You need Observability.
- Prometheus & Grafana: The de facto standard pair for metrics. Prometheus scrapes and stores them; Grafana visualizes them. You want to see CPU usage, memory pressure, and network latency trending the wrong way before the cluster crashes, not after.
- Tracing: Tools like Jaeger help you follow a single request or record as it travels from Kafka through Spark and into Elasticsearch, so you can see which hop is slow or failing.
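To make the metrics side a little more concrete, here is a minimal sketch of what a Prometheus scrape target returns: plain text in the Prometheus exposition format. It is not from the book, and the metric name and pod labels are hypothetical. In practice you would use an official client library rather than formatting strings by hand.

```python
def render_gauge(name, help_text, samples):
    """Render one gauge in the Prometheus text exposition format.

    samples is a list of (labels_dict, value) pairs.
    """
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} gauge"]
    for labels, value in samples:
        if labels:
            # Sort labels so the output is stable between runs.
            label_str = ",".join(f'{k}="{v}"' for k, v in sorted(labels.items()))
            lines.append(f"{name}{{{label_str}}} {value}")
        else:
            lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


# Hypothetical metric and pod names, for illustration only.
example = render_gauge(
    "spark_executor_memory_used_bytes",
    "Memory used by a Spark executor.",
    [({"pod": "spark-exec-1"}, 1500000000), ({"pod": "spark-exec-2"}, 900000000)],
)
print(example)
```

Prometheus pulls text like this from each target on a schedule, and Grafana then queries Prometheus to draw the dashboards.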
Networking: The Service Mesh
As you add more tools, managing the communication between them becomes a nightmare. This is where a Service Mesh (like Istio or Linkerd) comes in. It handles load balancing, retries, and encryption (typically mutual TLS) between your services automatically, without changes to your application code. It’s like an “advanced networking” layer that lives on top of Kubernetes.
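To see what “retries handled automatically” saves you from, here is a rough sketch of the retry-with-backoff logic you would otherwise write inside every service. It is an illustration of the idea, not how Istio or Linkerd implement it; the function name and defaults are assumptions.

```python
import time


def call_with_retries(func, max_attempts=3, base_delay=0.1, sleep=time.sleep):
    """Call func(), retrying on ConnectionError with exponential backoff.

    The delay doubles after each failed attempt: base_delay, 2x, 4x, ...
    The last failure is re-raised so the caller still sees the error.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except ConnectionError:
            if attempt == max_attempts:
                raise
            sleep(base_delay * 2 ** (attempt - 1))


# Hypothetical flaky dependency: fails twice, then succeeds.
calls = {"count": 0}


def flaky_request():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("upstream not ready")
    return "ok"


print(call_with_retries(flaky_request, base_delay=0.01))
```

With a mesh, this policy lives in the proxy configuration instead, so every service gets the same behavior without carrying its own copy of this code.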
Security: Trust No One
Big data is often sensitive data. You can’t just leave your cluster open.
- RBAC: Fine-tune who can do what. Your analysts should be able to query Trino, but they shouldn’t be able to delete your Kafka brokers.
- Network Policies: Lock down your namespaces so that only the necessary pods can talk to each other.
- Secrets Management: Use external vaults (like HashiCorp Vault or AWS Secrets Manager) instead of relying only on plain Kubernetes Secrets, which are base64-encoded by default, not encrypted.
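As a sketch of the first two bullets, the snippet below builds a read-only RBAC Role and a NetworkPolicy as Python dictionaries and prints them as JSON, which `kubectl apply -f` accepts alongside YAML. The namespace, object names, and `app` labels are hypothetical; adapt them to however your own workloads are labeled.

```python
import json


def read_only_role(namespace, name):
    """A Role that can look at pods and services but not change them."""
    return {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "Role",
        "metadata": {"name": name, "namespace": namespace},
        "rules": [
            {
                "apiGroups": [""],  # "" is the core API group
                "resources": ["pods", "services"],
                "verbs": ["get", "list", "watch"],  # no create/update/delete
            }
        ],
    }


def allow_ingress_from(namespace, name, target_app, source_app):
    """A NetworkPolicy letting only source_app pods reach target_app pods."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "podSelector": {"matchLabels": {"app": target_app}},
            "policyTypes": ["Ingress"],
            "ingress": [
                {"from": [{"podSelector": {"matchLabels": {"app": source_app}}}]}
            ],
        },
    }


# Hypothetical names and labels, for illustration only.
role = read_only_role("data", "analyst-read-only")
policy = allow_ingress_from("data", "kafka-from-spark", "kafka", "spark")
print(json.dumps(role, indent=2))
print(json.dumps(policy, indent=2))
```

A Role does nothing until a RoleBinding attaches it to a user, group, or service account, and a NetworkPolicy is only enforced if the cluster's network plugin supports it.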
GitOps: The Single Source of Truth
We’ve talked about “DevOps for Data,” but the ultimate version of this is GitOps. With tools like Argo CD, your Git repository becomes the single source of truth. If you change a YAML file in Git, Argo CD detects the difference and, with automated sync enabled, updates your cluster to match. This keeps your production environment from “drifting” away from your configuration.
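The core idea behind drift detection can be sketched in a few lines: compare what Git declares with what the cluster reports. This is a simplified illustration, not Argo CD's actual diffing logic, and the manifests below are hypothetical. Fields that exist only in the live object are ignored here, since the API server adds many of its own.

```python
def find_drift(desired, live, path=""):
    """Return a list of fields declared in `desired` that differ in `live`."""
    drift = []
    for key in sorted(desired):
        here = f"{path}.{key}" if path else key
        if key not in live:
            drift.append(f"{here}: missing in cluster")
        elif isinstance(desired[key], dict) and isinstance(live[key], dict):
            # Recurse into nested sections such as spec or metadata.
            drift.extend(find_drift(desired[key], live[key], here))
        elif desired[key] != live[key]:
            drift.append(f"{here}: Git={desired[key]!r} cluster={live[key]!r}")
    return drift


# Hypothetical state: someone scaled the deployment by hand.
in_git = {"spec": {"replicas": 3, "image": "spark:3.5"}}
in_cluster = {
    "spec": {"replicas": 5, "image": "spark:3.5"},
    "status": {"readyReplicas": 5},
}
for line in find_drift(in_git, in_cluster):
    print(line)
```

A GitOps controller runs a comparison like this continuously and then either reports the application as out of sync or reverts the manual change.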
FinOps: Keeping the Lights On
Kubernetes makes it easy to scale, which also makes it easy to spend a lot of money very quickly. Kubernetes cost control, one part of the broader practice known as FinOps, is about using tools like Kubecost to see exactly how much each team or project is costing you in cloud fees. It’s about “right-sizing” your resources so you aren’t paying for idle CPU cores.
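The arithmetic behind “right-sizing” is simple enough to sketch: the gap between what a workload requests and what it actually uses is what you pay for without benefit. The figures below are made up, and this is not how Kubecost computes its reports; it only illustrates the idea.

```python
def idle_cost(requested_cores, used_cores, price_per_core_hour, hours):
    """Cost of CPU that was reserved but not used over a period."""
    idle_cores = max(requested_cores - used_cores, 0.0)
    return idle_cores * price_per_core_hour * hours


# Hypothetical numbers: 8 cores requested, 2 used on average,
# at 0.04 per core-hour, over a 720-hour month.
wasted = idle_cost(requested_cores=8, used_cores=2, price_per_core_hour=0.04, hours=720)
print(f"Idle CPU cost this month: {wasted:.2f}")
```

Lowering the resource request toward real usage, with some headroom for spikes, is what shrinks that number.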
The Team
The book ends with a great point: this isn’t just a technology challenge; it’s a people challenge. To run a platform like this, you need a blend of:
- Data Engineers who understand the logic.
- DevOps/SREs who understand the infrastructure.
- FinOps practitioners who understand the business value.
It’s been a wild ride through this book. In the final closing post, I’ll share my overall impressions and key takeaways for anyone looking to start this journey.
Book Details:
- Title: Big Data on Kubernetes: A practical guide to building efficient and scalable data solutions
- Author: Neylson Crepalde
- ISBN: 978-1-83546-214-0