Kubernetes for Database Engineers: A Practical Guide

As someone who spent years managing databases on bare metal and VMs before transitioning to Kubernetes, I've seen both sides. Here's what database engineers need to know about K8s.

Why Databases on Kubernetes?

The industry is moving toward running stateful workloads on Kubernetes. The benefits are real:

Consistent deployment across environments
Self-healing with automated failover
Resource efficiency through bin-packing
Declarative configuration as code

The Gotchas

Storage Classes Matter

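Not all storage is created equal. For databases, you need SSD-backed volumes with predictable IOPS, no host-level caching getting in the way of writes, and the ability to grow volumes without a rebuild. On Azure, for example: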

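apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-ssd
provisioner: disk.csi.azure.com
parameters:
  skuName: Premium_LRS   # Premium SSD, locally redundant
  cachingMode: None      # Disable Azure host caching; critical for write-heavy databases
reclaimPolicy: Retain    # Keep the disk if the PVC is deleted (the default is Delete)
volumeBindingMode: WaitForFirstConsumer  # Provision the disk in the zone where the pod is scheduled
allowVolumeExpansion: true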
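
To show how a workload would consume this class, here is a minimal sketch of a claim against fast-ssd. The claim name and the size are illustrative assumptions, not values from a real deployment. In a StatefulSet, the same spec would typically sit under volumeClaimTemplates so that each pod gets its own disk.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: cassandra-data        # hypothetical name
spec:
  accessModes:
    - ReadWriteOnce           # one node mounts the disk at a time
  storageClassName: fast-ssd  # the StorageClass defined above
  resources:
    requests:
      storage: 512Gi          # illustrative size
```

On Azure Premium SSD, provisioned IOPS scale with disk size, so the requested size affects performance as well as capacity.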

Pod Disruption Budgets

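Always set PDBs for your database pods. Without one, a node drain or rolling node upgrade can evict several database pods at once and take the database cluster below quorum. PDBs only limit voluntary disruptions such as drains; they do nothing for node crashes. With a three-pod ring, the budget below allows at most one eviction at a time: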

apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: cassandra-pdb
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: cassandra
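
One caveat with a fixed minAvailable is that it does not track ring size: with six pods, minAvailable: 2 would permit four simultaneous evictions. A possible variant, assuming the pods are managed by a StatefulSet so Kubernetes can work out the expected replica count, is to cap evictions directly with maxUnavailable. It is meant as a replacement for the budget above, not an addition.

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: cassandra-pdb
spec:
  maxUnavailable: 1     # at most one pod evicted at a time, whatever the ring size
  selector:
    matchLabels:
      app: cassandra
```

A PDB accepts either minAvailable or maxUnavailable, not both.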

Lessons from the Field

After managing Cassandra, DSE, and various other databases on K8s, here is my top advice:

Use operators - don't reinvent lifecycle management
Monitor storage IOPS - it's usually the bottleneck
Test failure scenarios - kill pods, drain nodes, simulate network partitions
Keep backups external - don't rely solely on PVs
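
To make the first point concrete, here is a rough sketch of what handing lifecycle management to an operator can look like. It assumes cass-operator from the K8ssandra project, which this article does not name. The cluster name, datacenter name, version, and size are hypothetical. Field names follow the CassandraDatacenter resource as I understand it, so check them against the operator version you install.

```yaml
apiVersion: cassandra.datastax.com/v1beta1
kind: CassandraDatacenter
metadata:
  name: dc1                      # hypothetical datacenter name
spec:
  clusterName: cluster1          # hypothetical cluster name
  serverType: cassandra          # "dse" for DataStax Enterprise
  serverVersion: "4.0.1"         # illustrative version
  size: 3                        # number of Cassandra nodes
  storageConfig:
    cassandraDataVolumeClaimSpec:
      storageClassName: fast-ssd # the StorageClass from earlier
      accessModes:
        - ReadWriteOnce
      resources:
        requests:
          storage: 512Gi         # illustrative size
```

The operator then creates the pods and volume claims and handles tasks such as scaling and rolling restarts, which is the lifecycle work you would otherwise script by hand.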

The CKA certification gave me the foundation, but production experience taught me the nuances.