Kubernetes Deep Dive

11 min read · Updated 2026-04-25

Kubernetes is a sophisticated distributed system that demonstrates the core principles of modern container orchestration at scale. The platform implements fundamental distributed-systems concepts (coordination, consistency, fault tolerance) through a carefully designed architecture that separates control logic from execution.

When you run kubectl apply -f deployment.yaml, you trigger a coordinated sequence across multiple distributed components, each implementing specific distributed-systems patterns.

Two Planes

Kubernetes splits cleanly into a control plane (decision-making) and a data plane (execution).

Control plane
The brain
Runs on control-plane nodes. Manages cluster state. API server, etcd (state store), scheduler, controller managers. Stateless except for etcd.
Data plane
The hands
Runs on worker nodes. Executes workloads. kubelet (node agent), kube-proxy (service networking), container runtime (containerd, CRI-O).

This split enables horizontal scaling, fault tolerance, and clear separation of concerns: the same architectural principles you'd use for any large distributed system.

Control Plane Components

kube-apiserver
The single source of truth interface. All cluster state changes go through here. Validates, authenticates, persists to etcd. Horizontally scalable.
etcd
Distributed KV store backed by Raft consensus. Holds all cluster state. The reliability of etcd determines the reliability of Kubernetes.
kube-scheduler
Decides which node a new Pod should run on. Considers resource requests, affinity rules, taints/tolerations. Pluggable.
kube-controller-manager
Runs the reconciliation loops (Deployment controller, ReplicaSet controller, Node controller, etc.). Watches state, makes it match desired state.
cloud-controller-manager
Cloud-specific controllers (load balancers, persistent volumes, node lifecycle). Lets the rest of K8s stay cloud-agnostic.

Data Plane Components

kubelet
Node agent. Watches the API server for Pods assigned to its node. Calls the container runtime to start/stop containers. Reports node and Pod status back.
kube-proxy
Service networking on each node. Maintains iptables/IPVS rules to route Service traffic to backend Pods. On some clusters it is replaced entirely by eBPF-based alternatives such as Cilium's kube-proxy replacement.
Container runtime
Actually runs containers. containerd (the default), CRI-O. Implements the Container Runtime Interface (CRI) so K8s doesn't care which runtime you use.

The Reconciliation Loop

The pattern that makes Kubernetes work: declare desired state, the system continuously moves actual state toward it.

1. User declares desired state (Deployment YAML).
2. Controller watches the API for changes.
3. Controller compares desired vs. actual state.
4. Controller takes action to reduce the difference.
5. Loop forever.
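The desired state declared in step 1 might look like the following minimal Deployment (names and image are hypothetical). The Deployment controller watches this object and keeps three matching Pods running, recreating any that fail:

```yaml
# Hypothetical desired state: three replicas of a stateless web server.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: nginx:1.27
          ports:
            - containerPort: 80
```

Delete one of the resulting Pods and the controller observes actual state (2 replicas) drifting from desired state (3) and creates a replacement — that is the loop in action.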

Scheduling

When a Pod is created, the scheduler picks a node. The decision involves:

Resource requests
Pod says "I need 100m CPU and 256Mi memory." Scheduler finds nodes with enough free capacity.
Node selectors
Pin pods to nodes with specific labels (e.g., GPU nodes, region-specific nodes).
Affinity / anti-affinity
Pods that should run together (data + worker) or apart (replicas of the same service).
Taints and tolerations
Nodes can repel pods unless the pod explicitly tolerates the taint. Used for dedicated node pools.
Topology spread
Distribute pod replicas across zones for HA.
Custom plugins
Scheduler is pluggable. Custom scoring functions for specialized needs.
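A single Pod spec can combine several of these mechanisms. The sketch below is illustrative (the gpu label, dedicated taint key, and image name are assumptions, not standard values):

```yaml
# Hypothetical Pod using resource requests, a node selector,
# a toleration for a dedicated pool, and zone spreading.
apiVersion: v1
kind: Pod
metadata:
  name: worker
  labels:
    app: worker
spec:
  containers:
    - name: worker
      image: example/worker:1.0   # hypothetical image
      resources:
        requests:
          cpu: 100m
          memory: 256Mi
  nodeSelector:
    gpu: "true"                   # only nodes labeled gpu=true
  tolerations:
    - key: dedicated              # tolerate the taint on the dedicated pool
      operator: Equal
      value: gpu
      effect: NoSchedule
  topologySpreadConstraints:
    - maxSkew: 1
      topologyKey: topology.kubernetes.io/zone
      whenUnsatisfiable: ScheduleAnyway
      labelSelector:
        matchLabels:
          app: worker
```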

Workload Resources

A few of the most-used resource types:

Pod
Smallest deployable unit. One or more containers that share network and storage. Usually managed by higher-level resources.
Deployment
Manages ReplicaSets for stateless workloads. Provides rolling updates and rollbacks. The default for stateless services.
StatefulSet
For stateful workloads (databases, brokers). Stable network identity, ordered deployment, persistent storage.
DaemonSet
One pod per node (or matching node selector). Used for log collectors, monitoring agents, network plugins.
Job / CronJob
Run-to-completion workloads. Job for one-off; CronJob for scheduled.
HPA / VPA
Horizontal Pod Autoscaler (more replicas) and Vertical Pod Autoscaler (more resources per pod). Auto-scaling based on metrics.
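As one example of these resources in practice, an HPA targeting a Deployment might look like this sketch (the Deployment name and thresholds are assumptions):

```yaml
# Hypothetical HPA: scale the "web" Deployment between 2 and 10
# replicas, targeting 70% average CPU utilization.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
```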

Service and Networking

Pods are ephemeral and have changing IPs. Services provide stable endpoints:

ClusterIP
Internal-only virtual IP. Default service type. Other pods reach it by DNS name (my-service.namespace.svc.cluster.local).
NodePort
Exposes service on each node's IP at a static port. For dev/testing or specific use cases.
LoadBalancer
Provisions a cloud LB (ELB on AWS, etc.). Standard for public services in cloud environments.
Ingress
HTTP/HTTPS routing. Path-based, host-based. nginx-ingress, Traefik, AWS ALB Controller. The right way to expose multiple services through one LB.
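A ClusterIP Service fronting Pods, plus an Ingress routing a hostname to it, might be declared as follows (service name, host, and ports are illustrative; ingressClassName depends on which controller you run):

```yaml
# Hypothetical ClusterIP Service selecting Pods labeled app=web.
apiVersion: v1
kind: Service
metadata:
  name: my-service
spec:
  selector:
    app: web
  ports:
    - port: 80          # stable Service port
      targetPort: 8080  # container port on the backend Pods
---
# Hypothetical Ingress routing app.example.com to the Service.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: web
spec:
  ingressClassName: nginx
  rules:
    - host: app.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: my-service
                port:
                  number: 80
```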

Configuration and Secrets

ConfigMap
Non-sensitive config
Key-value pairs. Mounted as files or environment variables. Application config, feature flags, public URLs.
Secret
Sensitive data
Same shape as ConfigMap, but base64-encoded (encoding, not encryption) and optionally encrypted at rest in etcd. Database passwords, API keys, certs.
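Side by side, the two look like this sketch (names and values are hypothetical; stringData lets you write plaintext that the API server stores base64-encoded):

```yaml
# Hypothetical ConfigMap for non-sensitive app config.
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
data:
  LOG_LEVEL: info
  API_URL: https://api.example.com
---
# Hypothetical Secret for sensitive values.
apiVersion: v1
kind: Secret
metadata:
  name: app-secrets
type: Opaque
stringData:
  DB_PASSWORD: change-me
```

A Pod can consume both as environment variables via envFrom, or mount them as files.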

For production, integrate with a real secret manager (AWS Secrets Manager, Vault) via External Secrets Operator. Native Secrets are not enough on their own.

Storage

Volume
Storage attached to a Pod. Many types: emptyDir (ephemeral), configMap, secret, hostPath, NFS, cloud volumes.
PersistentVolume / PersistentVolumeClaim
Decoupled storage. Cluster admin provisions PVs; users claim them via PVCs. Durable across pod restarts.
StorageClass
Defines how PVs are dynamically provisioned. EBS gp3, NFS, etc. Each cluster typically has multiple StorageClasses for different needs.
CSI (Container Storage Interface)
Standard interface for storage drivers. Cloud providers, vendors implement CSI; Kubernetes stays storage-agnostic.
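Dynamic provisioning ties these together: a StorageClass names a CSI provisioner, and a PVC referencing that class triggers creation of a matching PV. A sketch, assuming the AWS EBS CSI driver (class name and sizes are illustrative):

```yaml
# Hypothetical StorageClass backed by the AWS EBS CSI driver (gp3).
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast
provisioner: ebs.csi.aws.com
parameters:
  type: gp3
---
# Hypothetical claim: dynamically provisions a 100Gi volume of class "fast".
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: fast
  resources:
    requests:
      storage: 100Gi
```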

Operators and CRDs

Custom Resource Definitions (CRDs) let you extend the Kubernetes API. Operators package up CRDs with controllers, exposing higher-level abstractions specific to your domain.

```yaml
# A custom resource managed by an operator
apiVersion: postgres.example.com/v1
kind: PostgresCluster
metadata:
  name: my-db
spec:
  replicas: 3
  storage: 100Gi
  version: "15"
```

The operator watches PostgresCluster resources and reconciles them: provisioning Postgres clusters, handling failover, managing backups. The user thinks at the level of "I want a Postgres cluster" rather than "I want StatefulSets, PVCs, ConfigMaps...". Operators are how complex stateful systems are managed well in K8s.

Multi-Tenancy in Kubernetes

For multi-tenant SaaS, K8s supports multiple isolation strategies:

Namespace per tenant
Logical separation, RBAC scoping, ResourceQuotas. Cheapest. Suits cooperative multi-tenancy where you trust your tenants.
Cluster per tenant
Strongest isolation. Higher operational cost. Required for hard compliance (HIPAA per-tenant, certain regulated industries).
Virtual clusters
vcluster, Capsule. Each tenant gets a virtualized control plane on a shared host cluster. Middle ground.
Hybrid
Big enterprise tenants on dedicated clusters; everyone else on shared with namespace isolation. Common in B2B SaaS.
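The namespace-per-tenant approach typically pairs a Namespace with a ResourceQuota (and RBAC bindings, omitted here). A sketch with a hypothetical tenant name and limits:

```yaml
# Hypothetical tenant namespace with a quota capping its resource usage.
apiVersion: v1
kind: Namespace
metadata:
  name: tenant-acme
---
apiVersion: v1
kind: ResourceQuota
metadata:
  name: tenant-acme-quota
  namespace: tenant-acme
spec:
  hard:
    requests.cpu: "10"
    requests.memory: 20Gi
    pods: "50"
```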

What Makes Kubernetes Hard

Kubernetes has a real learning curve: many layered abstractions, sprawling YAML, complex networking, and significant operational burden. The flip side: when you do master it, you have a portable, declarative, self-healing, automation-friendly platform that the entire cloud-native ecosystem builds on.

Recap