Kubernetes is a sophisticated distributed system that demonstrates the core principles of modern container orchestration at scale. The platform implements fundamental distributed-systems concepts (coordination, consistency, fault tolerance) through a carefully designed architecture that separates control logic from execution.
When you run kubectl apply -f deployment.yaml, you trigger a coordinated sequence across multiple distributed components, each implementing specific distributed-systems patterns.
Two Planes
Kubernetes splits cleanly into a control plane (decision-making) and a data plane (execution).
Control plane
The brain
Runs on control plane nodes (historically called master nodes). Manages cluster state. API server, etcd (state store), scheduler, controller managers. Stateless except for etcd.
Data plane
The muscle
Runs on every worker node. Executes the control plane's decisions: kubelet starts containers, kube-proxy routes traffic, the container runtime does the work.
This split enables horizontal scaling, fault tolerance, and clear separation of concerns: the same architectural principles you'd use for any large distributed system.
Control Plane Components
kube-apiserver
The single source of truth interface. All cluster state changes go through here. Validates, authenticates, persists to etcd. Horizontally scalable.
etcd
Distributed KV store backed by Raft consensus. Holds all cluster state. The reliability of etcd determines the reliability of Kubernetes.
kube-scheduler
Decides which node a new Pod should run on. Considers resource requests, affinity rules, taints/tolerations. Pluggable.
kube-controller-manager
Runs the reconciliation loops (Deployment controller, ReplicaSet controller, Node controller, etc.). Watches state, makes it match desired state.
cloud-controller-manager
Cloud-specific controllers (Load balancers, persistent volumes, node lifecycle). Lets the rest of K8s stay cloud-agnostic.
Data Plane Components
kubelet
Node agent. Watches the API server for Pods assigned to its node. Calls the container runtime to start/stop containers. Reports node and Pod status back.
kube-proxy
Service networking on each node. Maintains iptables/IPVS rules that route Service traffic to backend Pods. Some clusters replace it entirely with eBPF-based datapaths (Cilium) or layer service meshes (Linkerd) on top.
Container runtime
Actually runs containers. containerd (the default), CRI-O. Implements the Container Runtime Interface (CRI) so K8s doesn't care which runtime you use.
The Reconciliation Loop
The pattern that makes Kubernetes work: declare desired state, the system continuously moves actual state toward it.
1. User declares desired state (Deployment YAML).
2. Controller watches the API for changes.
3. Controller compares desired vs. actual state.
4. Controller takes action to reduce the difference.
5. Loop forever.
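The loop above can be sketched in a few lines of Python. This is a toy simulation of the pattern, not the real controller machinery, which watches streams of events from the API server rather than polling dicts:

```python
# Toy reconciliation loop: desired and actual state are maps of
# workload name -> replica count. One pass computes the actions
# needed to move actual toward desired.

def reconcile(desired: dict, actual: dict) -> list:
    """One pass of the loop: compare states, return corrective actions."""
    actions = []
    for name, want in desired.items():
        have = actual.get(name, 0)
        if have < want:
            actions.append(("scale_up", name, want - have))
        elif have > want:
            actions.append(("scale_down", name, have - want))
    return actions

desired_state = {"web": 3}   # user declared: 3 replicas
actual_state = {"web": 1}    # observed: only 1 running

for op, name, delta in reconcile(desired_state, actual_state):
    print(op, name, delta)   # prints: scale_up web 2
```

A real controller runs this comparison continuously, so even if an action fails or a node dies mid-change, the next iteration observes the drift and corrects it.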
Scheduling
When a Pod is created, the scheduler picks a node. The decision involves:
Resource requests
Pod says "I need 100m CPU and 256Mi memory." Scheduler finds nodes with enough free capacity.
Node selectors
Pin pods to nodes with specific labels (e.g., GPU nodes, region-specific nodes).
Affinity / anti-affinity
Pods that should run together (data + worker) or apart (replicas of the same service).
Taints and tolerations
Nodes can repel pods unless the pod explicitly tolerates the taint. Used for dedicated node pools.
Topology spread
Distribute pod replicas across zones for HA.
Custom plugins
Scheduler is pluggable. Custom scoring functions for specialized needs.
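Several of these constraints compose in a single Pod spec. A sketch (the name, labels, image, and taint key are hypothetical):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: worker
  labels:
    app: worker
spec:
  containers:
    - name: worker
      image: example/worker:1.0      # placeholder image
      resources:
        requests:
          cpu: 100m                  # scheduler finds a node with capacity
          memory: 256Mi
  nodeSelector:
    accelerator: gpu                 # only nodes labeled accelerator=gpu
  tolerations:
    - key: dedicated                 # tolerate the dedicated pool's taint
      operator: Equal
      value: gpu-pool
      effect: NoSchedule
  topologySpreadConstraints:
    - maxSkew: 1                     # spread replicas across zones
      topologyKey: topology.kubernetes.io/zone
      whenUnsatisfiable: ScheduleAnyway
      labelSelector:
        matchLabels:
          app: worker
```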
Workload Resources
A few of the most-used resource types:
Pod
Smallest deployable unit. One or more containers that share network and storage. Usually managed by higher-level resources.
Deployment
Manages ReplicaSets for stateless workloads. Provides rolling updates and rollbacks. The default for stateless services.
DaemonSet
One pod per node (or per node matching a selector). Used for log collectors, monitoring agents, network plugins.
Job / CronJob
Run-to-completion workloads. Job for one-off; CronJob for scheduled.
HPA / VPA
Horizontal Pod Autoscaler (more replicas) and Vertical Pod Autoscaler (more resources per pod). Auto-scaling based on metrics.
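A minimal Deployment tying these ideas together (name and image are hypothetical; the rolling-update settings shown are one reasonable choice, not the defaults):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 3                  # desired state; the controller reconciles toward it
  selector:
    matchLabels:
      app: web
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1        # at most 1 pod down during a rollout
      maxSurge: 1              # at most 1 extra pod above replicas
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: example/web:1.0   # placeholder image
          ports:
            - containerPort: 8080
```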
Service and Networking
Pods are ephemeral and have changing IPs. Services provide stable endpoints:
ClusterIP
Internal-only virtual IP. Default service type. Other pods reach it by DNS name (my-service.namespace.svc.cluster.local).
NodePort
Exposes service on each node's IP at a static port. For dev/testing or specific use cases.
LoadBalancer
Provisions a cloud LB (ELB on AWS, etc.). Standard for public services in cloud environments.
Ingress
HTTP/HTTPS routing. Path-based, host-based. nginx-ingress, Traefik, AWS ALB Controller. The right way to expose multiple services through one LB.
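A ClusterIP Service selecting pods labeled app: web, plus an Ingress routing a hypothetical host to it (names and host are illustrative):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-service
spec:
  type: ClusterIP          # default; internal-only virtual IP
  selector:
    app: web               # route to pods with this label
  ports:
    - port: 80             # the Service's stable port
      targetPort: 8080     # the container's port
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: web-ingress
spec:
  rules:
    - host: app.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: my-service
                port:
                  number: 80
```

Other pods reach the Service by DNS as my-service.namespace.svc.cluster.local; the Ingress controller terminates HTTP at one load balancer and fans out by host and path.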
Configuration and Secrets
ConfigMap
Non-sensitive config
Key-value pairs. Mounted as files or environment variables. Application config, feature flags, public URLs.
Secret
Sensitive data
Same shape as ConfigMap, but base64-encoded and (optionally) encrypted at rest. Database passwords, API keys, certs.
For production, integrate with a real secret manager (AWS Secrets Manager, Vault) via External Secrets Operator. Native Secrets are not enough on their own.
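A sketch of both resources (names and values are hypothetical). The stringData field lets you write Secret values in plain text; the API server stores them base64-encoded:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
data:
  LOG_LEVEL: info            # non-sensitive application config
  FEATURE_X: "true"
---
apiVersion: v1
kind: Secret
metadata:
  name: app-secrets
type: Opaque
stringData:
  DB_PASSWORD: change-me     # sensitive; stored base64-encoded
```

A container can consume both at once with envFrom, listing a configMapRef and a secretRef, or mount them as files in the pod spec.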
Storage
Volume
Storage attached to a Pod. Many types: emptyDir (ephemeral), configMap, secret, hostPath, NFS, cloud volumes.
PersistentVolume / PersistentVolumeClaim
Decoupled storage. Cluster admin provisions PVs; users claim them via PVCs. Durable across pod restarts.
StorageClass
Defines how PVs are dynamically provisioned. EBS gp3, NFS, etc. Each cluster typically has multiple StorageClasses for different needs.
CSI (Container Storage Interface)
Standard interface for storage drivers. Cloud providers, vendors implement CSI; Kubernetes stays storage-agnostic.
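A sketch of dynamic provisioning, assuming the AWS EBS CSI driver; the provisioner name and parameters differ per cloud and per driver:

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-ssd
provisioner: ebs.csi.aws.com   # CSI driver (AWS EBS assumed here)
parameters:
  type: gp3                    # driver-specific volume type
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data
spec:
  accessModes: [ReadWriteOnce]
  storageClassName: fast-ssd   # triggers dynamic PV provisioning
  resources:
    requests:
      storage: 100Gi
```

The user only writes the PVC; the StorageClass and CSI driver create a matching PV on demand, and the volume survives pod restarts.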
Operators and CRDs
Custom Resource Definitions (CRDs) let you extend the Kubernetes API. Operators package CRDs together with controllers, exposing higher-level abstractions specific to your domain.
```yaml
# A custom resource managed by an operator
apiVersion: postgres.example.com/v1
kind: PostgresCluster
metadata:
  name: my-db
spec:
  replicas: 3
  storage: 100Gi
  version: "15"
```
The operator watches PostgresCluster resources and reconciles them: provisioning Postgres clusters, handling failover, managing backups. The user thinks at the level of "I want a Postgres cluster" rather than "I want StatefulSets, PVCs, ConfigMaps…". Operators are how complex stateful systems get well-managed in K8s.
Multi-Tenancy in Kubernetes
For multi-tenant SaaS, K8s supports multiple isolation strategies:
Namespace per tenant
Logical separation, RBAC scoping, ResourceQuotas. Cheapest. Suits cooperative multi-tenancy where you trust your tenants.
Cluster per tenant
Strongest isolation. Higher operational cost. Required for hard compliance (HIPAA per-tenant, certain regulated industries).
Virtual clusters
vcluster, Capsule. Each tenant gets a virtualized control plane on a shared host cluster. Middle ground.
Hybrid
Big enterprise tenants on dedicated clusters; everyone else on shared with namespace isolation. Common in B2B SaaS.
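For the namespace-per-tenant model, the building blocks are a Namespace plus a ResourceQuota capping its footprint (tenant name and limits are hypothetical):

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: tenant-acme
---
apiVersion: v1
kind: ResourceQuota
metadata:
  name: tenant-acme-quota
  namespace: tenant-acme
spec:
  hard:
    requests.cpu: "4"        # total CPU the tenant may request
    requests.memory: 8Gi     # total memory the tenant may request
    pods: "50"               # cap on pod count
```

RBAC Roles and NetworkPolicies scoped to the namespace round out the isolation; the quota prevents one tenant from starving the shared cluster.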
What Makes Kubernetes Hard
A steep learning curve: many interacting abstractions, YAML sprawl, and failures that span distributed components. The flip side: when you do master it, you have a portable, declarative, self-healing, automation-friendly platform that the entire cloud-native ecosystem builds on.
Recap
Kubernetes splits into control plane (etcd, API server, scheduler, controllers) and data plane (kubelet, kube-proxy, runtime).
The reconciliation loop pattern: declare desired state, controllers continuously move actual state toward it.