What is "State"?
Before Kubernetes, understand one word: "State".
State = Data that an application remembers over time.
Some apps remember things (your login, shopping cart, messages). Some don't need to remember anything.
A calculator app — Type 2 + 2, get 4, close it. No memory needed. No state.
A bank app — Must remember your balance. Deposit $100, it MUST save that. Has state.
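The calculator and bank examples can be sketched in a few lines of Python. This is only an illustration: the names `add` and `BankAccount` are made up for this sketch and do not come from any library.

```python
def add(a: int, b: int) -> int:
    """Stateless: the result depends only on the inputs."""
    return a + b


class BankAccount:
    """Stateful: the balance is remembered between calls."""

    def __init__(self) -> None:
        self.balance = 0

    def deposit(self, amount: int) -> int:
        self.balance += amount
        return self.balance


account = BankAccount()
account.deposit(100)
account.deposit(50)

print(add(2, 2))        # same answer every time, nothing is remembered
print(account.balance)  # reflects every earlier deposit
```

If the `account` object is thrown away, the balance is lost with it. That is exactly the problem stateful applications have to solve.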
⚙ Two Types of Applications
Stateless Application
Does NOT save data between requests. Every request is brand new.
- Web servers (Nginx, Apache)
- API gateways
- Load balancers
- Image resizing services
- Token validators
Stateful Application
MUST save data. Data persists between requests and restarts.
- Databases (MySQL, PostgreSQL)
- Message queues (Kafka, RabbitMQ)
- Cache systems (Redis)
- File storage systems
- Search engines (Elasticsearch)
Real-World Analogy: Restaurant vs Hospital
Restaurant (stateless): Any worker can take your order. If Worker A is busy, Worker B helps. It doesn't matter WHO serves you. If a worker goes on break, another replaces them instantly. No worker needs to "remember" you.
Hospital (stateful): Each doctor has specific patients with medical records. You can't swap Doctor A with Doctor B, because Doctor B doesn't know your history! Records are critical and must not be lost.
Stateless = Workers are replaceable, no memory needed.
Stateful = Each instance has unique data that MUST be preserved.
Stateless Applications — Deep Dive
🔍 How Stateless Apps Work
[Diagram: stateless app, 3 identical pods behind a load balancer]
No data is lost when a pod is replaced, because no data was stored!
✓ Characteristics
- No local data storage — Nothing important saved on disk
- Identical pods — Every replica is the same
- Interchangeable — Any pod handles any request
- Easy to scale — Just add more pods!
- Easy to replace — Kill a pod, start new one
- No startup order — Start in any order
💥 When a Pod Dies
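What happens when a stateless pod dies can be simulated roughly as follows. This is a toy model, not Kubernetes code: `handle` and `new_pod_name` are hypothetical helpers, and a random `uuid` suffix stands in for the random pod names a Deployment generates.

```python
import itertools
import uuid


def handle(pod_name: str, request: str) -> str:
    """A stateless handler: the output depends only on the request."""
    return request.upper()


def new_pod_name() -> str:
    # Deployments give pods random suffixes; uuid stands in for that here.
    return f"nginx-{uuid.uuid4().hex[:5]}"


pods = [new_pod_name() for _ in range(3)]

# One pod dies; the controller simply starts a fresh one with a new name.
pods.pop(0)
pods.append(new_pod_name())

# Round-robin over whatever pods exist; any pod gives the same answer.
round_robin = itertools.cycle(pods)
answers = {handle(next(round_robin), "hello") for _ in range(6)}
print(answers)
```

Because every pod returns the same answer, clients cannot tell that one of them was replaced.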
Stateful Applications — Deep Dive
🔍 How Stateful Apps Work
[Diagram: stateful app, unique names and dedicated storage per pod]
If mysql-1 dies, the new mysql-1 gets the SAME disk back!
✓ Characteristics
- Persistent storage — Own disk that survives restarts
- Unique identity — Predictable names: app-0, app-1, app-2
- Ordered startup — 0 first, then 1, then 2
- Ordered shutdown — 2 first, then 1, then 0
- Stable network identity — Permanent DNS name per pod
- Not interchangeable — mysql-0 ≠ mysql-1
💥 When a Pod Dies
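The stateful case can be modelled in the same toy style. Dictionaries stand in for pods and disks here; the claim names follow the `mysql-data-mysql-N` pattern used later in this guide, and everything else is an assumption of the sketch.

```python
# Disks are keyed by claim name, which is derived from the pod name.
disks = {f"mysql-data-mysql-{i}": {} for i in range(3)}
pods = {f"mysql-{i}": f"mysql-data-mysql-{i}" for i in range(3)}

# mysql-1 writes a value to its own disk.
disks[pods["mysql-1"]]["balance"] = 100

# mysql-1 dies: the pod is gone, but its disk is untouched.
del pods["mysql-1"]

# The replacement gets the same name and is re-attached to the same claim.
pods["mysql-1"] = "mysql-data-mysql-1"

print(disks[pods["mysql-1"]])
```

The data survives because it lives on the disk, and the disk is tied to the pod's stable name rather than to any one pod instance.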
Side-by-Side Comparison
| Feature | Stateless | Stateful |
|---|---|---|
| K8s Resource | Deployment | StatefulSet |
| Pod Names | Random (nginx-7d9f8b-x4kl2) | Predictable (mysql-0, mysql-1) |
| Storage | None needed | Own Persistent Volume per pod |
| Network ID | Single Service IP | Own DNS per pod |
| Scale Up | Any order, instant | Ordered (0→1→2) |
| Scale Down | Any order | Reverse (2→1→0) |
| Replacement | Fresh pod | Same name + same disk |
| Complexity | Simple | Complex |
| Examples | Nginx, Node.js, React | MySQL, PostgreSQL, Kafka |
| Service | ClusterIP / LoadBalancer | Headless (clusterIP: None) |
Kubernetes Concepts You Need First
📦 Pod
Smallest unit in K8s. A "wrapper" around your container. Usually one container per pod.
[Diagram: a pod wrapping one container (your app: nginx)]
🚀 Deployment (Stateless)
Tells K8s: "Run 3 copies of my app. If one dies, make a new one." Manages stateless apps.
🔒 StatefulSet (Stateful)
Like Deployment but with superpowers: stable names, stable storage, ordered operations.
🔗 Service
A stable "phone number" for your pods. Pods come and go, but the Service IP stays the same.
💾 PV & PVC
PV = actual disk. PVC = "I need X GB of disk." K8s matches them together.
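The matching idea can be sketched as a small toy model. This is deliberately simplified and is not how the Kubernetes controller is implemented: the `Volume`, `Claim`, and `bind` names are hypothetical, only size and access mode are compared, and the "smallest volume that fits" rule is an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Volume:  # stands in for a PersistentVolume
    name: str
    size_gi: int
    access_modes: tuple
    claimed_by: Optional[str] = None


@dataclass
class Claim:  # stands in for a PersistentVolumeClaim
    name: str
    request_gi: int
    access_mode: str


def bind(claim: Claim, volumes: list) -> Optional[Volume]:
    """Bind the claim to the smallest free volume that satisfies it."""
    candidates = [
        v for v in volumes
        if v.claimed_by is None
        and v.size_gi >= claim.request_gi
        and claim.access_mode in v.access_modes
    ]
    if not candidates:
        return None  # the claim stays unbound
    best = min(candidates, key=lambda v: v.size_gi)
    best.claimed_by = claim.name
    return best


volumes = [
    Volume("pv-small", 5, ("ReadWriteOnce",)),
    Volume("pv-large", 50, ("ReadWriteOnce",)),
    Volume("pv-medium", 10, ("ReadWriteOnce",)),
]
bound = bind(Claim("mysql-data-mysql-0", 10, "ReadWriteOnce"), volumes)
print(bound.name)
```

A claim that no volume can satisfy gets `None` back in this model, which corresponds to a PVC that stays unbound until a suitable disk exists.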
Deploying Stateless — Step by Step
Deploy Nginx web server. Stateless because it just serves pages — no data saved.
# nginx-deployment.yaml
apiVersion: apps/v1
kind: Deployment            # Stateless!
metadata:
  name: nginx-deployment
  labels:
    app: nginx
spec:
  replicas: 3               # 3 identical copies
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: nginx:1.25
          ports:
            - containerPort: 80
          resources:
            requests: { memory: "64Mi", cpu: "100m" }
            limits: { memory: "128Mi", cpu: "250m" }

# nginx-service.yaml
apiVersion: v1
kind: Service
metadata:
  name: nginx-service
spec:
  type: ClusterIP
  selector: { app: nginx }
  ports:
    - port: 80
      targetPort: 80
$ kubectl apply -f nginx-deployment.yaml
$ kubectl apply -f nginx-service.yaml
$ kubectl get pods
NAME READY STATUS AGE
nginx-deployment-7d9f8b7945-ab12c 1/1 Running 30s
nginx-deployment-7d9f8b7945-cd34e 1/1 Running 30s
nginx-deployment-7d9f8b7945-ef56g 1/1 Running 30s
# Pod names are RANDOM!
2 files, a few commands, done. No disks, no ordering, no special naming.
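The `resources` values in the Deployment use Kubernetes quantity notation: `100m` means 100 millicores (a tenth of a CPU core) and `64Mi` means 64 mebibytes. The helpers below are a minimal sketch for reading those values. The function names are hypothetical, and only the suffixes that appear in this guide are handled.

```python
def cpu_to_cores(quantity: str) -> float:
    """Convert a CPU quantity such as '100m' or '2' to a number of cores."""
    if quantity.endswith("m"):
        return int(quantity[:-1]) / 1000  # millicores
    return float(quantity)


# Binary suffixes only; other suffixes are outside the scope of this sketch.
_BINARY_SUFFIXES = {"Ki": 1024, "Mi": 1024 ** 2, "Gi": 1024 ** 3}


def memory_to_bytes(quantity: str) -> int:
    """Convert a memory quantity such as '64Mi' or '10Gi' to bytes."""
    for suffix, factor in _BINARY_SUFFIXES.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    return int(quantity)  # plain number of bytes


print(cpu_to_cores("100m"))
print(memory_to_bytes("64Mi"))
```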
Deploying Stateful — Step by Step
Deploy MySQL. Stateful — data must survive restarts.
Need: persistent storage, stable names, ordered startup, and a headless service. 4 extra things!
# mysql-headless-svc.yaml
apiVersion: v1
kind: Service
metadata:
  name: mysql-headless
spec:
  clusterIP: None           # THIS makes it headless
  selector: { app: mysql }
  ports:
    - port: 3306
With the headless Service in place, each pod of the StatefulSet gets its own stable DNS name:
mysql-0.mysql-headless.default.svc.cluster.local
mysql-1.mysql-headless.default.svc.cluster.local
mysql-2.mysql-headless.default.svc.cluster.local
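These names follow a fixed pattern: pod name, Service name, namespace, then `svc` and the cluster domain. A short sketch that builds them is shown below. The function name is hypothetical, and `cluster.local` is assumed as the cluster domain because that is what the names above use.

```python
def pod_dns_names(statefulset: str, service: str, replicas: int,
                  namespace: str = "default",
                  cluster_domain: str = "cluster.local") -> list:
    """Return the stable DNS name of each pod in a StatefulSet."""
    return [
        f"{statefulset}-{i}.{service}.{namespace}.svc.{cluster_domain}"
        for i in range(replicas)
    ]


for name in pod_dns_names("mysql", "mysql-headless", 3):
    print(name)
```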
# mysql-statefulset.yaml
apiVersion: apps/v1
kind: StatefulSet           # NOT Deployment!
metadata:
  name: mysql
spec:
  serviceName: mysql-headless   # REQUIRED
  replicas: 3
  selector:
    matchLabels: { app: mysql }
  template:
    metadata:
      labels: { app: mysql }
    spec:
      containers:
        - name: mysql
          image: mysql:8.0
          ports:
            - containerPort: 3306
          env:
            - name: MYSQL_ROOT_PASSWORD
              value: "my-secret-password"   # demo only; use a Secret for real deployments
          volumeMounts:
            - name: mysql-data
              mountPath: /var/lib/mysql
  # KEY DIFFERENCE: auto-create disk per pod
  volumeClaimTemplates:
    - metadata:
        name: mysql-data
      spec:
        accessModes: ["ReadWriteOnce"]
        storageClassName: standard
        resources:
          requests:
            storage: 10Gi
$ kubectl apply -f mysql-headless-svc.yaml
$ kubectl apply -f mysql-statefulset.yaml
$ kubectl get pods -w
mysql-0 1/1 Running 45s   # First
mysql-1 1/1 Running 30s   # After mysql-0 ready
mysql-2 1/1 Running 15s   # After mysql-1 ready
$ kubectl get pvc
mysql-data-mysql-0 Bound 10Gi
mysql-data-mysql-1 Bound 10Gi
mysql-data-mysql-2 Bound 10Gi
Stateless: 2 files, random names, no storage. Stateful: 2+ files, predictable names, storage, headless service, ordered startup.
Why Stateful is Harder — 5 Challenges
💾 1. Persistent Storage
Each pod needs its OWN disk surviving restarts. Need PV, PVC, StorageClass knowledge.
🔄 2. Ordered Operations
Start in order (0,1,2), stop in reverse (2,1,0). Master must be ready before replicas.
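The ordering rule can be sketched as follows. This is a simplified model of the default one-at-a-time behaviour, not controller code: the function names are hypothetical, and `is_ready` is a stand-in for a pod's readiness check.

```python
def startup_order(name: str, replicas: int) -> list:
    """Pods are created from ordinal 0 upwards."""
    return [f"{name}-{i}" for i in range(replicas)]


def shutdown_order(name: str, replicas: int) -> list:
    """Pods are removed from the highest ordinal downwards."""
    return list(reversed(startup_order(name, replicas)))


def start_all(name: str, replicas: int, is_ready) -> list:
    """Create pods one at a time, waiting for each to become ready."""
    created = []
    for pod in startup_order(name, replicas):
        created.append(pod)
        if not is_ready(pod):
            break  # later pods are not created until this one is ready
    return created


print(startup_order("mysql", 3))
print(shutdown_order("mysql", 3))

# If mysql-1 never becomes ready, mysql-2 is never created.
print(start_all("mysql", 3, lambda pod: pod != "mysql-1"))
```

The last call shows why a broken replica can stall the whole rollout: everything after it in the sequence waits.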
🌐 3. Unique Network Identity
Headless Service needed to talk to a SPECIFIC pod (e.g., the master).
💾 4. Backup & Recovery
Stateless: no backups needed. Stateful: regular backups, disaster recovery, tested restores.
🔁 5. Data Replication
Master-replica setup with init containers, sidecar scripts. StatefulSet doesn't do this for you.
Persistent Volumes — Deep Dive
🧩 3 Pieces of the Storage Puzzle
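- StorageClass: what kind of disk? SSD? HDD? Cloud?
- PersistentVolume (PV): the actual disk, e.g. 10GB SSD on AWS
- PersistentVolumeClaim (PVC): the request, e.g. 10GB, SSD, RWO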
📄 StorageClass Example
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-ssd
provisioner: kubernetes.io/aws-ebs   # legacy in-tree name; newer clusters use the EBS CSI driver (ebs.csi.aws.com)
parameters:
  type: gp3                 # Fast SSD
reclaimPolicy: Retain       # Keep disk if PVC deleted
🔒 Access Modes
| Mode | Short | Meaning | Use Case |
|---|---|---|---|
| ReadWriteOnce | RWO | Read-write from a single node (pods on that node can share it) | Databases |
| ReadOnlyMany | ROX | Read-only from many nodes | Config files |
| ReadWriteMany | RWX | Read-write from many nodes | Shared NFS |
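The three modes can be expressed as a small rule check. This is a simplified sketch with a hypothetical `can_mount` function, not the real scheduler logic. Note that ReadWriteOnce limits the volume to one node at a time rather than to one pod.

```python
def can_mount(mode: str, node: str, nodes_in_use: set, read_only: bool) -> bool:
    """Decide whether a pod on `node` may mount a volume with this access mode."""
    if mode == "ReadWriteOnce":
        # Read-write, but only from one node at a time.
        return not nodes_in_use or nodes_in_use == {node}
    if mode == "ReadOnlyMany":
        # Any number of nodes, as long as the mount is read-only.
        return read_only
    if mode == "ReadWriteMany":
        # Any number of nodes, read-only or read-write.
        return True
    raise ValueError(f"unknown access mode: {mode}")


print(can_mount("ReadWriteOnce", "node-a", {"node-a"}, read_only=False))
print(can_mount("ReadWriteOnce", "node-b", {"node-a"}, read_only=False))
print(can_mount("ReadWriteMany", "node-b", {"node-a"}, read_only=False))
```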
Headless Service — What & Why
Normal Service
Single IP. Random pod selection.
Good for: Stateless
Headless Service
No cluster IP (clusterIP: None). DNS returns the pod IPs directly, and each StatefulSet pod has its own DNS name.
Good for: Stateful
Databases write to master (pod-0), read from replicas. Must target specific pods. Headless Service makes this possible.
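A client-side sketch of that routing idea is shown below. It assumes pod-0 is the master, as in the text, and that replication is already configured. The `MySQLRouter` class is hypothetical, and the "starts with SELECT" check is a crude stand-in for real read/write detection.

```python
import itertools


class MySQLRouter:
    """Pick a pod DNS name for each query: writes to pod-0, reads to replicas."""

    def __init__(self, statefulset: str, service: str, replicas: int,
                 namespace: str = "default") -> None:
        hosts = [
            f"{statefulset}-{i}.{service}.{namespace}.svc.cluster.local"
            for i in range(replicas)
        ]
        self.primary = hosts[0]
        # Fall back to the primary if there are no replicas.
        self._replicas = itertools.cycle(hosts[1:] or hosts[:1])

    def host_for(self, sql: str) -> str:
        is_read = sql.lstrip().upper().startswith("SELECT")
        return next(self._replicas) if is_read else self.primary


router = MySQLRouter("mysql", "mysql-headless", 3)
print(router.host_for("INSERT INTO orders VALUES (1)"))
print(router.host_for("SELECT * FROM orders"))
print(router.host_for("SELECT * FROM orders"))
```

With a normal Service this would not be possible, because the client could not choose which pod receives the connection.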
Complete Example: Stateless Nginx
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-web
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate: { maxUnavailable: 1, maxSurge: 1 }
  selector:
    matchLabels: { app: nginx-web }
  template:
    metadata:
      labels: { app: nginx-web }
    spec:
      containers:
        - name: nginx
          image: nginx:1.25-alpine
          ports:
            - containerPort: 80
          resources:
            requests: { cpu: "100m", memory: "64Mi" }
            limits: { cpu: "250m", memory: "128Mi" }
          readinessProbe:
            httpGet: { path: /, port: 80 }
          livenessProbe:
            httpGet: { path: /, port: 80 }
---
apiVersion: v1
kind: Service
metadata:
  name: nginx-web-svc
spec:
  type: LoadBalancer
  selector: { app: nginx-web }
  ports:
    - port: 80
      targetPort: 80
Complete Example: Stateful MySQL
# Secret + Headless Service + StatefulSet
---
apiVersion: v1
kind: Secret
metadata: { name: mysql-secret }
type: Opaque
data:
  root-password: bXktc2VjcmV0LXBhc3M=
---
apiVersion: v1
kind: Service
metadata: { name: mysql-headless }
spec:
  clusterIP: None
  selector: { app: mysql }
  ports: [{ port: 3306 }]
---
apiVersion: apps/v1
kind: StatefulSet
metadata: { name: mysql }
spec:
  serviceName: mysql-headless
  replicas: 3
  selector:
    matchLabels: { app: mysql }
  template:
    metadata:
      labels: { app: mysql }
    spec:
      terminationGracePeriodSeconds: 30
      containers:
        - name: mysql
          image: mysql:8.0
          ports: [{ containerPort: 3306 }]
          env:
            - name: MYSQL_ROOT_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: mysql-secret
                  key: root-password
          resources:
            requests: { cpu: "500m", memory: "512Mi" }
            limits: { cpu: "1000m", memory: "1Gi" }
          volumeMounts:
            - name: mysql-data
              mountPath: /var/lib/mysql
          readinessProbe:
            exec:
              command: ["mysqladmin", "ping", "-h", "127.0.0.1"]
            initialDelaySeconds: 30
  volumeClaimTemplates:
    - metadata: { name: mysql-data }
      spec:
        accessModes: ["ReadWriteOnce"]
        storageClassName: standard
        resources:
          requests: { storage: 10Gi }
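The `root-password` value in the Secret is base64-encoded, not encrypted. A quick way to produce or inspect such a value is sketched below with Python's standard `base64` module. The plain-text password shown is simply what the value in the manifest decodes to.

```python
import base64

# Encode a password for the `data:` field of a Secret.
encoded = base64.b64encode(b"my-secret-pass").decode("ascii")
print(encoded)

# Decode the value used in the manifest above.
decoded = base64.b64decode("bXktc2VjcmV0LXBhc3M=").decode("utf-8")
print(decoded)
```

Anyone who can read the Secret can decode it this way, so access to Secrets still needs to be restricted.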
How Scaling Works Differently
Stateless Scaling
# Scale up (instant, parallel)
$ kubectl scale deployment nginx \
    --replicas=10
# Scale down (any order)
$ kubectl scale deployment nginx \
    --replicas=2
Stateful Scaling
# Scale up (ONE by ONE)
$ kubectl scale statefulset mysql \
    --replicas=5
# Scale down (REVERSE order)
$ kubectl scale statefulset mysql \
    --replicas=2
# PVCs NOT deleted!
Stateless: Pods gone, no data loss. Stateful: Pods gone but PVCs kept. Scale back up = data returns. (You still pay for disks!)
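The order of operations during a StatefulSet scale change can be sketched like this. It is a toy model with a hypothetical `scale_plan` function. The set of PVC names is only there to show that scaling down leaves the claims alone, which matches the default behaviour described above.

```python
def scale_plan(name: str, current: int, target: int) -> list:
    """List the pod operations, in order, for a StatefulSet scale change."""
    if target >= current:
        # Scale up: lowest new ordinal first.
        return [("create", f"{name}-{i}") for i in range(current, target)]
    # Scale down: highest ordinal first.
    return [("delete", f"{name}-{i}") for i in range(current - 1, target - 1, -1)]


pvcs = {f"mysql-data-mysql-{i}" for i in range(5)}  # one claim per pod

up = scale_plan("mysql", 3, 5)
down = scale_plan("mysql", 5, 2)

# Scaling down removes pods only; the set of PVCs is left untouched.
print(up)
print(down)
print(sorted(pvcs))
```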
When to Use Which?
Use Deployment (Stateless)
- App doesn't save data locally
- All pods are identical
- Losing a pod is no problem
- Easy, fast scaling needed
- Data lives in external DB
Use StatefulSet (Stateful)
- Each pod needs own storage
- Stable DNS names needed
- Database or message broker
- Ordered startup/shutdown
- Master-replica topology
In practice, many apps are deployed stateless and connect to managed database services (AWS RDS, Cloud SQL). This avoids StatefulSet complexity entirely!
Common Beginner Mistakes
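Running a database without persistent storage: Pod restart = ALL DATA GONE. Use StatefulSet.
Forgetting the headless Service: StatefulSet requires it. No headless = no stable DNS.
Not setting resource limits: One pod eats all resources, crashes everything.
Hardcoding passwords in YAML: Use Kubernetes Secrets. Never commit passwords.
Expecting StatefulSet to replicate data: StatefulSet manages pods/storage/names only. Replication = your job.
Deleting PVCs carelessly: PVCs survive StatefulSet deletion (safety). Manual delete = data gone forever, unless the volume's reclaim policy is Retain.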
Cheat Sheet & Quick Reference
⚡ kubectl Commands
# ===== STATELESS =====
kubectl create deployment nginx --image=nginx
kubectl get deployments
kubectl scale deployment nginx --replicas=5
kubectl rollout status deployment/nginx
kubectl rollout undo deployment/nginx

# ===== STATEFUL =====
kubectl get statefulsets
kubectl get pvc
kubectl scale statefulset mysql --replicas=5
kubectl delete statefulset mysql   # PVCs remain!

# ===== GENERAL =====
kubectl get pods -w                # Watch live
kubectl describe pod mysql-0
kubectl logs mysql-0
kubectl exec -it mysql-0 -- bash
kubectl get all
🛠 Decision Flowchart
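Does each pod need its own storage, ordered startup, or stable DNS?
- No: Stateless (Deployment)
- Yes: Stateful (StatefulSet)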
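The decision can be written down as a tiny helper. This is a sketch based on the criteria listed under "When to Use Which?". The function name and its parameters are hypothetical.

```python
def choose_workload(needs_own_storage: bool,
                    needs_stable_dns: bool,
                    needs_ordered_startup: bool) -> str:
    """Return the Kubernetes resource that fits the app's requirements."""
    if needs_own_storage or needs_stable_dns or needs_ordered_startup:
        return "StatefulSet"
    return "Deployment"


# A web server that keeps its data in an external database.
print(choose_workload(False, False, False))

# A database where every pod needs its own disk and a stable name.
print(choose_workload(True, True, True))
```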
Stateless = simple, replaceable. Stateful = complex, data-safe. Start with stateless. When comfortable, explore stateful. Practice with Minikube or Kind!