Autoscaling That Killed the Database

LESSON #15

Our Kubernetes cluster autoscaled perfectly. Too perfectly.

The incident:

Traffic spike at 10 AM
HPA scaled pods: 5 → 50
Each pod: 20 database connections
Total connections: 100 → 1,000
PostgreSQL max_connections: 200
💥 Connection errors: PostgreSQL rejects new clients with "sorry, too many clients already"
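The arithmetic behind the incident can be checked in a few lines. This is a minimal sketch using only the numbers quoted above; the function names are illustrative and not taken from any real tooling.

```python
def total_connections(pods: int, conns_per_pod: int) -> int:
    """Connections the app tier opens when every pod fills its pool."""
    return pods * conns_per_pod


def exceeds_limit(pods: int, conns_per_pod: int, max_connections: int) -> bool:
    """True when the app tier asks for more connections than the database allows."""
    return total_connections(pods, conns_per_pod) > max_connections


MAX_CONNECTIONS = 200  # PostgreSQL max_connections from the incident
CONNS_PER_POD = 20     # per-pod pool size from the incident

for pods in (5, 50):
    demand = total_connections(pods, CONNS_PER_POD)
    status = "over limit" if exceeds_limit(pods, CONNS_PER_POD, MAX_CONNECTIONS) else "ok"
    print(f"{pods} pods -> {demand} connections ({status})")
# 5 pods -> 100 connections (ok)
# 50 pods -> 1000 connections (over limit)
```

With these numbers the limit is already crossed at 11 pods, long before the HPA reaches 50.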

The cascade:

New pods can't connect to database
Health checks fail
Pods restart
Connection storm on restart
Entire cluster thrashing

The fix:

PgBouncer as connection pooler
1,000 app connections → 100 database connections
HPA maxReplicas capped at a level the database can sustain
Readiness probe passes only once the pod has a working DB connection
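A readiness endpoint along these lines might look like the sketch below. It is an assumption about shape, not the handler used in the incident: `readiness_response` and `check_db` are hypothetical names, and `check_db` stands in for whatever cheap query (for example `SELECT 1` through the pooler) the application would run. A Kubernetes HTTP probe treats status codes from 200 to 399 as success, and a failing readiness probe takes the pod out of Service endpoints without restarting it.

```python
from typing import Callable, Tuple


def readiness_response(check_db: Callable[[], None]) -> Tuple[int, str]:
    """Map a database check onto an HTTP status for a readiness endpoint.

    check_db is a hypothetical callable that raises if the database
    cannot be reached; it should be cheap and have a short timeout.
    """
    try:
        check_db()
    except Exception as exc:
        # Not ready: the pod receives no traffic but is not restarted.
        return 503, f"database not reachable: {exc}"
    return 200, "ready"


def healthy_db() -> None:
    """Stand-in for a successful connection check."""
    return None


def saturated_db() -> None:
    """Stand-in for a database that refuses new connections."""
    raise ConnectionError("too many clients")


print(readiness_response(healthy_db))
print(readiness_response(saturated_db))
```

Keeping the database check out of the liveness probe matters here: a pod that merely cannot reach the database should stop receiving traffic, not restart and add to the connection storm.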

Lesson: Autoscaling doesn't know about downstream limits. You must enforce them.
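One way to enforce the limit is to derive the replica cap from the database budget instead of guessing it. The sketch below assumes pods connect directly to PostgreSQL; the `reserved` parameter (headroom for admin sessions, migrations, and monitoring) is an assumption, and the resulting numbers are derived from the figures above, not the cap actually chosen after the incident. With a pooler such as PgBouncer in front, the same calculation would apply to the pooler's own client limit instead.

```python
def max_sustainable_replicas(
    max_connections: int, conns_per_pod: int, reserved: int = 0
) -> int:
    """Largest replica count whose combined pools fit in the connection budget."""
    if conns_per_pod <= 0:
        raise ValueError("conns_per_pod must be positive")
    usable = max_connections - reserved
    return max(usable // conns_per_pod, 0)


# Incident numbers: 200 max_connections, 20 connections per pod.
print(max_sustainable_replicas(200, 20))               # 10
# Same numbers with 20 connections held back as headroom (assumed value).
print(max_sustainable_replicas(200, 20, reserved=20))  # 9
```

The result is the value to compare against the HPA's `maxReplicas`: any higher setting lets the autoscaler request more connections than the database can grant.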

Tags: #Kubernetes #Scaling #Database #Performance

Graf Clouds

© GRAF CLOUDS 2024 All Rights Reserved