Horizontal Scaling Bottleneck
We scaled horizontally to 50 instances. The bottleneck just moved.
Phase 1: Application bottleneck
- 5 app servers, 100% CPU
- Solution: Scale to 20 servers
- Result: App CPU 30%
Phase 2: Database bottleneck
- Database: 100% CPU, 5000 connections
- Solution: Connection pooling, read replicas
- Result: Database CPU 60%
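To illustrate the connection pooling part of Phase 2, here is a minimal sketch of a fixed-size pool. It is not the setup we ran in production: `ConnectionPool`, `pool_size`, and `idle_count` are hypothetical names, and `sqlite3` stands in for the real database so the example runs on its own. The point is that callers borrow one of a small number of open connections instead of each opening their own.

```python
import queue
import sqlite3
from contextlib import contextmanager


class ConnectionPool:
    """Hypothetical fixed-size pool: borrow a connection, then hand it back."""

    def __init__(self, database: str, pool_size: int) -> None:
        self._idle: queue.Queue = queue.Queue(maxsize=pool_size)
        for _ in range(pool_size):
            # check_same_thread=False lets a connection move between threads.
            self._idle.put(sqlite3.connect(database, check_same_thread=False))

    @contextmanager
    def connection(self, timeout: float = 5.0):
        # Blocks while every connection is in use; raises queue.Empty on timeout.
        conn = self._idle.get(timeout=timeout)
        try:
            yield conn
        finally:
            # Always return the connection, even if the caller raised.
            self._idle.put(conn)

    def idle_count(self) -> int:
        return self._idle.qsize()


# The database never sees more than pool_size connections from this process.
pool = ConnectionPool(":memory:", pool_size=2)
with pool.connection() as conn:
    result = conn.execute("SELECT 1 + 1").fetchone()[0]
```

With a pool like this, the connection count seen by the database is bounded by the number of app processes times `pool_size`, rather than growing with request concurrency.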
Phase 3: Load balancer bottleneck
- Single ALB hitting packet limit
- Solution: Multiple ALBs with DNS round-robin
- Result: Traffic distributed
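The DNS round-robin idea from Phase 3 can be sketched as a toy resolver that rotates its answer on every lookup. The hostnames are placeholders (the real ones are not part of this write-up), and `resolve` is a hypothetical helper, not a real DNS API.

```python
from collections import Counter, deque

# Hypothetical load balancer hostnames, used only for illustration.
records = deque(["alb-1.example.com", "alb-2.example.com", "alb-3.example.com"])


def resolve() -> str:
    """Return the first record, then rotate so the next lookup gets the next one."""
    target = records[0]
    records.rotate(-1)
    return target


# Nine lookups spread evenly across the three load balancers.
hits = Counter(resolve() for _ in range(9))
```

Real traffic is usually less even than this, because clients and resolvers cache DNS answers for the record's TTL.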
Phase 4: Message queue bottleneck
- RabbitMQ single node saturated
- Solution: Clustered queue, partitioned topics
- Result: Messages flowing
The lesson:
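Amdahl's Law: speedup is limited by the part that cannot be parallelized. The same logic applies to a request path: throughput is capped by the slowest component.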
System throughput = min(
    app_capacity,
    db_capacity,
    network_capacity,
    queue_capacity,
    external_api_capacity
)
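The two ideas above can be sketched in a few lines of Python. The capacity numbers are invented for illustration (they are not our measurements), and `amdahl_speedup` and `find_bottleneck` are hypothetical helper names. The example also assumes app capacity scales linearly with server count, which is optimistic.

```python
def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Amdahl's Law: speedup = 1 / ((1 - p) + p / n)."""
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / workers)


def find_bottleneck(capacities: dict[str, float]) -> tuple[str, float]:
    """Return the lowest-capacity component and the throughput it allows."""
    name = min(capacities, key=capacities.get)
    return name, capacities[name]


# Hypothetical capacities in requests per second.
capacities = {
    "app": 1000,
    "db": 3000,
    "network": 6000,
    "queue": 5000,
    "external_api": 8000,
}

before = find_bottleneck(capacities)   # the app tier limits the system

# Quadruple the app tier (5 -> 20 servers), assuming linear scaling.
capacities["app"] *= 4

after = find_bottleneck(capacities)    # the database now limits the system
```

In this made-up scenario the app tier got four times the capacity, but system throughput only tripled, from 1000 to 3000 requests per second, because the database became the new minimum.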
What we learned:
- Identify the bottleneck BEFORE scaling
- Load test the entire path
- Monitor all components, not just apps
- Sometimes vertical scaling is cheaper
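One way to act on the first three points is to check every component against the target load before adding capacity anywhere. The sketch below assumes you already have a measured capacity per component from load testing. `components_short_of`, the 20% headroom default, and all numbers are assumptions made up for this example.

```python
def components_short_of(
    target_rps: float,
    capacities: dict[str, float],
    headroom: float = 0.2,
) -> list[str]:
    """List components that cannot carry the target load plus headroom,
    tightest constraint first."""
    required = target_rps * (1 + headroom)
    short = [(name, cap) for name, cap in capacities.items() if cap < required]
    return [name for name, _ in sorted(short, key=lambda item: item[1])]


# Hypothetical measured capacities in requests per second.
measured = {"app": 4000, "db": 3000, "load_balancer": 6000, "queue": 2500}

# Planning for 3000 rps with 20% headroom means each component needs 3600 rps.
to_fix = components_short_of(3000, measured)
```

Here `to_fix` lists the queue first and the database second, so scaling the app tier further would not raise throughput at all.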
Lesson: Scaling one component moves the bottleneck to the next. Plan for end-to-end capacity.