1K to 10K RPS Complexity Explosion
Scaling from 1K to 10K requests per second wasn't 10x harder. It was 100x harder.
What worked at 1K RPS:
- Synchronous service-to-service calls
- Database joins for complex queries
- Logs to CloudWatch without sampling
- Simple round-robin load balancing
What broke at 10K RPS:
- Database connection limits were hit
- Synchronous calls created cascading timeouts
- CloudWatch costs exploded (10x logs = 10x cost)
- Hot keys in caching layer
- Network socket exhaustion
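
One way to see why connections and sockets run out is Little's Law: average in-flight requests = arrival rate x time in the system. Here is a minimal sketch. The 50 ms and 500 ms latencies are made-up numbers for illustration, not measurements from this system, and `in_flight` is a hypothetical helper.

```python
def in_flight(rps: int, latency_ms: int) -> float:
    """Little's Law: average concurrent requests = arrival rate x time in system."""
    return rps * latency_ms / 1000


# Hypothetical latencies, for illustration only.
baseline = in_flight(1_000, 50)    # 1K RPS at 50 ms per request
scaled = in_flight(10_000, 50)     # 10K RPS if latency holds at 50 ms
degraded = in_flight(10_000, 500)  # 10K RPS once latency slips to 500 ms

print(baseline, scaled, degraded)  # 50.0 500.0 5000.0
```

If latency holds, 10x traffic means 10x concurrent connections. If latency also degrades 10x under load, concurrency goes up 100x, and every one of those requests holds a socket and often a database connection.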
New patterns needed:
- Event-driven instead of request-response
- Read replicas and connection pooling
- Log sampling (keep 1% of debug-level logs)
- Rate limiting at edge
- Consistent hashing for cache distribution
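
To make the last item concrete, here is a minimal sketch of a consistent hash ring with virtual nodes. It is not the implementation from this system: the `HashRing` class, the `cache-a` style node names, and the choice of 100 virtual nodes are all assumptions for illustration.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    # Take the first 8 bytes of SHA-256 as a 64-bit position on the ring.
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")


class HashRing:
    """Hypothetical consistent hash ring; each node owns many virtual points."""

    def __init__(self, nodes=(), vnodes=100):
        self.vnodes = vnodes
        self._points = []  # sorted ring positions
        self._owner = {}   # ring position -> node name
        for node in nodes:
            self.add_node(node)

    def add_node(self, node: str) -> None:
        for i in range(self.vnodes):
            point = _hash(f"{node}#{i}")
            bisect.insort(self._points, point)
            self._owner[point] = node

    def get_node(self, key: str) -> str:
        if not self._points:
            raise LookupError("ring has no nodes")
        # First point clockwise from the key's position, wrapping around.
        i = bisect.bisect_right(self._points, _hash(key)) % len(self._points)
        return self._owner[self._points[i]]


keys = [f"user:{n}" for n in range(10_000)]
ring = HashRing(["cache-a", "cache-b", "cache-c"])
before = {k: ring.get_node(k) for k in keys}

ring.add_node("cache-d")
after = {k: ring.get_node(k) for k in keys}

moved = [k for k in keys if before[k] != after[k]]
print(f"{len(moved)} of {len(keys)} keys moved after adding a node")
```

With plain `hash(key) % node_count`, going from three nodes to four would remap most keys and empty the cache at the worst possible moment. On the ring, only the keys that now belong to the new node move. One caveat: a single very hot key still lands on one node, so it needs separate handling such as replication or a local cache.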
The insight:
At each order of magnitude, you're solving different problems. An architecture that works at 1K RPS won't work at 10K. What works at 10K won't work at 100K.
Lesson: Scale testing isn't optional. Your 10x traffic day will find every weakness.