20,000 Microservices Migration
An enterprise migrated 20,000 microservices to Kubernetes. Only 30% (6,000) worked on day one.
The plan:
- "Just containerize and deploy"
- Automated migration tooling
- Big bang cutover weekend
What went wrong:
- 14,000 services failed health checks
- Undocumented hardcoded IPs everywhere
- Services expecting specific filesystem paths
- Hidden dependencies on hostname formats
- Time zone assumptions (services expected local time; container default: UTC)
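Hardcoded addresses like these can often be surfaced before cutover with a simple scan of source and config files. A minimal sketch, assuming the files can be read as text; the function name `find_hardcoded_hosts` is hypothetical, and four-part version strings such as `1.2.3.4` would be flagged as false positives:

```python
import ipaddress
import re

# Candidate IPv4 literals; each one is validated below, so strings
# like "999.1.1.1" are ignored.
_IPV4_CANDIDATE = re.compile(r"(?<![\d.])(?:\d{1,3}\.){3}\d{1,3}(?![\d.])")
_LOCALHOST = re.compile(r"\blocalhost\b", re.IGNORECASE)


def find_hardcoded_hosts(text):
    """Return (line_number, literal) pairs for IPv4 literals and 'localhost'."""
    findings = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for match in _IPV4_CANDIDATE.finditer(line):
            try:
                ipaddress.IPv4Address(match.group())
            except ValueError:
                continue  # not a valid address, e.g. an out-of-range octet
            findings.append((line_number, match.group()))
        for match in _LOCALHOST.finditer(line):
            findings.append((line_number, match.group()))
    return findings
```

Run across every repository, a scan like this gives a rough count of how many services need attention before any of them are moved.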
The discoveries:
- Service A wrote temp files to /opt/app/tmp (not writable in the container)
- Service B parsed its own IP to determine environment
- Service C expected to be on a server named "prod-db-01"
- 2,000+ services had hardcoded `localhost` references
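Most of these assumptions can be replaced with explicit configuration. A minimal sketch of what that might look like for Services A and B, assuming the settings are injected as environment variables; the variable names `APP_TMP_DIR` and `APP_ENV` are hypothetical, not taken from the migration:

```python
import os
import tempfile


def resolve_temp_dir(environ=None):
    """Use APP_TMP_DIR (hypothetical) if set, else the platform temp dir."""
    environ = os.environ if environ is None else environ
    return environ.get("APP_TMP_DIR") or tempfile.gettempdir()


def resolve_environment(environ=None):
    """Read the environment from APP_ENV (hypothetical) instead of parsing an IP."""
    environ = os.environ if environ is None else environ
    env = environ.get("APP_ENV")
    if not env:
        # Fail loudly rather than guess from network details.
        raise RuntimeError("APP_ENV is not set; refusing to guess the environment")
    return env
```

Failing at startup when `APP_ENV` is missing is a deliberate choice here: a service that guesses its environment from its IP fails silently once pod addresses change.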
What should have been done:
- Incremental migration (start with stateless)
- Container-readiness checklist per service
- Canary deployments, with dual-write for services that own data
- Realistic timeline (months, not weeks)
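The checklist and the stateless-first ordering can be combined into a simple wave plan. A rough sketch, assuming each service has already been audited; the `ReadinessReport` fields and the `plan_waves` helper are hypothetical and only mirror the failure modes listed above:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReadinessReport:
    """Per-service checklist result. All field names are hypothetical."""
    name: str
    stateless: bool
    hardcoded_hosts: int = 0
    writes_outside_tmp: bool = False
    assumes_hostname_format: bool = False
    assumes_local_timezone: bool = False

    def blockers(self):
        """Return a label for every checklist item this service fails."""
        checks = [
            (self.hardcoded_hosts > 0, "hardcoded IPs or localhost"),
            (self.writes_outside_tmp, "writes to a fixed filesystem path"),
            (self.assumes_hostname_format, "depends on hostname format"),
            (self.assumes_local_timezone, "assumes a non-UTC time zone"),
        ]
        return [label for failed, label in checks if failed]


def plan_waves(reports):
    """Group services: clean stateless first, clean stateful next, the rest blocked."""
    waves = {"stateless": [], "stateful": [], "blocked": []}
    for report in reports:
        if report.blockers():
            waves["blocked"].append(report.name)
        elif report.stateless:
            waves["stateless"].append(report.name)
        else:
            waves["stateful"].append(report.name)
    return waves
```

Services in the `blocked` group stay on the old platform until their blockers are fixed, which is what turns a cutover weekend into a months-long schedule.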
Lesson: Containerization is not just "docker build". It's infrastructure archaeology.