"Let's log everything in production, just in case."

The outcome: $50,000/month in CloudWatch costs:

  • 500 GB/day of debug logs
  • 99% of logs never read
  • No retention policy set
  • 12 months of accumulation

What we were logging:

  • Every HTTP request body (including large payloads)
  • Every database query with parameters
  • Debug-level trace information
  • Health check responses (every 10 seconds per pod)

The fix:

  • Switched to structured logging
  • Set log levels (WARN in prod, DEBUG in staging)
  • 7-day retention in CloudWatch
  • Long-term logs → S3 with Athena for queries
  • Sampled debug logging (1% of requests)
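
A minimal sketch of how the log-level and sampling items above could look with Python's standard logging module. This is an illustration, not the setup the team actually ran: the names JsonFormatter, SampledDebugFilter, build_logger and request_logger are hypothetical, and it assumes the sampling decision is made once per request so that a sampled request keeps all of its debug lines.

```python
import json
import logging
import random


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object per line (structured logging)."""

    def format(self, record):
        return json.dumps({
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
            "sampled": bool(getattr(record, "sampled", False)),
        })


class SampledDebugFilter(logging.Filter):
    """Always pass WARNING and above; pass lower levels only for sampled requests."""

    def filter(self, record):
        if record.levelno >= logging.WARNING:
            return True
        # "sampled" is a hypothetical attribute set by request_logger() below.
        return bool(getattr(record, "sampled", False))


def build_logger(name, handler):
    """Attach a JSON formatter and the sampling filter to the given handler."""
    logger = logging.getLogger(name)
    # The logger itself stays at DEBUG so sampled debug records can reach
    # the handler; the filter enforces the WARN-in-prod behavior.
    logger.setLevel(logging.DEBUG)
    logger.propagate = False
    handler.setFormatter(JsonFormatter())
    handler.addFilter(SampledDebugFilter())
    logger.addHandler(handler)
    return logger


def request_logger(base, sample_rate=0.01, rng=random):
    """Decide once per request whether its debug output is kept (1% by default)."""
    sampled = rng.random() < sample_rate
    return logging.LoggerAdapter(base, {"sampled": sampled})


if __name__ == "__main__":
    log = build_logger("app", logging.StreamHandler())
    for request_id in range(1000):
        req_log = request_logger(log)
        # Kept only for the requests that were sampled.
        req_log.debug("handling request %d", request_id)
    # Warnings are always emitted, sampled or not.
    log.warning("disk usage above threshold")
```

Sampling per request rather than per log line means the debug output that survives still tells a complete story for the requests it covers.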

Result: $50,000/month → $3,200/month, a cut of roughly 94%.

Lesson: Logs are useful. Paying enterprise storage rates for println("here") is not.

