The Cost of Downtime
Every minute of downtime costs money and trust. Yet many teams still deploy by stopping the old version and starting the new one. This article covers the deployment strategies I use to ship code multiple times per day without users noticing.
Blue-Green Deployment
Blue-green maintains two identical production environments. Blue serves live traffic while green sits idle. To release, you deploy the new version to green, run your verification suite against it, then switch the load balancer to green. Rollback is just switching back to blue, which is still running the previous version.
The catch: you need double the infrastructure. Many teams accept that cost because the idle environment doubles as a staging environment between deployments.
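To make the switch and rollback mechanics concrete, here is a minimal sketch in Python. It is a toy model, not a real load balancer integration: the `BlueGreenRouter` class, its `deploy` and `rollback` methods, and the version strings are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class BlueGreenRouter:
    """Toy stand-in (hypothetical) for a load balancer pointing at one of two environments."""
    versions: Dict[str, str] = field(default_factory=lambda: {"blue": "v1", "green": "v1"})
    live: str = "blue"

    @property
    def idle(self) -> str:
        return "green" if self.live == "blue" else "blue"

    def deploy(self, version: str, verify: Callable[[str], bool]) -> bool:
        """Deploy to the idle environment; move traffic only if verification passes."""
        target = self.idle
        self.versions[target] = version   # the live environment is untouched
        if not verify(version):
            return False                  # traffic never moved, nothing to roll back
        self.live = target                # the "switch the load balancer" step
        return True

    def rollback(self) -> None:
        """Point traffic back at the other environment, which still runs the old version."""
        self.live = self.idle


router = BlueGreenRouter()
deployed = router.deploy("v2", verify=lambda version: True)
print(router.live, router.versions[router.live])
router.rollback()
print(router.live, router.versions[router.live])
```

The point of the design is that `deploy` never modifies the environment that is serving traffic, so a failed verification costs nothing and a rollback is a single pointer change.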
Canary Releases
Canary releases route a small percentage of traffic to the new version, gradually increasing as confidence grows. This catches issues that only appear under real production load:
- Database connection pool exhaustion under real query patterns
- Memory leaks that only trigger with production data volumes
- Third-party API rate limiting that staging does not replicate
# Kubernetes canary with Argo Rollouts
apiVersion: argoproj.io/v1alpha1
kind: Rollout
spec:
  strategy:
    canary:
      steps:
        - setWeight: 10
        - pause: {duration: 5m}
        - setWeight: 25
        - pause: {duration: 10m}
        - setWeight: 50
        - pause: {duration: 15m}
        - setWeight: 100
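To show what a traffic weight means per user, here is a small Python sketch that reuses the weights from the Rollout above. It is an illustration only, and an assumption on my part: Argo Rollouts itself realizes the split through replica counts or a traffic router, not through this code. The `bucket` and `routes_to_canary` helpers and the `user-N` IDs are hypothetical.

```python
import hashlib

# Weights mirror the setWeight steps in the Rollout above.
CANARY_STEPS = [10, 25, 50, 100]


def bucket(user_id: str) -> int:
    """Map a user ID to a stable bucket in [0, 100)."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % 100


def routes_to_canary(user_id: str, weight: int) -> bool:
    """A user is on the canary when their bucket falls below the current weight."""
    return bucket(user_id) < weight


users = [f"user-{i}" for i in range(10_000)]
for weight in CANARY_STEPS:
    share = sum(routes_to_canary(u, weight) for u in users) / len(users)
    print(f"weight={weight:3d}%  observed canary share={share:.1%}")
```

Hashing the user ID instead of rolling a random number per request keeps each user on the same version throughout a step, and a user who reached the canary at 10% stays on it at 25% and beyond.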
Database Migrations
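The hardest part of zero-downtime deployment is database changes, because old and new code run against the same schema while a rollout is in progress. The expand-contract pattern works reliably: first expand the schema so it supports both old and new code, then deploy the new code, then contract by removing the old schema elements in a subsequent deployment, once no running code depends on them.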
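The three phases can be sketched with Python's built-in `sqlite3` module. This is a minimal illustration under stated assumptions, not a production migration: the `users` table, the rename from `fullname` to `display_name`, and the helper functions are all hypothetical, and a real system would run each phase from its migration tool as a separate deployment.

```python
import sqlite3


def expand(conn):
    """Expand: add the new column and backfill it. Old code keeps using fullname."""
    conn.execute("ALTER TABLE users ADD COLUMN display_name TEXT")
    conn.execute("UPDATE users SET display_name = fullname")


def old_code_insert(conn, name):
    """Old instances know nothing about display_name."""
    conn.execute("INSERT INTO users (fullname) VALUES (?)", (name,))


def new_code_insert(conn, name):
    """New instances write both columns so old instances still see complete rows."""
    conn.execute("INSERT INTO users (fullname, display_name) VALUES (?, ?)", (name, name))


def contract(conn):
    """Contract: run in a later deployment, after every old instance is gone."""
    # Catch rows that old code wrote during the rollout.
    conn.execute("UPDATE users SET display_name = fullname WHERE display_name IS NULL")
    if sqlite3.sqlite_version_info >= (3, 35, 0):
        conn.execute("ALTER TABLE users DROP COLUMN fullname")
    else:
        # Older SQLite has no DROP COLUMN, so rebuild the table instead.
        conn.execute("CREATE TABLE users_new (id INTEGER PRIMARY KEY, display_name TEXT)")
        conn.execute("INSERT INTO users_new SELECT id, display_name FROM users")
        conn.execute("DROP TABLE users")
        conn.execute("ALTER TABLE users_new RENAME TO users")


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, fullname TEXT)")
old_code_insert(conn, "Ada Lovelace")

expand(conn)
old_code_insert(conn, "Grace Hopper")   # an old instance still serving during the rollout
new_code_insert(conn, "Alan Turing")
contract(conn)

columns = [row[1] for row in conn.execute("PRAGMA table_info(users)")]
rows = conn.execute("SELECT id, display_name FROM users ORDER BY id").fetchall()
print(columns)
print(rows)
```

Note the second backfill inside `contract`: any row written by old code between the expand step and the end of the rollout has no `display_name`, so it has to be filled in before the old column disappears.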
Key Takeaways
- Blue-green for simple apps with clear version boundaries
- Canary for complex systems where risk must be incrementally validated
- Run expand migrations before deploying application code, and contract migrations only after the old code is fully retired
- Automated rollback on error rate spikes is non-negotiable
- Measure deployment frequency: how often you can ship safely is a strong indicator of deployment health
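As one way to picture the automated rollback takeaway, here is a minimal Python sketch of an error-rate trigger. The `ErrorRateMonitor` class, the window size, the 5% threshold, and the simulated traffic are assumptions for illustration; in practice the signal would come from your metrics system and the action would call your deployment tooling.

```python
from collections import deque


class ErrorRateMonitor:
    """Tracks the outcome of the most recent requests in a fixed-size window (hypothetical)."""

    def __init__(self, window: int = 200, threshold: float = 0.05, min_samples: int = 50):
        self.outcomes = deque(maxlen=window)  # True means the request failed
        self.threshold = threshold
        self.min_samples = min_samples

    def record(self, failed: bool) -> None:
        self.outcomes.append(failed)

    @property
    def error_rate(self) -> float:
        return sum(self.outcomes) / len(self.outcomes) if self.outcomes else 0.0

    def should_roll_back(self) -> bool:
        """Trip only once there is enough data, so one early failure does not abort a deploy."""
        return len(self.outcomes) >= self.min_samples and self.error_rate > self.threshold


monitor = ErrorRateMonitor()
for i in range(100):
    monitor.record(failed=(i % 10 == 0))  # simulated traffic with a 10% failure rate
    if monitor.should_roll_back():
        print(f"rolling back after {i + 1} requests, error rate {monitor.error_rate:.0%}")
        break
```

The `min_samples` guard is the main design choice: without it, a single failed request right after the switch would read as a 100% error rate and trigger a needless rollback.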
Senior Software Engineer specializing in cloud architecture, real-time systems, and enterprise-scale applications.