9 min read
Deploying without losing sleep

Deployment strategies and their trade-offs no one talks about

One of the most critical parts of the software development life cycle is updating a live web app. That is when downtime becomes the elephant in the room (or more like an elephant sitting on your chest). So here we will go over some deployment strategies, with downtime as the major deciding factor.

Big Bang Deployment

With this deployment strategy there is major downtime, because you are literally updating the machine (which is already in use) with your new version in place. Rolling back is not easy either, as you need to re-deploy the old version to the same machine.

Rolling Deployment

Here, instead of just one machine, we use multiple machines and apply Big Bang-style updates to them one by one. First we detach a machine from the load balancer and wait for its in-flight requests to complete. Then we update the app, restart the process, and re-attach the machine to the load balancer. We repeat this for every machine, so at no point do users experience downtime.

But updating machines one by one can be a very time-consuming process at scale, so we can speed it up in two ways:

  • Concurrent machines - we pick a batch size N (5, 20, 100, or even more) and do the rolling update in batches of N. This number N should be neither too large nor too small for your scale.
  • Double-half machines - if we have 10 servers running, we launch 10 more servers with the new code and then gradually terminate the first (old) half. This is more costly, and we also need to ensure our DB and cache can handle double the load for some time.
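The batched rolling update described above can be sketched as a simple loop. The `update` callable and server names here are hypothetical placeholders for your actual load balancer and deploy tooling:

```python
def rolling_deploy(servers, batch_size, update):
    """Roll out in batches: detach each batch from the load balancer,
    drain in-flight requests, update, then re-attach before moving on."""
    updated = []
    for i in range(0, len(servers), batch_size):
        batch = servers[i:i + batch_size]
        # detach batch from LB, wait for in-flight requests to drain ...
        for server in batch:
            update(server)  # install the new version and restart
        # ... re-attach batch and verify health before the next batch
        updated.extend(batch)
    return updated

# 10 servers with a batch size of 3 means 4 rounds; every server is
# updated exactly once, and the rest keep serving traffic meanwhile.
log = []
result = rolling_deploy([f"srv-{n}" for n in range(10)], 3, log.append)
```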

It is way safer than Big Bang deployment. But there is no targeted deployment here, as we don't control how traffic is distributed between these machines. And we need to ensure all our services are backward and forward compatible, since both the old and new versions will be live at the same time. (Although it is generally good to have backward and forward compatibility irrespective of the deployment strategy.)

Blue-Green Deployment

Here we maintain two sets of machines - a Blue set and a Green set. One set is live for users; the other sits idle. We deploy our new version to the idle set, the QA team can test it, we can run a sanity check and a core vitals check, and once all is good we make the switch and all traffic moves over instantly.

Rolling back is easy, as we just need to flip the switch again. But this is costly (2x the infra cost), since we have to maintain two copies of every service. There is no targeted deployment either, as all users point to either the new set or the old one. And stateful applications will take a hit.
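The switch itself is just a pointer flip, which is what makes rollback instant. A minimal sketch (version labels are illustrative):

```python
# Blue-green cutover sketch: traffic points at whichever set is "live";
# a rollback is simply flipping the pointer back.
environments = {"blue": "v1.4", "green": "v1.5"}  # green holds the new code
live = "blue"

def switch(current):
    """Flip the live pointer between the two environments."""
    return "green" if current == "blue" else "blue"

live = switch(live)  # cut over: green (v1.5) is now live
live = switch(live)  # rollback: blue (v1.4) is live again
```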

Canary Deployment

This is similar to rolling deployments, with the key difference being how we manage traffic to the new machines (it is targeted and intentional). We deploy the new version to a few new servers, called Canary servers, and route a small percentage of our users to them. We then monitor those Canary servers; if all is good, we spread the new version to more servers, eventually rolling the change out across all machines.

Rollback is easy, as we can point that percentage of users back to the machines running the old version. Canary supports targeted rollouts, but it involves a fairly sophisticated monitoring system. It can also be tricky for DB schema changes and API compatibility issues (same as rolling deployments: we need to ensure our services are backward and forward compatible). The other benefit of this strategy is that we can see how our update performs with real users, since there can be cases that would slip through all the previous tests.
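The "small percentage of users" split is usually done by hashing a stable user identifier rather than rolling a die per request, so a given user always lands on the same version. A minimal sketch (the percentage and identifiers are illustrative):

```python
import hashlib

CANARY_PERCENT = 5  # start small; expand as metrics stay healthy

def is_canary(user_id: str, percent: int = CANARY_PERCENT) -> bool:
    """Deterministically place a user in the canary cohort.

    Hashing the user id (instead of random()) keeps assignment sticky:
    the same user always sees the same version across requests."""
    bucket = int(hashlib.sha256(user_id.encode()).hexdigest(), 16) % 100
    return bucket < percent
```

Expanding the rollout is then just raising `percent`; rolling back is dropping it to 0.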

Feature Flag Deployment

We already talked about this at length here: Feature Flags: Building better frontend distribution. In short, we deploy as much code as we want but control the features from inside the codebase. Feature flags are usually used as a tool alongside other deployment strategies, and they support targeted rollouts as well.

One of the major cons is the code complexity this brings: flags need proper maintenance and extra work (e.g. writing unit tests for both scenarios in the same code version).
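The "both scenarios in one build" point is worth seeing concretely: old and new code paths ship together, and the flag decides at runtime. The flag store and the discount logic below are purely illustrative; real systems typically back flags with a config service:

```python
# A minimal in-memory flag store; real systems use a config service or
# a feature-flag platform. The flag name and logic are hypothetical.
FLAGS = {"new-checkout": False}

def checkout_total(items, flags=FLAGS):
    if flags.get("new-checkout"):
        # New path: a hypothetical bulk discount, live only when flagged on.
        total = sum(items)
        return total * 0.9 if len(items) >= 3 else total
    # Old path stays in the same build, so both branches need unit tests.
    return sum(items)
```

Toggling `new-checkout` changes behavior instantly, with no redeploy; that is both the power and the maintenance burden.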

The real deployment unit

One key thing teams usually miss when planning a strategy: the real deployment unit is NOT code, it's "code + schema + config + traffic control".

Patterns from the best

And if we look at the deployment strategies of big product companies (Meta, Google, Slack, Netflix, etc. - source: their engineering blogs), some common patterns emerge, and from them we can derive a good standard:

  • Canary everywhere - Small subset rollout → evaluate → expand
  • Automation > Humans - Builds, tests, deploys all automated
  • Observability-driven decisions - Metrics decide rollout, not gut feeling
  • Rollback is mandatory - Every deployment must be quickly reversible

A custom Canary Deployment in practice

Here is one of the custom-built Canary Deployment strategies we implemented in one of our systems, combining progressive traffic routing with feature flags to achieve safe, controlled rollouts across frontend, backend, and database layers.

We started with a system serving 100% of users across our existing infrastructure—frontend servers, API layer services, and a shared database.

To introduce a new feature safely, we deployed the updated frontend code to a small subset of new frontend servers. Similarly, we deployed updated versions of our backend services to a subset of canary backend servers, while the rest of the fleet continued serving stable traffic. All backend changes were designed to be backward compatible and guarded behind feature flags.

At the edge, we placed a load balancer that acts as the entry point for all incoming traffic. When a user first hits the system, the load balancer assigns a small percentage of users (for example, 2–5%) into a canary cohort. This is done by setting a cookie (e.g., x-canary=true) or using other segmentation strategies like geography, internal users, or QA accounts.

As the request flows through the system, this cookie is converted into an internal header (e.g., x-canary: true). This ensures a consistent and reliable signal across all downstream services without tightly coupling internal systems to browser-level constructs.

The load balancer uses this cohort assignment to route canary users consistently to the subset of frontend servers running the new code. From there, the request continues through the system with the same cohort identity preserved.
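The cohort assignment and cookie-to-header conversion described above could look roughly like this. The `x-canary` names follow the article's examples; the request/response shapes are simplified for illustration:

```python
import random

def assign_cohort(cookies: dict, canary_percent: int = 5) -> dict:
    """Edge sketch: if the user already carries the x-canary cookie, keep
    their cohort sticky; otherwise assign one at the configured rate.
    The value is then surfaced to downstream services as an internal
    x-canary header, decoupling them from browser-level constructs."""
    value = cookies.get("x-canary")
    if value is None:
        value = "true" if random.randrange(100) < canary_percent else "false"
    return {
        "cookies": {**cookies, "x-canary": value},  # set on the response
        "headers": {"x-canary": value},             # forwarded internally
    }
```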

API layer routing

At the API layer, we rely on a combination of gateway-level routing and in-service feature flags.

An API gateway sits in front of our backend services and is responsible for enforcing routing consistency. It inspects the incoming request header (x-canary) and routes canary traffic specifically to the canary backend servers. This guarantees that a canary-enabled frontend does not accidentally interact with an older backend, which could otherwise lead to inconsistent behavior or missing data.

Since it is not feasible to deploy new backend code to all servers instantly, this approach allows us to incrementally roll out backend changes while maintaining strict compatibility between frontend and backend for canary users.
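The gateway's routing rule is deliberately simple: the header alone decides the pool, so a canary frontend can never reach an old backend. A sketch with hypothetical backend names:

```python
CANARY_BACKENDS = ["api-canary-1", "api-canary-2"]      # new version
STABLE_BACKENDS = ["api-1", "api-2", "api-3", "api-4"]  # old version

def route(headers: dict) -> str:
    """Gateway sketch: canary-tagged requests only ever reach canary
    backends, keeping frontend and backend versions consistent per user."""
    is_canary = headers.get("x-canary") == "true"
    pool = CANARY_BACKENDS if is_canary else STABLE_BACKENDS
    # Load balancing within the pool (round-robin, least-connections)
    # is elided; we just pick the first server here.
    return pool[0]
```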

Feature flags within service logic

Once the request reaches the backend, we introduce a second layer of control using feature flags within the service logic. Even on canary backend servers, new behavior is guarded behind flags. This ensures that:

  • new code paths can be enabled gradually,
  • behavior can be toggled instantly without redeployment, and
  • the system remains backward compatible during the transition

Database strategy

For the database, we maintain a single shared datastore. All schema changes follow a backward-compatible migration strategy using the expand–migrate–contract pattern:

  • introduce new schema elements (non-breaking changes),
  • update services to write to both old and new structures if needed,
  • backfill or migrate data gradually,
  • switch reads to the new schema, and
  • finally remove deprecated structures

This avoids splitting data across multiple databases and ensures consistency across all users throughout the rollout.
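The expand and dual-write phases above can be illustrated end to end with SQLite. The schema (a `users` table gaining a `full_name` column) is purely hypothetical:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")

# Expand: add the new column without touching the old one (non-breaking).
db.execute("ALTER TABLE users ADD COLUMN full_name TEXT")

def create_user(user_id, name):
    # Migrate phase: services dual-write the old and new columns so both
    # old and new readers see consistent data during the rollout.
    db.execute(
        "INSERT INTO users (id, name, full_name) VALUES (?, ?, ?)",
        (user_id, name, name),
    )

create_user(1, "Ada")
# Later: backfill existing rows, switch reads to full_name, and only once
# every reader has moved, contract: ALTER TABLE users DROP COLUMN name.
row = db.execute("SELECT name, full_name FROM users WHERE id = 1").fetchone()
```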

Deployment sequence

The deployment sequence is critical to making this system work reliably. We first roll out backend changes across the fleet (gradually, via rolling deployments), keeping feature flags turned off. This ensures that most or all servers are capable of handling the new behavior before it is exposed.

Once the backend reaches a safe rollout threshold, we begin canary deployment of the frontend. At this stage, enabling the feature flag for canary users ensures that:

  • requests are routed to compatible backend servers via the API gateway, and
  • backend logic safely executes the new behavior

As traffic flows through the system, the canary cohort experiences the new feature end-to-end—frontend, API behavior, and database interactions—while the majority of users remain on the stable path.

We then progressively increase the canary percentage based on system health metrics, error rates, and business KPIs. If any issue is detected, rollback is immediate—either by:

  • disabling the feature flag, or
  • stopping canary routing at the load balancer
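A metrics-driven rollout gate ties the "progressively increase" and "rollback is immediate" steps together. The thresholds and step size below are illustrative, not what any particular system uses:

```python
def next_canary_percent(current, error_rate, baseline, step=2, max_pct=100):
    """Sketch of an automated rollout gate: expand the canary cohort only
    while its error rate stays near the stable baseline; otherwise drop
    the cohort to 0, which stops canary routing at the load balancer."""
    if error_rate > baseline * 1.5:  # hypothetical tolerance
        return 0                     # rollback: route everyone to stable
    return min(current + step, max_pct)
```

In practice the same gate would also consult latency percentiles and business KPIs, and flipping the feature flag off gives a second, independent rollback path.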

We need to be extra careful if our systems depend on or produce any queues; if they do, we need to craft a custom strategy for our own architecture using one of the strategies above.

I will soon write a blog on how to achieve forward and backward compatibility across the DB, backend, and frontend (including the Expand → Migrate → Contract pattern for the DB). Thanks for reading this through 🫶🏻.