Zero-Downtime Blue-Green Deployment Pipelines for Python Applications on Linode
Understanding Blue-Green Deployments
Blue-Green deployment is a strategy for minimizing downtime and risk during software releases. It involves maintaining two identical production environments, “Blue” and “Green”. At any given time, one environment (e.g., Blue) is live and serving production traffic, while the other (Green) is idle. To deploy a new version, we deploy it to the idle environment (Green). Once tested and validated, traffic is switched from the Blue environment to the Green environment. The Blue environment then becomes the idle environment, ready for the next deployment.
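The live/idle bookkeeping described above can be sketched as a small state model. This is only an illustration: the `BlueGreen` class, its `release` method, and the `validate` callback are hypothetical names introduced here, not part of any Linode tooling.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class BlueGreen:
    """Tracks which environment is live and which version each one runs."""
    live: str = "blue"
    versions: Dict[str, str] = field(
        default_factory=lambda: {"blue": "1.0.0", "green": "1.0.0"}
    )

    @property
    def idle(self) -> str:
        # The idle environment is simply whichever one is not live.
        return "green" if self.live == "blue" else "blue"

    def release(self, new_version: str, validate: Callable[[str], bool]) -> bool:
        """Deploy to the idle environment, validate it, then switch traffic.

        Returns True if traffic was switched. On a failed validation the
        live environment is left untouched.
        """
        target = self.idle
        self.versions[target] = new_version
        if not validate(target):
            return False
        self.live = target
        return True
```

After a successful `release`, the previous live environment still holds the old version, which is what makes a quick rollback possible.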
Prerequisites and Setup on Linode
This guide assumes you have a Linode account and are comfortable with basic Linode management, including creating Compute Instances, managing DNS records, and setting up SSH access. We’ll need at least two identical Linode Compute Instances for our Blue and Green environments. For simplicity, we’ll use Ubuntu 22.04 LTS. We’ll also need a load balancer to direct traffic. Linode’s NodeBalancers are an excellent choice for this.
Let’s outline the initial setup:
- Two Compute Instances: Provision two Linode Compute Instances (e.g., `blue-app-01` and `green-app-01`). Ensure they have identical configurations, including installed software (Python, pip, virtual environment tools, etc.) and firewall rules.
- NodeBalancer: Create a Linode NodeBalancer. Configure it to point to the private IP addresses of your `blue-app-01` and `green-app-01` instances. Initially, the NodeBalancer will only route traffic to one of them (e.g., the Blue environment).
- DNS: Point your application’s domain name (e.g., `myapp.example.com`) to the IP address of the NodeBalancer.
- Application Deployment: Your Python application should be designed to run from a specific directory and be easily deployable. We’ll use a simple Flask application as an example.
Infrastructure as Code: Terraform Configuration
Managing infrastructure manually is error-prone. We’ll use Terraform to define our Linode resources. This ensures reproducibility and makes it easy to spin up or tear down environments.
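Create a main.tf file. First, declare the providers the configuration uses:
terraform {
  required_providers {
    linode = {
      source  = "linode/linode"
      version = "~> 2.0" # A 2.x release is assumed here; pin to the version you have tested
    }
    random = {
      source  = "hashicorp/random" # Used below to generate the root password
      version = "~> 3.0"
    }
  }
}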
provider "linode" {
  token = var.linode_api_token
}
variable "linode_api_token" {
  description = "Linode API Token"
  type        = string
  sensitive   = true
}
variable "region" {
  description = "Linode region"
  type        = string
  default     = "us-east"
}
variable "instance_type" {
  description = "Linode instance type"
  type        = string
  default     = "g6-nanode-1"
}
variable "ssh_key_id" {
  description = "Linode SSH Key ID"
  type        = string
}
variable "app_domain" {
  description = "The domain name for your application"
  type        = string
}
variable "app_port" {
  description = "The port your application listens on"
  type        = number
  default     = 5000
}
Next, define the Compute Instances for the Blue and Green environments. We’ll use a user data script to install Docker and pull our application image.
resource "linode_instance" "blue_app" {
  label      = "blue-app-01"
  image      = "linode/ubuntu22.04"
  region     = var.region
  type       = var.instance_type
  root_pass  = random_password.root_password.result
  private_ip = true # NodeBalancer backends are addressed by private IP
  authorized_keys = [
    chomp(file("~/.ssh/id_rsa.pub")) # Ensure your public key is here; chomp strips the trailing newline
  ]
  # User data is delivered through the Linode Metadata service and must be
  # base64-encoded. This assumes a 2.x provider release and a region that supports Metadata.
  metadata {
    user_data = base64encode(templatefile("${path.module}/scripts/install_docker.sh.tpl", {
      app_image_name = "my-python-app:latest" # Replace with your Docker image name
      app_port       = var.app_port
    }))
  }
  tags = ["blue", "app"]
}
resource "linode_instance" "green_app" {
  label      = "green-app-01"
  image      = "linode/ubuntu22.04"
  region     = var.region
  type       = var.instance_type
  root_pass  = random_password.root_password.result
  private_ip = true # NodeBalancer backends are addressed by private IP
  authorized_keys = [
    chomp(file("~/.ssh/id_rsa.pub")) # Ensure your public key is here; chomp strips the trailing newline
  ]
  metadata {
    user_data = base64encode(templatefile("${path.module}/scripts/install_docker.sh.tpl", {
      app_image_name = "my-python-app:latest" # Replace with your Docker image name
      app_port       = var.app_port
    }))
  }
  tags = ["green", "app"]
}
resource "random_password" "root_password" {
  length  = 16
  special = true
}
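Now, define the NodeBalancer. We’ll initially configure it to point only to the Blue instance. In the Linode provider, the NodeBalancer, its port configuration, and its backend nodes are separate resources.
resource "linode_nodebalancer" "app_lb" {
  label                = "app-nodebalancer"
  region               = var.region
  client_conn_throttle = 20 # Connections per second per client IP; valid range is 0-20
}
resource "linode_nodebalancer_config" "app_http" {
  nodebalancer_id = linode_nodebalancer.app_lb.id
  protocol        = "http"
  port            = 80
  check           = "http"
  check_path      = "/" # Your app's health check endpoint
  stickiness      = "none"
}
# Initial configuration to point to Blue.
# The address must be a private IP in "ip:port" form, so the instance
# needs private_ip = true.
resource "linode_nodebalancer_node" "blue" {
  nodebalancer_id = linode_nodebalancer.app_lb.id
  config_id       = linode_nodebalancer_config.app_http.id
  label           = "blue-app"
  address         = "${linode_instance.blue_app.private_ip_address}:${var.app_port}"
  weight          = 100
}
resource "linode_domain" "app_domain" {
  domain    = var.app_domain
  type      = "master"
  soa_email = "admin@${var.app_domain}" # Placeholder address; master domains require an SOA email
}
resource "linode_domain_record" "app_a_record" {
  domain_id   = linode_domain.app_domain.id
  name        = "@" # Root domain
  record_type = "A"
  target      = linode_nodebalancer.app_lb.ipv4
  ttl_sec     = 300
}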
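Create a scripts/install_docker.sh.tpl file:
#!/bin/bash
set -e
apt-get update -y
apt-get install -y docker.io docker-compose
systemctl start docker
systemctl enable docker
# Pull the application image
docker pull ${app_image_name}
# Run the application container.
# The container name is fixed: an image reference such as "my-python-app:latest"
# contains a colon, which Docker does not allow in container names, and the
# deployment script later refers to the container as app-my-python-app.
docker run -d --name app-my-python-app -p ${app_port}:${app_port} ${app_image_name}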
To apply this configuration:
- Set your Linode API token as an environment variable. Terraform reads variables from the environment when they are named `TF_VAR_<name>`, so this populates `var.linode_api_token` without putting the token on the command line:
export TF_VAR_linode_api_token="YOUR_LINODE_API_TOKEN"
- Initialize Terraform:
terraform init
- Plan the deployment:
terraform plan -var="ssh_key_id=YOUR_SSH_KEY_ID" -var="app_domain=myapp.example.com"
- Apply the configuration:
terraform apply -var="ssh_key_id=YOUR_SSH_KEY_ID" -var="app_domain=myapp.example.com"
Application Deployment Strategy
Our Python application will be containerized using Docker. This ensures consistency across environments. The user_data script in Terraform handles the initial Docker installation and container launch. For subsequent deployments, we’ll need a mechanism to update the Docker image and restart the container.
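As a rough sketch of that update mechanism, the Docker steps for a redeploy can be expressed as a list of commands. The function names `image_tag` and `redeploy_commands` are hypothetical, and the container name `app-my-python-app` and port 5000 are assumptions carried over from this guide's examples.

```python
from typing import List


def image_tag(image: str) -> str:
    """Return the tag of an image reference, defaulting to "latest".

    Only the last path component is inspected, so a registry port such as
    "registry.example.com:5000/app" is not mistaken for a tag.
    """
    last = image.rsplit("/", 1)[-1]
    return last.split(":", 1)[1] if ":" in last else "latest"


def redeploy_commands(image: str,
                      container: str = "app-my-python-app",
                      port: int = 5000) -> List[List[str]]:
    """Build the argv lists needed to replace a running container."""
    return [
        ["docker", "pull", image],
        # Remove the old container; this may report an error if it does not
        # exist yet, which a caller can safely ignore.
        ["docker", "rm", "-f", container],
        ["docker", "run", "-d", "--name", container,
         "-p", f"{port}:{port}",
         "-e", f"APP_VERSION={image_tag(image)}",
         image],
    ]
```

Each inner list can be passed to `subprocess.run` on the instance, or joined into a single string and sent over SSH.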
Let’s assume our application is a simple Flask app:
# app.py
from flask import Flask
import os
app = Flask(__name__)
@app.route('/')
def hello():
    version = os.environ.get("APP_VERSION", "unknown")
    return f"Hello from App Version: {version}!"
if __name__ == '__main__':
    app.run(host='0.0.0.0', port=5000)
And a Dockerfile:
FROM python:3.9-slim
WORKDIR /app
COPY requirements.txt requirements.txt
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
# Set an environment variable for versioning
ENV APP_VERSION="1.0.0"
EXPOSE 5000
CMD ["python", "app.py"]
Build and push your Docker image to a registry (e.g., Docker Hub, Linode Container Registry). For this example, let’s assume the image is named my-python-app:latest.
Automating the Deployment Pipeline
We need a way to trigger deployments. A CI/CD pipeline is ideal. For this example, we’ll simulate a deployment process using SSH and Docker commands. In a real-world scenario, you’d integrate this with Jenkins, GitLab CI, GitHub Actions, or a similar tool.
The core of the zero-downtime strategy lies in how we switch traffic. This is managed by updating the NodeBalancer’s configuration.
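For readers who prefer calling the Linode API directly, the two NodeBalancer operations behind a traffic switch (adding a backend node and deleting one) might look like the sketch below. It only builds the HTTP requests; the helper names are hypothetical, and the example assumes the v4 API's node endpoints under `/nodebalancers/{id}/configs/{config_id}/nodes` with a private `ip:port` address.

```python
import json
import urllib.request

API_BASE = "https://api.linode.com/v4"


def _request(method, path, token, body=None):
    """Build an authenticated request against the Linode API."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
    return urllib.request.Request(
        f"{API_BASE}{path}", data=data, headers=headers, method=method
    )


def add_node_request(token, nodebalancer_id, config_id,
                     private_ip, port, label, weight=100):
    """Request that adds a backend node to a NodeBalancer config."""
    path = f"/nodebalancers/{nodebalancer_id}/configs/{config_id}/nodes"
    body = {
        "address": f"{private_ip}:{port}",  # backends use the private IP
        "label": label,
        "weight": weight,
        "mode": "accept",
    }
    return _request("POST", path, token, body)


def delete_node_request(token, nodebalancer_id, config_id, node_id):
    """Request that removes a backend node from a NodeBalancer config."""
    path = (f"/nodebalancers/{nodebalancer_id}/configs/{config_id}"
            f"/nodes/{node_id}")
    return _request("DELETE", path, token)


def send(request, timeout=30):
    """Send a prepared request and decode the JSON response, if any."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        raw = resp.read()
        return json.loads(raw) if raw else {}
```

Keeping request construction separate from `send` makes the switch logic easy to unit test without touching a real NodeBalancer.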
Step-by-Step Deployment Workflow
Let’s define a script that orchestrates the deployment to the Green environment and then switches traffic.
Create a script, e.g., deploy.sh:
#!/bin/bash
set -uo pipefail
# --- Configuration ---
# Public IPs are used for SSH and direct health checks. Private IPs are used for
# NodeBalancer nodes, because NodeBalancer backends are addressed by private IP.
BLUE_INSTANCE_IP=$(terraform output -raw blue_app_ipv4) # Assuming you output this
GREEN_INSTANCE_IP=$(terraform output -raw green_app_ipv4) # Assuming you output this
BLUE_PRIVATE_IP=$(terraform output -raw blue_app_private_ip) # Assuming you output this
GREEN_PRIVATE_IP=$(terraform output -raw green_app_private_ip) # Assuming you output this
NODEBALANCER_ID=$(terraform output -raw app_nodebalancer_id) # Assuming you output this
# linode-cli takes its token from `linode-cli configure` or from the
# LINODE_CLI_TOKEN environment variable; avoid hardcoding it in this file.
APP_PORT=5000
NEW_APP_IMAGE="my-python-app:v1.1.0" # The new version to deploy
CONTAINER_NAME="app-my-python-app"
# --- Helper Functions ---
run_remote_command() {
  local ip=$1
  local cmd=$2
  ssh "root@${ip}" "${cmd}"
}
get_config_id() {
  # ID of the NodeBalancer's HTTP config on port 80
  linode-cli nodebalancers configs-list "${NODEBALANCER_ID}" --json \
    | jq -r '.[] | select(.port == 80 and .protocol == "http") | .id'
}
get_node_field() {
  # Print one field (e.g. id or status) of the node whose address matches ip:port
  local target_ip=$1 field=$2 config_id=$3
  linode-cli nodebalancers nodes-list "${NODEBALANCER_ID}" "${config_id}" --json \
    | jq -r ".[] | select(.address == \"${target_ip}:${APP_PORT}\") | .${field}"
}
update_nodebalancer_config() {
  local target_ip=$1 # Private IP of the instance to add/remove
  local action=$2 # "add" or "remove"
  local config_id
  config_id=$(get_config_id)
  if [ "$action" == "add" ]; then
    echo "Adding ${target_ip} to NodeBalancer ${NODEBALANCER_ID} config ${config_id}"
    # Node addresses take the form ip:port. Dots are replaced with dashes so the
    # label contains only letters, digits, and dashes.
    linode-cli nodebalancers node-create "${NODEBALANCER_ID}" "${config_id}" \
      --address "${target_ip}:${APP_PORT}" \
      --label "node-${target_ip//./-}" \
      --weight 100 \
      --mode accept
  elif [ "$action" == "remove" ]; then
    echo "Removing ${target_ip} from NodeBalancer ${NODEBALANCER_ID} config ${config_id}"
    linode-cli nodebalancers node-delete "${NODEBALANCER_ID}" "${config_id}" \
      "$(get_node_field "${target_ip}" id "${config_id}")"
  fi
}
wait_for_health_check() {
  local url=$1
  local timeout=300 # seconds
  local interval=5 # seconds
  local elapsed=0
  echo "Waiting for ${url} to be healthy..."
  while [ $elapsed -lt $timeout ]; do
    # -f makes curl exit non-zero on HTTP 4xx/5xx, so any 2xx counts as healthy
    if curl -fs -o /dev/null --max-time 5 "${url}"; then
      echo "Health check successful!"
      return 0
    fi
    sleep $interval
    elapsed=$((elapsed + interval))
    echo "Still waiting... (${elapsed}/${timeout}s)"
  done
  echo "Health check failed after ${timeout} seconds."
  return 1
}
wait_for_node_up() {
  # Poll until the NodeBalancer's own health check reports the node as UP
  local target_ip=$1 config_id
  config_id=$(get_config_id)
  for _ in $(seq 1 30); do
    [ "$(get_node_field "${target_ip}" status "${config_id}")" == "UP" ] && return 0
    sleep 5
  done
  return 1
}
# --- Deployment Steps ---
echo "Starting deployment of ${NEW_APP_IMAGE}..."
# 1. Deploy new version to the Green environment
echo "Deploying new version to Green instance (${GREEN_INSTANCE_IP})..."
# "docker rm -f" stops and removes in one step; "|| true" tolerates a missing container.
if ! run_remote_command "${GREEN_INSTANCE_IP}" "docker pull ${NEW_APP_IMAGE} && \
  (docker rm -f ${CONTAINER_NAME} || true) && \
  docker run -d --name ${CONTAINER_NAME} -p ${APP_PORT}:${APP_PORT} -e APP_VERSION=${NEW_APP_IMAGE##*:} ${NEW_APP_IMAGE}"; then
  echo "Failed to deploy to Green instance. Aborting."
  exit 1
fi
echo "New version deployed to Green instance."
# 2. Wait for Green to become healthy before it receives any traffic.
# This checks the instance directly, so its firewall must allow your IP on APP_PORT.
if ! wait_for_health_check "http://${GREEN_INSTANCE_IP}:${APP_PORT}/"; then
  echo "Green instance is not healthy. Aborting; Blue keeps serving traffic."
  exit 1
fi
# 3. Add Green to the NodeBalancer and wait until the NodeBalancer reports it UP.
# Note: nodes changed here drift from what Terraform created.
# A more robust approach would be to manage NodeBalancer nodes via Terraform.
update_nodebalancer_config "${GREEN_PRIVATE_IP}" "add"
if ! wait_for_node_up "${GREEN_PRIVATE_IP}"; then
  echo "NodeBalancer never marked Green as UP. Removing it and aborting."
  update_nodebalancer_config "${GREEN_PRIVATE_IP}" "remove"
  exit 1
fi
# 4. Switch traffic: Remove Blue instance from NodeBalancer
echo "Switching traffic: Removing Blue instance (${BLUE_PRIVATE_IP}) from NodeBalancer."
update_nodebalancer_config "${BLUE_PRIVATE_IP}" "remove"
# 5. Verify traffic is now on Green
echo "Verifying traffic is now on Green instance."
# This step is crucial. You'd typically run checks against your public domain.
# For this script, we'll assume the checks on Green above were sufficient.
# 6. Update Terraform state to reflect the new Blue environment (optional but good practice)
# This is complex. A simpler approach is to re-provision the Blue instance
# with the *old* version, or simply keep it as is until the next deployment.
# For this script, we'll skip re-provisioning Blue.
echo "Blue-Green deployment successful! Green is now live."
echo "The old Blue instance (${BLUE_INSTANCE_IP}) is now idle and can be updated for the next deployment."
# --- Cleanup (Optional) ---
# You might want to stop/remove the old Blue container on the idle instance
# run_remote_command "${BLUE_INSTANCE_IP}" "docker rm -f ${CONTAINER_NAME}"
Important Notes on the Script:
- Terraform Outputs: The script assumes you have outputs defined in your Terraform configuration for the instance IPs and the NodeBalancer ID. Add these to your main.tf (the private IP outputs are only needed if your script registers NodeBalancer nodes by private address):
output "blue_app_ipv4" { value = linode_instance.blue_app.ip_address }
output "green_app_ipv4" { value = linode_instance.green_app.ip_address }
output "blue_app_private_ip" { value = linode_instance.blue_app.private_ip_address }
output "green_app_private_ip" { value = linode_instance.green_app.private_ip_address }
output "app_nodebalancer_id" { value = linode_nodebalancer.app_lb.id }
- Linode CLI: This script uses the `linode-cli`. Ensure it’s installed and configured with your API token (e.g., via `linode-cli configure` or by setting the `LINODE_CLI_TOKEN` environment variable).
- jq: The script uses `jq` for JSON parsing. Install it with `sudo apt-get install jq -y`.
- SSH Access: Ensure your SSH keys are set up correctly for passwordless SSH access to the Linode instances.
- NodeBalancer Configuration: Managing NodeBalancer configurations dynamically via script can be brittle. A more robust solution involves using Terraform to manage the NodeBalancer’s nodes, or using the Linode API directly for more complex updates. The provided script simplifies this by adding/removing nodes.
- Health Checks: A reliable health check endpoint in your application is critical. The `wait_for_health_check` function polls this endpoint.
- Rollback: For a rollback, you would essentially repeat the process but switch traffic back to the Blue environment (which would now contain the older version).
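The health-check polling noted above could also be written in Python, for example inside a CI job. The sketch below is one possible shape, not the script's actual implementation: `http_ok` and `wait_until_healthy` are hypothetical names, and requiring several consecutive successes is an added assumption meant to avoid switching traffic on a single lucky response.

```python
import time
import urllib.error
import urllib.request


def http_ok(url, timeout=5.0):
    """Return True if a GET to the URL answers with a 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or an HTTP error status.
        return False


def wait_until_healthy(url, probe=http_ok, timeout=300.0, interval=5.0,
                       required_successes=3,
                       sleep=time.sleep, clock=time.monotonic):
    """Poll until `required_successes` consecutive probes pass or time runs out.

    `probe`, `sleep`, and `clock` are injectable so the loop can be tested
    without real network calls or real waiting.
    """
    deadline = clock() + timeout
    streak = 0
    while clock() < deadline:
        if probe(url):
            streak += 1
            if streak >= required_successes:
                return True
        else:
            streak = 0  # a failure resets the run of successes
        sleep(interval)
    return False
```

A deployment job would call `wait_until_healthy("http://<green-ip>:5000/")` and abort before touching the NodeBalancer if it returns False.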
Advanced Considerations and Enhancements
This setup provides a foundational blue-green deployment. For production readiness, consider these enhancements:
- Automated Testing: Integrate automated tests (unit, integration, end-to-end) that run against the Green environment before traffic is switched.
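- Canary Releases: Instead of a full switch, gradually shift a small percentage of traffic to the Green environment, monitor, and then increase the percentage. This can be achieved with weighted backends on the load balancer (NodeBalancer nodes accept a weight) or with a more advanced load balancer. Lowering DNS TTLs only speeds up propagation of a DNS change; it does not control how traffic is split.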
- Database Migrations: Handling database schema changes requires careful planning. Strategies include backward-compatible changes, multi-phase migrations, or using tools like Alembic or Django’s migration system in conjunction with the blue-green deployment.
- Configuration Management: Ensure application configurations (environment variables, secrets) are managed consistently across both environments and are updated atomically during deployment.
- State Management: The Terraform state needs to accurately reflect the current production environment. If you re-provision the “old” Blue instance with the new version, you’ll need to update your Terraform state or use separate Terraform configurations for Blue and Green.
- Monitoring and Alerting: Implement robust monitoring for both environments and the NodeBalancer. Set up alerts for increased error rates, latency, or health check failures.
- Rollback Strategy: Define a clear and tested rollback procedure. This script’s `update_nodebalancer_config` function can be used to revert traffic.
- Immutable Infrastructure: Instead of updating containers on existing instances, consider destroying the old Blue instance and provisioning a new one with the updated application. This aligns better with immutable infrastructure principles.
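To make the canary idea from the list above more concrete, here is a small sketch that turns a target percentage into relative node weights. It assumes the load balancer distributes new connections roughly in proportion to node weight, which depends on the balancing algorithm and session stickiness in use. The `canary_weights` function is a hypothetical helper.

```python
from typing import Dict, Optional


def canary_weights(green_percent: int) -> Dict[str, Optional[int]]:
    """Map a target share of traffic for Green to node weights.

    Returns weights for both environments. At 100 percent Blue's weight is
    None, meaning the Blue node should be removed rather than re-weighted.
    """
    if not 1 <= green_percent <= 100:
        raise ValueError("green_percent must be between 1 and 100")
    if green_percent == 100:
        return {"green": 100, "blue": None}
    return {"green": green_percent, "blue": 100 - green_percent}


def canary_plan(steps=(5, 25, 50, 100)):
    """Yield the weights for each stage of a gradual rollout."""
    for percent in steps:
        yield percent, canary_weights(percent)
```

Between stages, a pipeline would pause to watch error rates and latency, and revert to the previous weights if either degrades.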
By implementing blue-green deployments with infrastructure as code and robust automation, you can achieve near-zero downtime for your Python applications on Linode, significantly improving your release process and application availability.