Zero-Downtime Blue-Green Deployment Pipelines for Ruby Applications on AWS
Architectural Overview: Blue-Green Deployments on AWS
Implementing zero-downtime deployments for Ruby applications on AWS necessitates a robust strategy that decouples the deployment process from live traffic. The blue-green deployment model achieves this by maintaining two identical production environments: ‘Blue’ (current production) and ‘Green’ (new version). Traffic is initially directed to Blue. Once Green is fully deployed and tested, a traffic switch redirects all incoming requests to Green, leaving Blue idle. This allows for immediate rollback by simply switching traffic back to Blue if issues arise.
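As a mental model, the switch-and-rollback behaviour described above can be sketched in a few lines of Ruby. The BlueGreenRouter class and its method names are purely illustrative and do not correspond to any AWS API; on AWS the same role is played by the load balancer's listener.

```ruby
# Toy model of blue-green routing: a single pointer decides which
# environment receives live traffic. Names are illustrative only.
class BlueGreenRouter
  ENVIRONMENTS = %i[blue green].freeze

  attr_reader :live

  def initialize(live: :blue)
    raise ArgumentError, "unknown environment: #{live}" unless ENVIRONMENTS.include?(live)

    @live = live
    @previous = nil
  end

  # The environment that is currently idle and safe to deploy to.
  def idle
    (ENVIRONMENTS - [@live]).first
  end

  # Send all traffic to the idle environment and remember the old one.
  def switch!
    @previous = @live
    @live = idle
  end

  # Undo the last switch; does nothing if no switch has happened yet.
  def rollback!
    return @live if @previous.nil?

    @live, @previous = @previous, nil
    @live
  end
end
```

The point of the model is that both deploying and rolling back are a single pointer change, which is why the old environment must stay intact until the new one has proven itself.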
Our AWS infrastructure will leverage Elastic Load Balancing (ELB) for traffic management, Auto Scaling Groups (ASGs) for environment provisioning and scaling, and Amazon S3 for storing deployment artifacts. We’ll orchestrate these components using AWS CodeDeploy, which is purpose-built for managing application deployments to EC2 instances, Lambda, and ECS. For our Ruby application, we’ll assume a standard Rails setup that can be packaged as a deployment artifact.
Setting Up the AWS Infrastructure
The foundation of our blue-green strategy is two distinct but identically configured Auto Scaling Groups and their EC2 instances. Both groups sit behind a single load balancer, each registered with its own target group. AWS CodeDeploy acts as the orchestrator, deploying each revision onto the appropriate environment.
1. Elastic Load Balancer (ELB) Configuration
We’ll use an Application Load Balancer (ALB) for its advanced routing capabilities. The key to blue-green with ALB is its ability to manage listener rules and target groups dynamically. We’ll configure two target groups, one for the ‘Blue’ environment and one for the ‘Green’ environment. Initially, the ALB listener will forward traffic to the ‘Blue’ target group.
AWS CLI Example: Creating Target Groups
aws elbv2 create-target-group \
--name my-app-blue-tg \
--protocol HTTP \
--port 80 \
--vpc-id vpc-xxxxxxxxxxxxxxxxx \
--health-check-protocol HTTP \
--health-check-path /health \
--target-type instance
aws elbv2 create-target-group \
--name my-app-green-tg \
--protocol HTTP \
--port 80 \
--vpc-id vpc-xxxxxxxxxxxxxxxxx \
--health-check-protocol HTTP \
--health-check-path /health \
--target-type instance
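Both target groups poll /health, so the application has to answer on that path. One option is a small Rack middleware that responds before the request reaches the Rails router. The sketch below is an assumption about how such an endpoint could look; the HealthCheck class and its checks: option are hypothetical names, not part of Rails or Rack.

```ruby
require "json"

# Rack-compatible middleware that answers the load balancer's health check.
# `checks` maps a label to a callable returning a truthy value when healthy.
class HealthCheck
  def initialize(app, path: "/health", checks: {})
    @app = app
    @path = path
    @checks = checks
  end

  def call(env)
    # Anything other than the health path goes to the wrapped application.
    return @app.call(env) unless env["PATH_INFO"] == @path

    results = @checks.transform_values { |check| passing?(check) }
    healthy = results.values.all?
    body = JSON.generate({ status: healthy ? "ok" : "unhealthy", checks: results })
    # 503 makes the target group mark the instance unhealthy.
    [healthy ? 200 : 503, { "content-type" => "application/json" }, [body]]
  end

  private

  # A check that raises counts as failed rather than crashing the endpoint.
  def passing?(check)
    check.call ? true : false
  rescue StandardError
    false
  end
end
```

In a Rails application this could be mounted with something like config.middleware.insert_before 0, HealthCheck, checks: { db: -> { ActiveRecord::Base.connection.active? } }, so that an instance without a working database connection is never put into rotation.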
AWS CLI Example: Creating ALB and Listener Rules
# Create ALB (assuming VPC and subnets are pre-configured)
aws elbv2 create-load-balancer \
--name my-app-alb \
--subnets subnet-xxxxxxxxxxxxxxxxx subnet-yyyyyyyyyyyyyyyyy \
--security-groups sg-zzzzzzzzzzzzzzzzz \
--scheme internet-facing
# Get the ALB ARN by load balancer name
ALB_ARN=$(aws elbv2 describe-load-balancers --names my-app-alb --query "LoadBalancers[0].LoadBalancerArn" --output text)
# Create default listener forwarding to Blue
aws elbv2 create-listener \
--load-balancer-arn $ALB_ARN \
--port 80 \
--protocol HTTP \
--default-actions Type=forward,TargetGroupArn=$(aws elbv2 describe-target-groups --names my-app-blue-tg --query "TargetGroups[0].TargetGroupArn" --output text)
# Note: Listener rules for Green will be managed by CodeDeploy
2. Auto Scaling Groups (ASGs) and Launch Configurations
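We need two ASGs, each associated with one of the target groups. The launch configurations for both ASGs will be identical, specifying the EC2 instance type, AMI, security groups, and, importantly, the IAM instance profile the instances use during CodeDeploy deployments. The ASG for the ‘Green’ environment starts with a desired capacity of 0. Note that AWS now recommends launch templates over launch configurations, and newer accounts may be unable to create launch configurations at all; the same settings carry over to a launch template.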
AWS CLI Example: Creating Launch Configurations
# Assuming you have a suitable AMI ID (ami-xxxxxxxxxxxxxxxxx) and security group (sg-zzzzzzzzzzzzzzzzz)
aws autoscaling create-launch-configuration \
--launch-configuration-name my-app-lc-blue \
--image-id ami-xxxxxxxxxxxxxxxxx \
--instance-type t3.medium \
--iam-instance-profile arn:aws:iam::123456789012:instance-profile/CodeDeployInstanceProfile \
--security-groups sg-zzzzzzzzzzzzzzzzz \
--user-data file://user-data.sh # Script to install dependencies and CodeDeploy agent
aws autoscaling create-launch-configuration \
--launch-configuration-name my-app-lc-green \
--image-id ami-xxxxxxxxxxxxxxxxx \
--instance-type t3.medium \
--iam-instance-profile arn:aws:iam::123456789012:instance-profile/CodeDeployInstanceProfile \
--security-groups sg-zzzzzzzzzzzzzzzzz \
--user-data file://user-data.sh
AWS CLI Example: Creating Auto Scaling Groups
# Blue ASG (initially handles all traffic)
aws autoscaling create-auto-scaling-group \
--auto-scaling-group-name my-app-asg-blue \
--launch-configuration-name my-app-lc-blue \
--min-size 2 \
--max-size 5 \
--desired-capacity 2 \
--vpc-zone-identifier "subnet-xxxxxxxxxxxxxxxxx,subnet-yyyyyyyyyyyyyyyyy" \
--target-group-arns $(aws elbv2 describe-target-groups --names my-app-blue-tg --query "TargetGroups[0].TargetGroupArn" --output text)
# Green ASG (initially idle; max-size stays above 0 so the group can scale out for a deployment)
aws autoscaling create-auto-scaling-group \
--auto-scaling-group-name my-app-asg-green \
--launch-configuration-name my-app-lc-green \
--min-size 0 \
--max-size 5 \
--desired-capacity 0 \
--vpc-zone-identifier "subnet-xxxxxxxxxxxxxxxxx,subnet-yyyyyyyyyyyyyyyyy" \
--target-group-arns $(aws elbv2 describe-target-groups --names my-app-green-tg --query "TargetGroups[0].TargetGroupArn" --output text)
3. IAM Role for CodeDeploy
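The EC2 instances need permission to download the revision bundle from S3, which they receive through an IAM instance profile. The CodeDeploy service itself needs a separate service role with permissions to interact with ELB, Auto Scaling, and EC2. AWS also ships managed policies for both purposes (AmazonEC2RoleforAWSCodeDeploy for instances and AWSCodeDeployRole for the service role), which are a reasonable starting point if you would rather not maintain custom policies like the ones below.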
IAM Policy for EC2 Instances (CodeDeployInstanceProfile)
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"s3:GetObject",
"s3:ListBucket"
],
"Resource": [
"arn:aws:s3:::aws-codedeploy-us-east-1",
"arn:aws:s3:::aws-codedeploy-us-east-1/*",
"arn:aws:s3:::your-codedeploy-bucket",
"arn:aws:s3:::your-codedeploy-bucket/*"
]
},
{
"Effect": "Allow",
"Action": [
"autoscaling:DescribeAutoScalingGroups",
"autoscaling:DescribeAutoScalingInstances",
"autoscaling:DescribeLaunchConfigurations",
"autoscaling:DescribeTags",
"autoscaling:SetInstanceHealth",
"autoscaling:TerminateInstanceInAutoScalingGroup"
],
"Resource": "*"
},
{
"Effect": "Allow",
"Action": [
"ec2:DescribeInstances",
"ec2:DescribeTags"
],
"Resource": "*"
},
{
"Effect": "Allow",
"Action": [
"elasticloadbalancing:DescribeLoadBalancers",
"elasticloadbalancing:DescribeTargetGroups",
"elasticloadbalancing:DescribeTargetGroupAttributes",
"elasticloadbalancing:RegisterTargets",
"elasticloadbalancing:DeregisterTargets",
"elasticloadbalancing:DescribeListeners",
"elasticloadbalancing:ModifyListener"
],
"Resource": "*"
},
{
"Effect": "Allow",
"Action": [
"codedeploy:GetDeployment",
"codedeploy:GetDeploymentInstance",
"codedeploy:ListDeploymentInstances",
"codedeploy:RegisterInstance",
"codedeploy:DeregisterInstance",
"codedeploy:PutLifecycleEventHookExecutionStatus"
],
"Resource": "*"
}
]
}
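The part of the instance policy that deployments depend on most directly is read access to the revision bundle in S3. A minimal sketch, assuming a hypothetical helper name and the bucket names used above, of generating just that statement with bucket-level and object-level ARNs kept apart (s3:ListBucket applies to the bucket, s3:GetObject to the objects inside it):

```ruby
require "json"

# Build an IAM policy document granting read-only access to the given
# S3 buckets. The helper name is illustrative, not an AWS API.
def s3_read_policy(bucket_names)
  bucket_arns = bucket_names.map { |name| "arn:aws:s3:::#{name}" }
  {
    "Version" => "2012-10-17",
    "Statement" => [
      # Listing is authorized against the bucket itself.
      { "Effect" => "Allow", "Action" => ["s3:ListBucket"], "Resource" => bucket_arns },
      # Downloads are authorized against the objects in the bucket.
      { "Effect" => "Allow", "Action" => ["s3:GetObject"],
        "Resource" => bucket_arns.map { |arn| "#{arn}/*" } }
    ]
  }
end

if __FILE__ == $PROGRAM_NAME
  puts JSON.pretty_generate(s3_read_policy(%w[aws-codedeploy-us-east-1 your-codedeploy-bucket]))
end
```

Generating the document from a list of bucket names keeps the two ARN forms in sync when a bucket is added or renamed.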
IAM Policy for CodeDeploy Service Role
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"autoscaling:DescribeAutoScalingGroups",
"autoscaling:DescribeAutoScalingInstances",
"autoscaling:DescribeLaunchConfigurations",
"autoscaling:DescribeTags",
"autoscaling:SetInstanceHealth",
"autoscaling:TerminateInstanceInAutoScalingGroup"
],
"Resource": "*"
},
{
"Effect": "Allow",
"Action": [
"ec2:DescribeInstances",
"ec2:DescribeTags"
],
"Resource": "*"
},
{
"Effect": "Allow",
"Action": [
"elasticloadbalancing:DescribeLoadBalancers",
"elasticloadbalancing:DescribeTargetGroups",
"elasticloadbalancing:DescribeTargetGroupAttributes",
"elasticloadbalancing:RegisterTargets",
"elasticloadbalancing:DeregisterTargets",
"elasticloadbalancing:DescribeListeners",
"elasticloadbalancing:ModifyListener"
],
"Resource": "*"
},
{
"Effect": "Allow",
"Action": [
"codedeploy:CreateDeployment",
"codedeploy:GetApplication",
"codedeploy:GetDeployment",
"codedeploy:GetDeploymentConfig",
"codedeploy:GetDeploymentGroup",
"codedeploy:ListDeploymentConfigs",
"codedeploy:ListDeploymentGroups",
"codedeploy:ListDeployments",
"codedeploy:ListGitHubAccountTokenGrants",
"codedeploy:RegisterApplicationRevision",
"codedeploy:RevokeGitHubAccountToken",
"codedeploy:UpdateDeploymentGroup"
],
"Resource": "*"
},
{
"Effect": "Allow",
"Action": [
"iam:CreateServiceLinkedRole",
"iam:GetRole",
"iam:ListAttachedRolePolicies",
"iam:ListRolePolicies",
"iam:PassRole"
],
"Resource": "arn:aws:iam::*:role/aws-service-role/codedeploy.amazonaws.com/AWSServiceRoleForCodeDeploy"
}
]
}
CodeDeploy Application and Deployment Group Configuration
AWS CodeDeploy is the central piece of our deployment automation. We’ll define an application and then create two deployment groups, one for ‘Blue’ and one for ‘Green’. The key here is how CodeDeploy interacts with the ASGs and ELB to manage the traffic shifting.
1. CodeDeploy Application Creation
First, create the CodeDeploy application. This is a logical container for your deployment configurations.
AWS CLI Example: Create CodeDeploy Application
aws deploy create-application \
--application-name my-ruby-app \
--compute-platform EC2 \
--description "Ruby on Rails application"
2. Deployment Group for ‘Blue’ Environment
This deployment group will initially be active and receive all traffic. It’s associated with the ‘Blue’ ASG and the ‘Blue’ target group.
AWS CLI Example: Create ‘Blue’ Deployment Group
CODE_DEPLOY_SERVICE_ROLE_ARN="arn:aws:iam::123456789012:role/CodeDeployServiceRole" # Replace with your actual ARN
aws deploy create-deployment-group \
--application-name my-ruby-app \
--deployment-group-name my-app-blue \
--auto-scaling-groups my-app-asg-blue \
--service-role-arn "$CODE_DEPLOY_SERVICE_ROLE_ARN" \
--load-balancer-info "targetGroupInfoList=[{name=my-app-blue-tg}]" \
--deployment-style deploymentType=BLUE_GREEN,deploymentOption=WITH_TRAFFIC_CONTROL \
--blue-green-deployment-configuration "terminateBlueInstancesOnDeploymentSuccess={action=TERMINATE,terminationWaitTimeInMinutes=5},deploymentReadyOption={actionOnTimeout=STOP_DEPLOYMENT,waitTimeInMinutes=60}" \
--alarm-configuration "enabled=true,alarms=[{name=MyCloudWatchAlarm}]" \
--ec2-tag-filters Key=App,Value=MyApp,Type=KEY_AND_VALUE \
--auto-rollback-configuration "enabled=true,events=[DEPLOYMENT_FAILURE,DEPLOYMENT_STOP_ON_ALARM]"
Explanation of Key Parameters:
- --load-balancer-info: Identifies the target group by name; CodeDeploy does not take the target group ARN here.
- --deployment-style deploymentType=BLUE_GREEN,deploymentOption=WITH_TRAFFIC_CONTROL: Enables the blue-green deployment strategy with load balancer traffic management.
- --blue-green-deployment-configuration: Configures the blue-green behavior. deploymentReadyOption with actionOnTimeout=STOP_DEPLOYMENT makes promotion manual: CodeDeploy waits up to waitTimeInMinutes for you to continue the deployment before rerouting traffic, and stops the deployment if you do not. terminateBlueInstancesOnDeploymentSuccess with action=TERMINATE cleans up the old blue instances five minutes after a successful deployment.
- --alarm-configuration: Stops the deployment when the named CloudWatch alarm fires.
- --auto-rollback-configuration: Enables automatic rollback on deployment failure or when an alarm stops the deployment.
3. Deployment Group for ‘Green’ Environment
This deployment group is configured identically to the ‘Blue’ one but points to the ‘Green’ ASG and ‘Green’ target group. It will be used for subsequent deployments.
AWS CLI Example: Create ‘Green’ Deployment Group
aws deploy create-deployment-group \
--application-name my-ruby-app \
--deployment-group-name my-app-green \
--auto-scaling-groups my-app-asg-green \
--service-role-arn "$CODE_DEPLOY_SERVICE_ROLE_ARN" \
--load-balancer-info "targetGroupInfoList=[{name=my-app-green-tg}]" \
--deployment-style deploymentType=BLUE_GREEN,deploymentOption=WITH_TRAFFIC_CONTROL \
--blue-green-deployment-configuration "terminateBlueInstancesOnDeploymentSuccess={action=TERMINATE,terminationWaitTimeInMinutes=5},deploymentReadyOption={actionOnTimeout=STOP_DEPLOYMENT,waitTimeInMinutes=60}" \
--alarm-configuration "enabled=true,alarms=[{name=MyCloudWatchAlarm}]" \
--ec2-tag-filters Key=App,Value=MyApp,Type=KEY_AND_VALUE \
--auto-rollback-configuration "enabled=true,events=[DEPLOYMENT_FAILURE,DEPLOYMENT_STOP_ON_ALARM]"
Application Deployment Artifact and `appspec.yml`
The application artifact for CodeDeploy is typically a ZIP archive containing your application code and a crucial file: appspec.yml. This file defines the lifecycle hooks for CodeDeploy to execute on the instances.
1. `appspec.yml` Structure
For a Ruby on Rails application, the appspec.yml will orchestrate tasks like fetching code, installing dependencies, running migrations, and starting the application server (e.g., Puma).
version: 0.0
os: linux
files:
  - source: /
    destination: /var/www/my_ruby_app
hooks:
  BeforeInstall:
    - location: scripts/before_install.sh
      timeout: 300
      runas: root
  AfterInstall:
    - location: scripts/after_install.sh
      timeout: 300
      runas: ubuntu
  ApplicationStart:
    - location: scripts/application_start.sh
      timeout: 300
      runas: ubuntu
  ValidateService:
    - location: scripts/validate_service.sh
      timeout: 60
      runas: ubuntu
2. Example Deployment Scripts
These scripts are placed in a scripts/ directory within your application’s root, alongside appspec.yml.
scripts/before_install.sh
#!/bin/bash
# Stop any existing application processes
sudo pkill -f puma || true
sudo pkill -f rails || true
# Remove previous deployment
sudo rm -rf /var/www/my_ruby_app
sudo mkdir -p /var/www/my_ruby_app
sudo chown ubuntu:ubuntu /var/www/my_ruby_app
scripts/after_install.sh
#!/bin/bash
cd /var/www/my_ruby_app
# Install dependencies
bundle install --without development test --deployment
# Run database migrations (only if this is the 'green' environment and it's a new deployment)
# This requires careful handling to avoid running migrations on the 'blue' environment during a rollback.
# A common pattern is to use CodeDeploy's deployment group tagging or environment variables.
# For simplicity here, we assume migrations are handled carefully or are idempotent.
# Consider using a separate migration deployment strategy for critical applications.
RAILS_ENV=production bundle exec rails db:migrate
# Precompile assets
RAILS_ENV=production bundle exec rails assets:precompile
scripts/application_start.sh
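#!/bin/bash
cd /var/www/my_ruby_app || exit 1
mkdir -p log
# Start Puma in the background. CodeDeploy waits for the hook script to exit,
# so a foreground server would block the deployment until the hook times out.
# Make sure Puma binds to 0.0.0.0 on the port the target group and health check
# expect (port 80 in the examples above, typically through a reverse proxy such
# as nginx, since an unprivileged user cannot bind ports below 1024).
RAILS_ENV=production nohup bundle exec puma -C config/puma.rb > log/puma.log 2>&1 < /dev/null &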
scripts/validate_service.sh
#!/bin/bash
# This script is executed by CodeDeploy to validate the deployment.
# It should perform checks to ensure the application is running correctly.
# A simple check is to curl the health endpoint.
HEALTH_CHECK_URL="http://localhost/health" # Or the internal IP of the instance
# Poll for at most about 50 seconds so the script finishes inside the 60-second hook timeout
for i in {1..25}; do
  if curl -s --fail --max-time 1 "$HEALTH_CHECK_URL" > /dev/null 2>&1; then
    echo "Application is healthy."
    exit 0
  fi
  echo "Waiting for application to become healthy... ($i/25)"
  sleep 1
done
echo "Application did not become healthy within the timeout period."
exit 1
Performing a Blue-Green Deployment
With the infrastructure and application artifact ready, we can initiate a deployment. This process involves creating a CodeDeploy deployment and then manually shifting traffic.
1. Creating the Deployment Package
Package your application code, appspec.yml, and the scripts/ directory into a ZIP file. Upload this ZIP file to an S3 bucket that CodeDeploy can access.
Example: Creating the ZIP archive
# Assuming you are in the root directory of your Rails application
zip -r my-ruby-app-v1.0.0.zip appspec.yml scripts/ app/ bin/ config/ db/ lib/ public/ vendor/ Gemfile Gemfile.lock Rakefile config.ru
Example: Uploading to S3
aws s3 cp my-ruby-app-v1.0.0.zip s3://your-codedeploy-bucket/my-ruby-app/my-ruby-app-v1.0.0.zip
2. Initiating the CodeDeploy Deployment
We will deploy to the ‘Green’ deployment group. CodeDeploy will provision new instances for the ‘Green’ ASG, deploy the application to them, and run the lifecycle hooks. The ‘Blue’ ASG remains untouched and continues serving traffic.
AWS CLI Example: Create Deployment
aws deploy create-deployment \
--application-name my-ruby-app \
--deployment-group-name my-app-green \
--description "Deploying v1.0.0 to green environment" \
--revision '{"revisionType": "S3", "s3Location": {"bucket": "your-codedeploy-bucket", "key": "my-ruby-app/my-ruby-app-v1.0.0.zip", "bundleType": "zip"}}' \
--ignore-application-stop-failures
3. Monitoring the Deployment
Monitor the deployment progress in the AWS CodeDeploy console. You’ll see new instances being launched in the ‘Green’ ASG. Once the ValidateService hook passes, the instances in the ‘Green’ ASG will be registered with the ‘Green’ target group.
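Outside the console, the same status can be polled from a script. The sketch below is one possible approach, not the only one: it shells out to aws deploy get-deployment and stops once the deployment reaches a settled status. The helper names and the injected fetch_status and pause callables are assumptions made so the loop can be exercised without AWS access. The AWS CLI also offers aws deploy wait deployment-successful for the simple case.

```ruby
require "open3"

# Statuses at which polling should stop. "Ready" means the replacement
# fleet is waiting for traffic to be rerouted.
STOP_STATUSES = %w[Succeeded Failed Stopped Ready].freeze

# Default status fetcher: shells out to the AWS CLI, which is assumed to be
# installed and configured with credentials.
FETCH_STATUS_VIA_CLI = lambda do |deployment_id|
  output, process = Open3.capture2(
    "aws", "deploy", "get-deployment",
    "--deployment-id", deployment_id,
    "--query", "deploymentInfo.status",
    "--output", "text"
  )
  raise "aws deploy get-deployment failed" unless process.success?

  output.strip
end

# Poll until the deployment settles and return its final status string.
def wait_for_deployment(deployment_id, fetch_status: FETCH_STATUS_VIA_CLI,
                        interval: 15, max_attempts: 120,
                        pause: ->(seconds) { sleep(seconds) })
  max_attempts.times do
    status = fetch_status.call(deployment_id)
    return status if STOP_STATUSES.include?(status)

    pause.call(interval)
  end
  raise "deployment #{deployment_id} did not settle after #{max_attempts} checks"
end
```

A pipeline step could call wait_for_deployment with the ID returned by create-deployment and fail the build unless the result is "Succeeded" or "Ready".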
4. Shifting Traffic to ‘Green’
Once the ‘Green’ deployment is successful and validated, you need to manually shift traffic. This is done by updating the ALB listener rules. CodeDeploy can automate this if configured, but manual control offers an extra layer of safety.
AWS CLI Example: Get Listener ARN
LISTENER_ARN=$(aws elbv2 describe-listeners --load-balancer-arn $ALB_ARN --query "Listeners[0].ListenerArn" --output text)
AWS CLI Example: Update the Listener's Default Action to Forward to Green
GREEN_TG_ARN=$(aws elbv2 describe-target-groups --names my-app-green-tg --query "TargetGroups[0].TargetGroupArn" --output text)
# The default rule cannot be changed with modify-rule; update the listener's default action instead
aws elbv2 modify-listener \
--listener-arn "$LISTENER_ARN" \
--default-actions Type=forward,TargetGroupArn="$GREEN_TG_ARN"
After this command, all new traffic will be directed to the ‘Green’ environment. The ‘Blue’ environment instances are still running but receiving no traffic. You can monitor traffic on the ALB and application metrics.
5. Terminating ‘Blue’ Instances
Once you are confident that the ‘Green’ deployment is stable, you can terminate the ‘Blue’ instances. When CodeDeploy performs the traffic switch itself, its terminateBlueInstancesOnDeploymentSuccess setting handles this after the configured wait time. Because this walkthrough shifts traffic manually at the listener, plan to scale the ‘Blue’ ASG down yourself once you no longer need it as a rollback target.
If you need to manually trigger termination after a successful traffic shift:
# Scale the Blue ASG to zero. The minimum size has to be lowered as well,
# because the desired capacity cannot drop below the group's min-size (2 above).
aws autoscaling update-auto-scaling-group \
--auto-scaling-group-name my-app-asg-blue \
--min-size 0 \
--desired-capacity 0
Rollback Strategy
The primary advantage of blue-green deployments is the ease of rollback. If issues are detected in the ‘Green’ environment after the traffic shift, you simply reverse the traffic routing, provided the ‘Blue’ instances are still running.
1. Reverting Traffic to ‘Blue’
To roll back, you reconfigure the ALB listener to forward traffic back to the ‘Blue’ target group. The ‘Green’ instances will then be terminated by CodeDeploy (if configured) or manually scaled down.
AWS CLI Example: Revert the Listener's Default Action to Blue
BLUE_TG_ARN=$(aws elbv2 describe-target-groups --names my-app-blue-tg --query "TargetGroups[0].TargetGroupArn" --output text)
# The default rule is changed through the listener, not through modify-rule
aws elbv2 modify-listener \
--listener-arn "$LISTENER_ARN" \
--default-actions Type=forward,TargetGroupArn="$BLUE_TG_ARN"
After reverting traffic, the ‘Blue’ environment serves all requests again, and the ‘Green’ ASG can be scaled down to zero so that its instances are terminated.
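The shift and the revert are the same operation with different target groups, which makes them easy to wrap in one guarded step. The following is a sketch under stated assumptions rather than a finished tool: forward_command and shift_with_rollback are hypothetical helper names, run is whatever executes a command (an argv array), and healthy is any probe you trust, such as a CloudWatch alarm state or an HTTP check.

```ruby
# Build the AWS CLI argument list that points a listener's default action
# at a single target group.
def forward_command(listener_arn, target_group_arn)
  ["aws", "elbv2", "modify-listener",
   "--listener-arn", listener_arn,
   "--default-actions", "Type=forward,TargetGroupArn=#{target_group_arn}"]
end

# Shift traffic to green, then probe health `checks` times. On the first
# failed probe the listener is pointed back at blue. `run` and `healthy`
# are injected so the flow can be dry-run without touching AWS.
def shift_with_rollback(listener_arn:, blue_arn:, green_arn:, run:, healthy:,
                        checks: 5, interval: 30,
                        pause: ->(seconds) { sleep(seconds) })
  run.call(forward_command(listener_arn, green_arn))
  checks.times do
    unless healthy.call
      run.call(forward_command(listener_arn, blue_arn))
      return :rolled_back
    end
    pause.call(interval)
  end
  :promoted
end
```

For a real run, run could be ->(argv) { system(*argv, exception: true) }; for a dry run it can simply print the commands it would have executed.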
Advanced Considerations and Best Practices
While the core blue-green strategy is robust, several advanced considerations can enhance its effectiveness and reliability.
1. Database Migrations
Database schema changes are the most challenging aspect of zero-downtime deployments. Migrations must be backward-compatible. A common approach is a multi-step process:
- Deploy code that adds new database columns or tables (without removing old ones).
- Deploy code that uses the new columns/tables.
- Deploy code that removes the old columns/tables.
This requires careful coordination between application code deployments and database schema changes, often managed outside the standard CodeDeploy flow or with custom hooks.
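The ordering constraint behind the three steps can be checked with a toy model: at every step, each application version that may still be serving traffic must find the columns it uses. The example below assumes a hypothetical rename of users.name to users.full_name and deliberately ignores backfilling and dual writes, which a real migration would also need.

```ruby
# Columns each application version reads or writes (hypothetical example).
REQUIRED_COLUMNS = {
  v1: %w[name],
  v2: %w[full_name]
}.freeze

# Return the names of steps at which some running version would hit a
# missing column.
def unsafe_steps(plan)
  plan.reject do |step|
    step[:running].all? do |version|
      (REQUIRED_COLUMNS.fetch(version) - step[:schema]).empty?
    end
  end.map { |step| step[:name] }
end

# Expand, switch, contract: old and new columns coexist while both
# versions can receive traffic.
EXPAND_CONTRACT = [
  { name: "expand: add full_name", schema: %w[name full_name], running: %i[v1] },
  { name: "switch: green runs v2, blue kept for rollback", schema: %w[name full_name], running: %i[v1 v2] },
  { name: "contract: drop name", schema: %w[full_name], running: %i[v2] }
].freeze

# Renaming in place breaks v1 while it is still serving traffic.
IN_PLACE_RENAME = [
  { name: "rename name to full_name", schema: %w[full_name], running: %i[v1] },
  { name: "switch to v2", schema: %w[full_name], running: %i[v2] }
].freeze

p unsafe_steps(EXPAND_CONTRACT)  # => []
p unsafe_steps(IN_PLACE_RENAME)  # => ["rename name to full_name"]
```

The contract step is only safe once ‘Blue’ is no longer a rollback target, which is why dropping old columns usually waits for a later release.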
2. Session Management
If your application relies on in-memory sessions, a traffic switch can cause users to lose their sessions. Implement a shared session store like Redis or Memcached, or use cookie-based sessions with proper encryption.
3. Canary Deployments
For even greater safety, consider a canary deployment strategy within the blue-green framework. Instead of shifting 100% of traffic at once, gradually shift a small percentage (e.g., 5%) to the ‘Green’ environment. Monitor closely, and if all metrics are good, increase the percentage incrementally.
This can be achieved by configuring the ALB listener rules with weighted target groups, or by using AWS Route 53 weighted routing before the ALB.
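With weighted target groups, the listener's default action carries a ForwardConfig that lists both target groups and their weights. A minimal sketch, assuming a hypothetical weighted_forward helper, of building that payload so the JSON can be passed to aws elbv2 modify-listener --default-actions:

```ruby
require "json"

# Build the default-actions payload for a weighted forward between the
# blue and green target groups. green_percent is the share sent to green.
def weighted_forward(blue_arn:, green_arn:, green_percent:)
  unless green_percent.is_a?(Integer) && (0..100).cover?(green_percent)
    raise ArgumentError, "green_percent must be an integer from 0 to 100"
  end

  [{
    "Type" => "forward",
    "ForwardConfig" => {
      "TargetGroups" => [
        { "TargetGroupArn" => blue_arn, "Weight" => 100 - green_percent },
        { "TargetGroupArn" => green_arn, "Weight" => green_percent }
      ]
    }
  }]
end

# One possible canary schedule; the percentages are an example, not a rule.
CANARY_STEPS = [5, 25, 50, 100].freeze

if __FILE__ == $PROGRAM_NAME
  # Placeholder ARNs; substitute the real target group ARNs.
  puts JSON.generate(weighted_forward(blue_arn: "BLUE_TG_ARN", green_arn: "GREEN_TG_ARN",
                                      green_percent: CANARY_STEPS.first))
end
```

Stepping through CANARY_STEPS with a pause and a metrics check between steps gives the gradual shift described above, and setting green_percent back to 0 is the rollback.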
4. CI/CD Integration
Integrate this blue-green deployment process into your CI/CD pipeline (e.g., Jenkins, GitLab CI, AWS CodePipeline). The pipeline should cover:
- Building the application artifact.
- Uploading to S3.
- Triggering the CodeDeploy deployment.
- Monitoring the deployment status.
- Manually approving the traffic shift (or automating it based on metrics).
- Performing the traffic shift.
- Monitoring post-shift stability.
- Automating rollback if necessary.
AWS CodePipeline can orchestrate many of these steps, integrating with CodeBuild for artifact creation and CodeDeploy for deployment execution.
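Whatever tool runs the pipeline, the control flow is the same: run the steps in order, stop at the first failure, and roll back only if traffic has already been shifted. The sketch below models that flow in plain Ruby; run_pipeline and the step names are illustrative assumptions, not part of any CI system's API.

```ruby
# Run named steps in order. Each step is a callable that returns true on
# success. If a step fails after the traffic shift has happened, the
# rollback callable is invoked.
def run_pipeline(steps, rollback:, shift_step: :shift_traffic)
  shifted = false
  steps.each do |name, action|
    unless action.call
      rollback.call if shifted
      return { status: :failed, failed_step: name, rolled_back: shifted }
    end
    shifted = true if name == shift_step
  end
  { status: :succeeded, failed_step: nil, rolled_back: false }
end

# Example wiring with placeholder steps; real steps would call the build
# tool, the AWS CLI, and a monitoring check.
EXAMPLE_STEPS = {
  build_artifact: -> { true },
  upload_to_s3: -> { true },
  deploy_to_green: -> { true },
  approve_shift: -> { true },  # manual approval gate
  shift_traffic: -> { true },
  verify_stability: -> { true }
}.freeze
```

An approval step that returns false halts the pipeline before any traffic moves, so no rollback is needed; a failed stability check after the shift triggers the rollback callable.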