Disaster Recovery 101: Architecting Auto-Failovers for Redis and Ruby Deployments on DigitalOcean
Establishing a Redis Sentinel Cluster for High Availability
For robust Redis deployments, a single instance is a single point of failure. Implementing Redis Sentinel provides automatic failover and high availability. This section details the setup of a three-node Sentinel cluster on DigitalOcean Droplets, ensuring quorum and resilience.
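We’ll assume you have three Droplets provisioned as Sentinel nodes (10.10.0.6 to 10.10.0.8 in the examples), each running Ubuntu 22.04 LTS, alongside a primary Redis Droplet (10.10.0.5) and at least one replica, since Sentinel can only fail over to a replica that already exists. For simplicity, we’ll use private IP addresses for inter-node communication. Ensure your firewall rules allow traffic on port 26379 (Sentinel) and 6379 (Redis) between these Droplets.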
Sentinel Configuration (`sentinel.conf`)
On each Sentinel node, create or modify the sentinel.conf file. The key parameters are:
- `port 26379`: The default Sentinel port.
- `sentinel monitor mymaster <master-ip> 6379 2`: This is the core directive. `mymaster` is the arbitrary name for your Redis master. `<master-ip>` should be the private IP of your primary Redis instance. `6379` is the Redis port. `2` is the quorum, the number of Sentinels that must agree a master is down before initiating a failover. For a 3-node cluster, a quorum of 2 is appropriate.
- `sentinel down-after-milliseconds mymaster 5000`: The time in milliseconds a Sentinel must wait without receiving a reply from a Redis instance before marking it as “Subjectively Down” (SDOWN).
- `sentinel failover-timeout mymaster 10000`: The maximum time in milliseconds allowed for a failover to complete.
- `sentinel parallel-syncs mymaster 1`: The number of replicas that can be reconfigured to sync with the new master simultaneously during a failover.
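To make the quorum reasoning concrete, here is a minimal Ruby sketch of the arithmetic. It relies on the documented Sentinel behavior that the quorum only governs failure detection, while the failover itself must be authorized by a majority of Sentinels. The helper names (`sentinel_majority`, `quorum_report`) are illustrative and not part of any library.

```ruby
# Majority of Sentinels needed to authorize a failover: floor(n / 2) + 1.
def sentinel_majority(sentinel_count)
  sentinel_count / 2 + 1
end

# Summarize what a quorum setting means for a given number of Sentinels.
def quorum_report(sentinel_count, quorum)
  unless (1..sentinel_count).cover?(quorum)
    raise ArgumentError, "quorum must be between 1 and #{sentinel_count}"
  end

  majority = sentinel_majority(sentinel_count)
  # A failover needs both the quorum (detection) and the majority (authorization).
  needed = [quorum, majority].max
  {
    quorum: quorum,
    majority: majority,
    tolerated_failures: sentinel_count - needed
  }
end

report = quorum_report(3, 2)
puts "3 Sentinels, quorum 2: majority=#{report[:majority]}, " \
     "tolerated Sentinel failures=#{report[:tolerated_failures]}"
```

With three Sentinels and a quorum of 2, one Sentinel can be lost and a failover can still proceed, which is why three nodes is the usual minimum.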
Here’s an example configuration for Sentinel Node 1, assuming your master Redis is on 10.10.0.5:
# sentinel.conf on Sentinel Node 1 (e.g., 10.10.0.6)
port 26379
sentinel monitor mymaster 10.10.0.5 6379 2
sentinel down-after-milliseconds mymaster 5000
sentinel failover-timeout mymaster 10000
sentinel parallel-syncs mymaster 1
# Replicas are discovered automatically through the master; they are not listed here.
# If your Redis instances require a password:
# sentinel auth-pass mymaster YourRedisPassword
Repeat this configuration on Sentinel Node 2 (e.g., 10.10.0.7) and Sentinel Node 3 (e.g., 10.10.0.8). The sentinel monitor line is the same on every node: <master-ip> always points to your primary Redis instance. If your Redis instances are configured with a password, uncomment and set sentinel auth-pass mymaster YourRedisPassword on all Sentinel nodes.
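Since the same file goes on every Sentinel node, it can help to generate it rather than edit it by hand. The following is a rough sketch that renders the directives described above from a few parameters; `render_sentinel_conf` and its keyword arguments are hypothetical names chosen for this example.

```ruby
# Build the text of a sentinel.conf from the parameters discussed above.
def render_sentinel_conf(master_name:, master_ip:, quorum:, master_port: 6379,
                         down_after_ms: 5000, failover_timeout_ms: 10_000,
                         parallel_syncs: 1, auth_pass: nil)
  lines = [
    "port 26379",
    "sentinel monitor #{master_name} #{master_ip} #{master_port} #{quorum}",
    "sentinel down-after-milliseconds #{master_name} #{down_after_ms}",
    "sentinel failover-timeout #{master_name} #{failover_timeout_ms}",
    "sentinel parallel-syncs #{master_name} #{parallel_syncs}"
  ]
  # Only emit auth-pass when the Redis instances are password protected.
  lines << "sentinel auth-pass #{master_name} #{auth_pass}" if auth_pass
  lines.join("\n") + "\n"
end

conf = render_sentinel_conf(master_name: "mymaster", master_ip: "10.10.0.5", quorum: 2)
puts conf
```

The output can be written to each node's sentinel.conf by whatever provisioning tool you already use.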
Starting Redis and Sentinel Services
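First, ensure Redis is installed and configured to run as a service. The default redis.conf is usually sufficient for a basic setup, but ensure it’s bound to an interface the other Droplets can reach, preferably your Droplet’s private IP (bind 0.0.0.0 also works, but exposes Redis on the public interface unless a firewall blocks it), and that persistence is enabled (e.g., appendonly yes). On each replica, also set replicaof <master-ip> 6379 so that it replicates from the primary.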
On your primary Redis Droplet (e.g., 10.10.0.5):
sudo systemctl start redis-server
sudo systemctl enable redis-server
On each Sentinel Droplet (e.g., 10.10.0.6, 10.10.0.7, 10.10.0.8):
sudo systemctl start redis-sentinel
sudo systemctl enable redis-sentinel
Verify the status of the Sentinel service:
sudo systemctl status redis-sentinel
You should see output indicating the Sentinel is running and has connected to other Sentinels. After a short period, you can check the master’s status from any Sentinel:
redis-cli -p 26379 SENTINEL master mymaster
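This command returns the master’s details as field/value pairs, including its IP, port, flags, the number of replicas (num-slaves), the number of other Sentinels that know about it (num-other-sentinels), and the configured quorum.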
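The raw reply is a flat list of alternating field names and values, which is awkward to inspect by eye. As a small sketch, the reply can be turned into a Hash and checked programmatically. The sample reply below is illustrative only: real output contains more fields and different values, and `parse_sentinel_reply` is a hypothetical helper.

```ruby
# Convert the flat [field, value, field, value, ...] reply into a Hash.
def parse_sentinel_reply(flat_reply)
  flat_reply.each_slice(2).to_h
end

# Illustrative reply only; a real Sentinel returns many more fields.
sample_reply = [
  "name", "mymaster",
  "ip", "10.10.0.5",
  "port", "6379",
  "flags", "master",
  "num-slaves", "1",
  "num-other-sentinels", "2",
  "quorum", "2"
]

master = parse_sentinel_reply(sample_reply)

# A plain "master" flag means no s_down/o_down condition is currently reported.
healthy = master["flags"] == "master" &&
          Integer(master["num-other-sentinels"]) >= 1

puts "#{master['name']} at #{master['ip']}:#{master['port']} healthy=#{healthy}"
```

A value of 2 for num-other-sentinels on a three-node setup indicates that this Sentinel has discovered both of its peers.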
Integrating Ruby Applications with Redis Sentinel
Your Ruby application needs to be aware of the Sentinel cluster to connect to the current Redis master. The redis-rb gem provides excellent support for this. Instead of connecting directly to a single Redis instance, you configure it to use Sentinel.
Gemfile Configuration
Ensure you have the redis gem in your Gemfile:
# Gemfile gem 'redis'
Run bundle install to install it.
Redis Client Initialization
In your application’s initialization code (e.g., an initializer in Rails, or a central configuration file in Sinatra/other frameworks), configure the Redis client to use Sentinel:
# config/initializers/redis.rb (Rails example)
# Define your Sentinel nodes and master name
sentinel_hosts = [
  { host: '10.10.0.6', port: 26379 }, # Sentinel Node 1 Private IP
  { host: '10.10.0.7', port: 26379 }, # Sentinel Node 2 Private IP
  { host: '10.10.0.8', port: 26379 }  # Sentinel Node 3 Private IP
]
redis_master_name = 'mymaster' # Must match sentinel.conf
# Initialize the Redis client using Sentinel
# The `redis-rb` gem will automatically discover the current master
# through the Sentinel cluster.
$redis = Redis.new(
  # redis-rb 5.x reads the master name from `name:`. On redis-rb 4.x,
  # pass `url: "redis://mymaster"` instead of `name:`.
  name: redis_master_name,
  sentinels: sentinel_hosts,
  role: :master, # Explicitly state we want the master
  # The master's host and port are discovered via Sentinel, so none are set here.
  # If your Redis requires a password:
  # password: 'YourRedisPassword'
)
# Optional: Verify connection and role
begin
  replication = $redis.info('replication')
  puts "Connected to Redis master: #{$redis.connection[:host]}:#{$redis.connection[:port]}"
  puts "Redis role: #{replication['role']}"
rescue Redis::CannotConnectError => e
  Rails.logger.error "Failed to connect to Redis: #{e.message}"
  # Handle connection error appropriately, e.g., retry or alert
end
The redis-rb gem, when configured with sentinels and the master name, queries the Sentinel cluster to discover the current master’s address. After a failover the existing connection breaks, and on reconnect the client asks the Sentinels again and finds the new master. Commands issued while the failover is in progress can still raise connection errors, so calling code should be prepared to retry. The role: :master option ensures the client connects to the master, not a replica.
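Commands issued while a failover is in progress can fail before the client has found the new master, so it is common to wrap Redis calls in a bounded retry. The sketch below shows one way to do that with exponential backoff. It uses a stand-in error class so it runs without the gem; in a real application you would likely rescue `Redis::BaseConnectionError` instead. `with_failover_retry` and its parameters are hypothetical names for this example.

```ruby
# Stand-in for Redis::BaseConnectionError so the sketch runs without the gem.
class TransientConnectionError < StandardError; end

# Retry the block on transient connection errors, doubling the delay each time.
# `sleeper` is injectable so the wait can be skipped or recorded in tests.
def with_failover_retry(attempts: 5, base_delay: 0.2,
                        error: TransientConnectionError,
                        sleeper: ->(seconds) { sleep(seconds) })
  tries = 0
  begin
    tries += 1
    yield
  rescue error
    raise if tries >= attempts # give up and surface the last error
    sleeper.call(base_delay * (2**(tries - 1)))
    retry
  end
end

# Simulated operation: fails twice (failover in progress), then succeeds.
calls = 0
delays = []
result = with_failover_retry(sleeper: ->(seconds) { delays << seconds }) do
  calls += 1
  raise TransientConnectionError, "master unreachable" if calls < 3
  "PONG"
end

puts "result=#{result} after #{calls} calls"
```

The total retry budget (attempts and delays) should comfortably exceed your down-after-milliseconds setting, otherwise the retries run out before Sentinel has even started the failover.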
Testing Failover
To simulate a failover, you can manually stop the primary Redis instance. On the primary Redis Droplet:
sudo systemctl stop redis-server
Observe the logs on your Sentinel nodes. You should see Sentinels detecting the master as down, electing a leader Sentinel, and promoting a replica to become the new master. This requires at least one replica: without one, the Sentinels report the master as down but have nothing to promote. Your Ruby application, upon its next Redis operation that requires a connection, will query the Sentinels, discover the new master, and reconnect.
You can verify the new master by running redis-cli -p 26379 SENTINEL master mymaster on any Sentinel node. The output should reflect the new master’s IP address.
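When scripting a failover test, it is handy to wait until the Sentinels actually report a different master. Here is a minimal polling sketch. The `lookup` callable is an assumption: in practice it would wrap a call to `SENTINEL get-master-addr-by-name mymaster`, which returns the IP and port; here a fake lookup stands in so the example runs on its own, and 10.10.0.9 is a hypothetical replica address.

```ruby
# Poll `lookup` until it reports an address different from `old_address`.
# `lookup` is any callable returning [ip, port]. Returns the new address,
# or nil if the timeout elapses first.
def wait_for_new_master(old_address, lookup:, timeout: 60, interval: 1,
                        sleeper: ->(seconds) { sleep(seconds) })
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  loop do
    current = lookup.call
    return current if current && current != old_address
    return nil if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline
    sleeper.call(interval)
  end
end

# Fake lookup: reports the old master twice, then a promoted replica.
answers = [["10.10.0.5", "6379"], ["10.10.0.5", "6379"], ["10.10.0.9", "6379"]]
fake_lookup = -> { answers.length > 1 ? answers.shift : answers.first }

new_master = wait_for_new_master(["10.10.0.5", "6379"],
                                 lookup: fake_lookup,
                                 sleeper: ->(_seconds) {})
puts "New master: #{new_master.join(':')}"
```

Measuring the time this call takes during a real test gives a practical estimate of your failover window.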
Architecting for Resilience: HAProxy for Application Load Balancing
While Redis Sentinel handles Redis failover, your application servers might also experience issues. A robust architecture often involves load balancing for the application layer itself. HAProxy is an excellent choice for this, providing high availability and load balancing for your Ruby web applications (e.g., Rails, Sinatra).
HAProxy Configuration (`haproxy.cfg`)
We’ll set up HAProxy to distribute traffic across multiple application server instances. For true high availability of HAProxy itself, you would typically run two HAProxy instances in an active/passive or active/active setup using Keepalived or similar. For this example, we focus on a single HAProxy instance acting as a load balancer for your Ruby app servers.
Assume you have at least two Ruby application servers (e.g., Puma/Unicorn) running on Droplets 10.10.0.10 and 10.10.0.11, listening on port 3000.
# /etc/haproxy/haproxy.cfg
global
    log /dev/log local0
    log /dev/log local1 notice
    chroot /var/lib/haproxy
    stats socket /run/haproxy/admin.sock mode 660 level admin expose-fd listeners
    stats timeout 30s
    user haproxy
    group haproxy
    daemon
defaults
    log global
    mode http
    option httplog
    option dontlognull
    timeout connect 5000
    timeout client 50000
    timeout server 50000
    errorfile 400 /etc/haproxy/errors/400.http
    errorfile 403 /etc/haproxy/errors/403.http
    errorfile 408 /etc/haproxy/errors/408.http
    errorfile 500 /etc/haproxy/errors/500.http
    errorfile 502 /etc/haproxy/errors/502.http
    errorfile 503 /etc/haproxy/errors/503.http
    errorfile 504 /etc/haproxy/errors/504.http
frontend http_frontend
    bind *:80
    # HAProxy only fronts the application servers. The application reaches
    # Redis through the Sentinel-aware client configured in the previous
    # section; HAProxy does not proxy Redis traffic in this setup.
    default_backend http_backend
backend http_backend
    balance roundrobin
    # Health check endpoint on your app
    option httpchk GET /health
    http-request set-header X-Forwarded-Port %[dst_port]
    http-request add-header X-Forwarded-Proto https if { ssl_fc }
    server app1 10.10.0.10:3000 check
    server app2 10.10.0.11:3000 check
    # Add more app servers as needed
    # server app3 10.10.0.12:3000 check
In this configuration:
- The `global` section sets up logging and daemonization.
- The `defaults` section defines common settings for all frontends and backends, including timeouts and error files.
- The `frontend http_frontend` listens on port 80 and directs all incoming HTTP traffic to the `http_backend`.
- The `backend http_backend` uses the `roundrobin` algorithm to distribute requests.
- `option httpchk GET /health` configures HAProxy to periodically send a GET request to the `/health` endpoint on each application server. If a server fails this check, HAProxy will temporarily remove it from the pool of active servers.
- `server app1 10.10.0.10:3000 check` defines an application server instance. The `check` directive enables health checking.
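To illustrate how round-robin balancing interacts with health checks, here is a toy Ruby model of the backend pool. It is a deliberate simplification and not how HAProxy is implemented: real HAProxy applies server weights and only marks a server down after several consecutive failed checks. `BackendPool` is a hypothetical class for this example.

```ruby
# Toy model of a backend pool: round-robin over servers marked healthy.
class BackendPool
  def initialize(servers)
    @servers = servers
    @healthy = servers.to_h { |server| [server, true] }
    @cursor = 0
  end

  # Record the outcome of a health check for one server.
  def mark(server, healthy)
    @healthy[server] = healthy
  end

  # Return the next healthy server, or nil when none is available
  # (HAProxy would answer 503 in that situation).
  def next_server
    @servers.size.times do
      candidate = @servers[@cursor % @servers.size]
      @cursor += 1
      return candidate if @healthy[candidate]
    end
    nil
  end
end

pool = BackendPool.new(%w[app1 app2])
before_failure = [pool.next_server, pool.next_server] # alternates between both
pool.mark("app1", false)                              # app1 fails /health
after_failure = [pool.next_server, pool.next_server]  # only app2 is used

puts "before: #{before_failure.inspect}, after: #{after_failure.inspect}"
```

Once `app1` is marked healthy again, it rejoins the rotation on the next pass, mirroring how HAProxy re-adds a recovered server.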
Install HAProxy:
sudo apt update
sudo apt install haproxy -y
After configuring /etc/haproxy/haproxy.cfg, restart HAProxy:
sudo systemctl restart haproxy
sudo systemctl enable haproxy
Ensure your DigitalOcean firewall allows inbound traffic on port 80 to your HAProxy Droplet.
Application Health Check Endpoint
Your Ruby application needs a simple endpoint that HAProxy can query to determine its health. This endpoint should check critical dependencies, most importantly, the connection to Redis.
# routes/health.rb (Sinatra example)
get '/health' do
  begin
    # Check Redis connection
    $redis.ping # Or any other simple Redis command
    status 200
    body 'OK'
  rescue Redis::CannotConnectError, Redis::TimeoutError => e
    logger.error "Health check failed: Redis connection error - #{e.message}"
    status 503 # Service Unavailable
    body 'Redis connection error'
  rescue => e
    logger.error "Health check failed: Unexpected error - #{e.message}"
    status 500 # Internal Server Error
    body 'Internal server error'
  end
end
# For Rails, you might create a controller and route:
# app/controllers/health_controller.rb
# class HealthController < ApplicationController
#   def show
#     begin
#       $redis.ping
#       render json: { status: 'OK' }, status: :ok
#     rescue Redis::CannotConnectError, Redis::TimeoutError => e
#       render json: { error: 'Redis connection error' }, status: :service_unavailable
#     rescue => e
#       render json: { error: 'Internal server error' }, status: :internal_server_error
#     end
#   end
# end
#
# config/routes.rb
# get '/health', to: 'health#show'
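When an application server fails its health check (e.g., due to Redis unavailability or an internal error), HAProxy will stop sending traffic to it. Once the server recovers and passes health checks again, HAProxy will automatically re-add it to the rotation. Keep in mind that every application server depends on the same Redis: if Redis stays unreachable long enough, all servers fail their checks at once and HAProxy answers 503 until they pass again.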
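If the endpoint grows to cover more than Redis, the status logic can be factored into a small framework-independent function. The following is a sketch under that assumption: each dependency is a callable that raises on failure, and the function maps the results to an HTTP status and body. `health_response` and the check names are hypothetical; in the application, the Redis check would call `$redis.ping`.

```ruby
# Run each named dependency check and map the outcome to [status, body].
# A check passes when its callable returns without raising.
def health_response(checks)
  failures = {}
  checks.each do |name, check|
    begin
      check.call
    rescue StandardError => e
      failures[name] = e.message
    end
  end

  return [200, "OK"] if failures.empty?

  # Any failed dependency makes the instance unavailable to the load balancer.
  [503, failures.map { |name, message| "#{name}: #{message}" }.join("; ")]
end

# Stand-ins for real probes so the sketch runs on its own.
ok_checks  = { redis: -> { "PONG" } }
bad_checks = { redis: -> { raise IOError, "connection refused" } }

puts health_response(ok_checks).inspect
puts health_response(bad_checks).inspect
```

A Sinatra or Rails action can then set the status and body from the returned pair, keeping the route itself trivial.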
Automated Failover Strategy Summary
This architecture provides a multi-layered approach to automated failover:
- Redis High Availability: Redis Sentinel monitors the Redis master and automatically promotes a replica if the master becomes unavailable. Your Ruby application, configured to use Sentinel, reconnects to the new master on its next connection attempt.
- Application High Availability: HAProxy load balances incoming traffic across multiple application server instances. Its health checking mechanism detects unresponsive or unhealthy application servers and temporarily removes them from rotation, ensuring traffic is only sent to healthy instances.
By combining these technologies, you create a resilient system where failures in individual components (Redis master, application server instance) are handled automatically. Users see a brief interruption, bounded mainly by the detection timeouts you configure, rather than an outage that waits for manual intervention. The key is the correct configuration of Sentinel for Redis and the health check integration within HAProxy for your application layer.
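As a rough worked example of those detection windows, the arithmetic can be sketched in Ruby. The Redis figure uses the down-after-milliseconds value from the sentinel.conf shown earlier and is only a lower bound, since agreement between Sentinels, leader election, and promotion add time on top. The HAProxy figure assumes its documented defaults (`inter` 2s, `fall` 3), because the haproxy.cfg above does not override them. Both helper names are hypothetical.

```ruby
# Lower bound on Redis failure detection: a Sentinel waits at least
# down-after-milliseconds before flagging the master as subjectively down.
def sentinel_detection_floor_s(down_after_ms)
  down_after_ms / 1000.0
end

# Approximate time for HAProxy to mark a server down: `fall` consecutive
# failed checks spaced `inter` milliseconds apart.
def haproxy_detection_s(inter_ms:, fall:)
  inter_ms * fall / 1000.0
end

redis_floor = sentinel_detection_floor_s(5000)             # from sentinel.conf
app_window  = haproxy_detection_s(inter_ms: 2000, fall: 3) # assumed defaults

puts "Redis master: at least #{redis_floor}s before a failover can start"
puts "App server: roughly #{app_window}s until removal from rotation"
```

Lowering these values shortens the interruption but makes false positives more likely on a briefly congested network, so tune them against your own latency measurements.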