Building a High-Availability, Cost-Optimized Ruby Stack on Linode

Leveraging Linode for a Resilient and Economical Ruby Deployment

For CTOs and VPs of Engineering tasked with balancing performance, availability, and budget, deploying a Ruby application stack on cloud infrastructure demands a strategic approach. Linode, with its transparent pricing and robust feature set, offers a compelling platform. This post details a production-ready, high-availability configuration for a Ruby on Rails application, emphasizing cost optimization through intelligent resource allocation and open-source tooling.

Core Stack Components and Architecture

Our target architecture prioritizes redundancy and scalability while minimizing unnecessary overhead. We’ll employ a multi-server setup:

Web Servers (Nginx): Two instances acting as reverse proxies and serving static assets.
Application Servers (Puma): Two instances, each running multiple worker processes to handle Ruby code execution.
Database Server (PostgreSQL): A dedicated, managed instance for data persistence.
Load Balancer (HAProxy): A small instance distributing traffic across the web servers. As a single instance it remains a single point of failure unless paired with a standby.

This tiered approach allows for independent scaling of components and provides failover capabilities at critical layers.

Database Layer: PostgreSQL on Linode Managed Databases

For production Ruby applications, a robust relational database is non-negotiable. Linode’s Managed Databases for PostgreSQL offer a significant advantage in terms of operational overhead and cost-effectiveness compared to self-managing a cluster. They provide automated backups, point-in-time recovery, and high availability without requiring deep PostgreSQL expertise.

Cost Optimization Tip: Select a database instance size that closely matches your current workload. Monitor performance metrics and scale up only when necessary. Linode’s pricing is predictable, making budget forecasting easier.

Application Servers: Puma Cluster Mode on Separate Instances

We’ll deploy two separate Linode instances for our Ruby application servers. Each instance will run Puma in cluster mode, allowing it to manage multiple worker processes and threads efficiently. This provides both load distribution within a single application server and failover if one instance becomes unavailable.

Puma Configuration (`config/puma.rb`)

A typical `config/puma.rb` for this setup would look like this:

# config/puma.rb
require 'dotenv/load'

# Set the environment
environment ENV.fetch('RAILS_ENV') { 'production' }

# Number of workers to spawn.
# A common starting point is one worker per CPU core; 2 suits a 2-core instance.
workers ENV.fetch('WEB_CONCURRENCY') { 2 }.to_i

# Number of threads per worker.
# Adjust based on your application's I/O bound vs CPU bound nature.
threads_count = ENV.fetch('RAILS_MAX_THREADS') { 5 }.to_i
threads threads_count, threads_count

# Bind to a TCP socket for communication with Nginx.
# Use a Unix socket if Nginx and Puma are on the same host, but for
# separate instances, TCP is necessary. 0.0.0.0 listens on every interface,
# so restrict port 3000 to the Nginx hosts with a firewall.
bind "tcp://0.0.0.0:#{ENV.fetch('PORT') { 3000 }}"

# Set the path to the PID file.
pidfile ENV.fetch('PIDFILE') { 'tmp/pids/server.pid' }

# Redirect Puma's stdout and stderr to the log file (append mode).
log_path = ENV.fetch('LOGFILE') { 'log/puma.log' }
stdout_redirect log_path, log_path, true

# Load the application in the master process before forking workers, so
# workers boot faster and share memory via copy-on-write. Note that
# preload_app! cannot be combined with Puma's phased restart.
preload_app!

# Allow Puma to be restarted by `rails restart` command.
plugin :tmp_restart

# Callbacks for deployment hooks
on_worker_boot do
  # Worker specific setup for Rails.
  ActiveRecord::Base.establish_connection if defined?(ActiveRecord::Base)
  # Reconnect any other per-process clients here (e.g., a Redis client).
end

# rack-timeout is Rack middleware, not a Puma plugin. If you use it,
# configure it in the Rails application (e.g., in an initializer).

Explanation:

workers: Set to 2 per application server instance. This means each Linode will run 2 Puma worker processes.
threads: Configured to allow for concurrent requests within each worker.
bind: Crucially, this binds to a TCP port. This allows Nginx on separate servers to communicate with Puma.
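preload_app!: Loads the application code before forking worker processes, which shortens worker boot time and lets workers share memory through copy-on-write. It is incompatible with Puma's phased restarts, so zero-downtime deployments here rely on restarting one application server at a time.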
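Worker and thread counts also determine how many database connections the application tier can open at once, which is worth knowing when sizing the managed PostgreSQL plan. A minimal sketch of the arithmetic, assuming each Puma thread holds at most one Active Record connection and using the values from the configuration above (the method name is hypothetical):

```ruby
# Rough upper bound on concurrent database connections for the app tier.
# Assumes each Puma thread holds at most one connection, which is the usual
# case when the Active Record pool size equals RAILS_MAX_THREADS.
def max_db_connections(app_servers:, workers:, threads:)
  app_servers * workers * threads
end

total = max_db_connections(app_servers: 2, workers: 2, threads: 5)
puts "Peak app-tier connections: #{total}"
```

Background jobs, consoles, and migrations open connections of their own, so the database's connection limit should leave headroom above this figure.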

Systemd Service for Puma

To ensure Puma starts on boot and can be managed easily, we’ll use systemd. Create a service file (e.g., `/etc/systemd/system/myapp.service`):

[Unit]
Description=My App Puma Server
After=network.target

[Service]
Type=simple
User=deploy
Group=deploy
WorkingDirectory=/home/deploy/my_app
Environment="RAILS_ENV=production"
Environment="PORT=3000"
Environment="PIDFILE=/home/deploy/my_app/tmp/pids/server.pid"
Environment="LOGFILE=/home/deploy/my_app/log/puma.log"
# Adjust WEB_CONCURRENCY and RAILS_MAX_THREADS based on Linode instance size
Environment="WEB_CONCURRENCY=2"
Environment="RAILS_MAX_THREADS=5"
ExecStart=/usr/local/bin/bundle exec puma -C /home/deploy/my_app/config/puma.rb
ExecStop=/bin/kill -s TERM $MAINPID
Restart=on-failure

[Install]
WantedBy=multi-user.target

Commands:

sudo systemctl daemon-reload
sudo systemctl enable myapp.service
sudo systemctl start myapp.service
sudo systemctl status myapp.service

Cost Optimization: By running two application servers, we gain redundancy. If one fails, the other continues to serve traffic. We can also choose smaller, less expensive Linode instances for these servers and scale horizontally by adding more instances if needed, rather than paying for a single, massive, and potentially underutilized instance.

Web Servers: Nginx as a Reverse Proxy

Two Nginx instances will sit in front of the Puma application servers. They will handle SSL termination, serve static assets directly (reducing load on Puma), and proxy dynamic requests to the appropriate Puma instance.

Nginx Configuration

Create a configuration file for your application (e.g., `/etc/nginx/sites-available/myapp`):

upstream puma_backend {
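    # Open-source Nginx performs passive health checks only: after max_fails
    # failed attempts within fail_timeout, the server is skipped for fail_timeout.
    server 192.168.1.10:3000 max_fails=3 fail_timeout=30s; # Replace with actual IP of App Server 1
    server 192.168.1.11:3000 max_fails=3 fail_timeout=30s; # Replace with actual IP of App Server 2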
    # Add more app servers if you scale horizontally
}

server {
    listen 80;
    server_name your_domain.com www.your_domain.com;

    # Redirect HTTP to HTTPS
    location / {
        return 301 https://$host$request_uri;
    }
}

server {
    listen 443 ssl http2;
    server_name your_domain.com www.your_domain.com;

    # SSL Configuration (using Let's Encrypt/Certbot)
    ssl_certificate /etc/letsencrypt/live/your_domain.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/your_domain.com/privkey.pem;
    include /etc/letsencrypt/options-ssl-nginx.conf;
    ssl_dhparam /etc/letsencrypt/ssl-dhparams.pem;

    # Serve static assets directly
    root /home/deploy/my_app/public;
    location ~ ^/(assets|packs)/ {
        gzip_static on;
        expires max;
        add_header Cache-Control public;
    }

    # Proxy dynamic requests to Puma
    location / {
        proxy_pass http://puma_backend;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_read_timeout 300s; # Increase timeout for long-running requests
        proxy_connect_timeout 75s;
    }

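    # Optional: WebSocket support (e.g., Action Cable). These directives are not
    # valid inside an "if" block; add them to the "location /" block above instead:
    #     proxy_http_version 1.1;
    #     proxy_set_header Upgrade $http_upgrade;
    #     proxy_set_header Connection "upgrade";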

    access_log /var/log/nginx/myapp.access.log;
    error_log /var/log/nginx/myapp.error.log;
}

Explanation:

upstream puma_backend: Defines a group of backend servers (your Puma instances). Nginx will round-robin requests to these servers.
proxy_pass http://puma_backend;: Forwards requests to the upstream group.
proxy_set_header: Passes important client information to the backend application.
ssl_certificate and ssl_certificate_key: Paths to your SSL certificates. Certbot is recommended for automated Let’s Encrypt certificate management.
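root and location ~ ^/(assets|packs)/: Directs Nginx to serve static files from the application’s public directory, bypassing Puma. Because Nginx runs on separate instances here, the compiled assets must be copied to that path on each web server during deployment.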

Commands:

sudo ln -s /etc/nginx/sites-available/myapp /etc/nginx/sites-enabled/
sudo nginx -t # Test configuration
sudo systemctl restart nginx

Load Balancer: HAProxy for High Availability

To spread traffic across both web servers and route around a failed one, we need a load balancer. While Linode offers managed load balancers (NodeBalancers), deploying HAProxy on a dedicated, small Linode instance provides a cost-effective and highly configurable solution. This HAProxy instance will distribute traffic to the two Nginx web servers.

HAProxy Configuration (`/etc/haproxy/haproxy.cfg`)

global
    log /dev/log    local0
    log /dev/log    local1 notice
    chroot /var/lib/haproxy
    stats socket /run/haproxy/admin.sock mode 660 level admin expose-fd listeners
    stats timeout 30s
    user haproxy
    group haproxy
    daemon

defaults
    log     global
    mode    http
    option  httplog
    option  dontlognull
    timeout connect 5000
    timeout client  50000
    timeout server  50000
    errorfile 400 /etc/haproxy/errors/400.http
    errorfile 403 /etc/haproxy/errors/403.http
    errorfile 408 /etc/haproxy/errors/408.http
    errorfile 500 /etc/haproxy/errors/500.http
    errorfile 502 /etc/haproxy/errors/502.http
    errorfile 503 /etc/haproxy/errors/503.http
    errorfile 504 /etc/haproxy/errors/504.http

frontend http_frontend
    bind *:80
    mode http
    # Redirect all HTTP traffic to HTTPS
    redirect scheme https if !{ ssl_fc }

frontend https_frontend
    bind *:443 ssl crt /etc/ssl/private/your_domain.com.pem # Combine your cert and key here
    mode http
    # Use stick-tables for session persistence if needed (e.g., for shopping carts)
    # stick-table type ip size 100k expire 30m store gpc0
    # stick on src

    # Guard: nbsrv() counts the servers currently passing health checks
    acl is_healthy nbsrv(nginx_backend) gt 0
    http-request deny if !is_healthy # Optional: deny (403) if all backends are down

    default_backend nginx_backend

backend nginx_backend
    mode http
    balance roundrobin
    option httpchk GET / HTTP/1.1\r\nHost:\ your_domain.com # Basic HTTP health check
    http-check expect status 200 # Expect a 200 OK from health check

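    # Replace with actual IPs of your Nginx servers.
    # Connect on 443 over TLS: the Nginx port 80 server block only issues 301
    # redirects, which would fail the health check and loop every request.
    # "verify none" skips certificate validation; use it only on a private network.
    server nginx1 192.168.1.20:443 ssl verify none check
    server nginx2 192.168.1.21:443 ssl verify none check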
    # Add more Nginx servers if you scale horizontally

Explanation:

frontend http_frontend and frontend https_frontend: Define listeners for HTTP and HTTPS traffic. The HTTP frontend redirects to HTTPS.
bind *:443 ssl crt ...: Configures SSL termination on HAProxy. You’ll need to combine your certificate and private key into a single file for HAProxy.
backend nginx_backend: Defines the pool of Nginx servers.
balance roundrobin: Distributes traffic evenly.
option httpchk: Configures HAProxy to periodically send an HTTP request to the backend servers to check their health.
server ... check: Marks the backend servers and enables health checking.
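The combined certificate file mentioned for the `bind` line can be produced with a few lines of Ruby. This is a minimal sketch, assuming the certificate chain and key are the Let's Encrypt files referenced in the Nginx configuration; the script name `combine_pem.rb` and the method name are hypothetical:

```ruby
# Build the single PEM file HAProxy expects: certificate chain first,
# then the private key. A newly created file gets mode 0600 because it
# contains the key; an existing file keeps its current permissions.
def combine_pem(fullchain_path, privkey_path, output_path)
  chain = File.read(fullchain_path)
  chain += "\n" unless chain.end_with?("\n")
  File.write(output_path, chain + File.read(privkey_path), perm: 0o600)
end

# Usage: ruby combine_pem.rb fullchain.pem privkey.pem your_domain.com.pem
combine_pem(*ARGV) if ARGV.size == 3
```

The file has to be rebuilt and HAProxy reloaded whenever the certificate is renewed, so a script like this would typically run from the renewal process.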

Commands:

sudo systemctl enable haproxy
sudo systemctl start haproxy
sudo systemctl status haproxy

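Cost Optimization: A small HAProxy instance is usually sufficient for load balancing and can be cheaper than a managed load balancer. Note that one instance is itself a single point of failure: for full redundancy, run a second HAProxy node as a standby with a failover IP (for example, using keepalived), or use the managed option. The health checks ensure that traffic is only sent to healthy Nginx instances, maintaining availability.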
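The health check in the HAProxy configuration requests `/`, which renders the application's home page on every probe. A lighter alternative is a dedicated endpoint that returns 200 without database or template work. The following is a minimal sketch of such an endpoint as a plain Rack app; the `/up` path and the `HEALTH_APP` name are assumptions for illustration (recent Rails versions generate a similar `/up` route by default):

```ruby
require 'json'

# Minimal Rack-compatible health endpoint. A Rack app is any object that
# responds to #call(env) and returns [status, headers, body].
# The /up path and the JSON payload are illustrative choices.
HEALTH_APP = lambda do |env|
  if env['PATH_INFO'] == '/up'
    [200, { 'content-type' => 'application/json' }, [JSON.generate(status: 'ok')]]
  else
    [404, { 'content-type' => 'text/plain' }, ["Not Found\n"]]
  end
end

status, _headers, body = HEALTH_APP.call('PATH_INFO' => '/up')
puts "#{status} #{body.join}"
```

With an endpoint like this in place, the HAProxy check would request that path instead of `/`.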

Deployment and Zero-Downtime Strategy

Achieving zero-downtime deployments is critical. The architecture supports this:

Deploy to one App Server at a time:

Update the code on one application server.
Run database migrations once per release, from this first server only (carefully, ensuring backward compatibility).
Restart Puma on that server.
Monitor logs and application health.
Once confirmed healthy, repeat the process for the second application server.

Load Balancer Health Checks: HAProxy will automatically stop sending traffic to an Nginx server that fails its health checks during the deployment process. Nginx, in turn, will temporarily stop sending traffic to a Puma instance after repeated failed connection attempts.
Database Migrations: Always ensure your migrations are backward-compatible. This means new code can run with the old database schema, and old code can run with the new schema. This allows you to deploy code changes incrementally.
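The one-server-at-a-time procedure above can be sketched as a small orchestration loop. This is an illustration under stated assumptions, not a deployment tool: `deploy` and `healthy` are hypothetical callables standing in for whatever updates the code and probes the server, and the IPs are the placeholders from the Nginx configuration.

```ruby
# Rolling deploy sketch: update one server at a time and stop at the first
# server that fails its health check, leaving the remaining servers untouched.
# `deploy` and `healthy` are injected callables (hypothetical placeholders),
# so the flow can be exercised without touching real machines.
def rolling_deploy(servers, deploy:, healthy:, attempts: 5, wait: 0)
  servers.each_with_object([]) do |server, done|
    deploy.call(server)
    ok = attempts.times.any? do
      next true if healthy.call(server)
      sleep(wait)
      false
    end
    raise "#{server} failed health check after deploy; halting rollout" unless ok
    done << server
  end
end

updated = rolling_deploy(
  %w[192.168.1.10 192.168.1.11],
  deploy:  ->(host) { puts "deploying to #{host}" },
  healthy: ->(_host) { true }
)
puts "updated: #{updated.join(', ')}"
```

Halting on the first failure keeps at least one server on the previous release, which is what allows the remaining instance to keep serving traffic.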

Monitoring and Alerting

Robust monitoring is key to maintaining high availability and identifying cost-saving opportunities. Key metrics to track include:

CPU and Memory Usage: On all servers (HAProxy, Nginx, Puma, Database).
Network Traffic: Especially on the HAProxy and Nginx instances.
Request Latency: Tracked at the Nginx and HAProxy levels.
Error Rates: Monitor Nginx error logs and application logs for exceptions.
Database Performance: Query times, connection counts, and disk I/O.

Tools like Prometheus with Grafana, or Linode’s built-in monitoring, can be leveraged. Set up alerts for critical thresholds (e.g., high CPU, elevated error rates, unresponsive servers).
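As a concrete example of an error-rate alert, the following sketch computes the share of 5xx responses from Nginx access-log lines. It assumes the default "combined" log format; the 5% threshold, the sample lines, and the method name are illustrative assumptions rather than recommendations.

```ruby
# Compute the share of 5xx responses in Nginx access-log lines (default
# "combined" format) and flag when it crosses a threshold.
# The status code is the three-digit field right after the quoted request.
STATUS_PATTERN = /" (\d{3}) /

def error_rate(lines)
  statuses = lines.filter_map { |line| line[STATUS_PATTERN, 1]&.to_i }
  return 0.0 if statuses.empty?

  statuses.count { |status| status >= 500 }.fdiv(statuses.size)
end

sample = [
  '203.0.113.5 - - [01/Jan/2024:00:00:01 +0000] "GET / HTTP/1.1" 200 512 "-" "curl/8.0"',
  '203.0.113.5 - - [01/Jan/2024:00:00:02 +0000] "GET /orders HTTP/1.1" 502 157 "-" "curl/8.0"',
  '203.0.113.6 - - [01/Jan/2024:00:00:03 +0000] "GET /assets/app.css HTTP/1.1" 304 0 "-" "curl/8.0"',
  '203.0.113.7 - - [01/Jan/2024:00:00:04 +0000] "POST /login HTTP/1.1" 302 0 "-" "curl/8.0"'
]

rate = error_rate(sample)
puts format('5xx rate: %.1f%%', rate * 100)
puts 'ALERT: error rate above threshold' if rate > 0.05
```

In practice the same calculation would usually be expressed as a query in the monitoring system; the script only shows what the alert measures.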

Cost Optimization Summary

This architecture optimizes costs by:

Horizontal Scaling: Using multiple smaller, cheaper Linode instances instead of a few large ones.
Managed Database: Offloading database administration to Linode’s managed service, reducing operational overhead.
Efficient Web Serving: Nginx serving static assets directly, reducing load on application servers.
Open-Source Tooling: Relying on battle-tested open-source software like Nginx, Puma, and HAProxy.
Predictable Pricing: Linode’s transparent pricing model allows for accurate budget forecasting.

By carefully selecting instance sizes, leveraging redundancy, and implementing efficient configurations, you can build a highly available and cost-effective Ruby stack on Linode that scales with your application’s needs.
