Building a High-Availability, Cost-Optimized Ruby Stack on Linode
Leveraging Linode for a Resilient and Economical Ruby Deployment
For CTOs and VPs of Engineering tasked with balancing performance, availability, and budget, deploying a Ruby application stack on cloud infrastructure demands a strategic approach. Linode, with its transparent pricing and robust feature set, offers a compelling platform. This post details a production-ready, high-availability configuration for a Ruby on Rails application, emphasizing cost optimization through intelligent resource allocation and open-source tooling.
Core Stack Components and Architecture
Our target architecture prioritizes redundancy and scalability while minimizing unnecessary overhead. We’ll employ a multi-server setup:
- Web Servers (Nginx): Two instances acting as reverse proxies and serving static assets.
- Application Servers (Puma): Two instances, each running multiple worker processes to handle Ruby code execution.
- Database Server (PostgreSQL): A dedicated, managed instance for data persistence.
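- Load Balancer (HAProxy): A single small instance distributing traffic across the web servers. On its own it is a single point of failure, so pair it with a standby if the load balancer must also survive an outage.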
This tiered approach allows for independent scaling of components and provides failover capabilities at critical layers.
Database Layer: PostgreSQL on Linode Managed Databases
For production Ruby applications, a robust relational database is non-negotiable. Linode’s Managed Databases for PostgreSQL offer a significant advantage in terms of operational overhead and cost-effectiveness compared to self-managing a cluster. They provide automated backups, point-in-time recovery, and high availability without requiring deep PostgreSQL expertise.
Cost Optimization Tip: Select a database instance size that closely matches your current workload. Monitor performance metrics and scale up only when necessary. Linode’s pricing is predictable, making budget forecasting easier.
Application Servers: Puma Cluster Mode on Separate Instances
We’ll deploy two separate Linode instances for our Ruby application servers. Each instance will run Puma in cluster mode, allowing it to manage multiple worker processes and threads efficiently. This provides both load distribution within a single application server and failover if one instance becomes unavailable.
Puma Configuration (`config/puma.rb`)
A typical `config/puma.rb` for this setup would look like this:
# config/puma.rb
require 'dotenv/load'
# Set the environment
environment ENV.fetch('RAILS_ENV') { 'production' }
# Number of workers to spawn.
# A common starting point is one worker per CPU core; 2 suits a 2-core
# instance. Raise WEB_CONCURRENCY on larger instances if memory allows.
workers ENV.fetch('WEB_CONCURRENCY') { 2 }.to_i
# Number of threads per worker.
# Adjust based on your application's I/O bound vs CPU bound nature.
threads_count = ENV.fetch('RAILS_MAX_THREADS') { 5 }.to_i
threads threads_count, threads_count
# Bind to a TCP socket for communication with Nginx.
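# Use a Unix socket if Nginx and Puma are on the same host, but for
# separate instances, TCP is necessary. Restrict this port to the Nginx
# servers with a firewall, or bind to the instance's private IP instead.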
bind "tcp://0.0.0.0:#{ENV.fetch('PORT') { 3000 }}"
# Set the path to the PID file.
pidfile ENV.fetch('PIDFILE') { 'tmp/pids/server.pid' }
# Redirect stdout and stderr to the log file (third argument: append).
log_path = ENV.fetch('LOGFILE') { 'log/puma.log' }
stdout_redirect log_path, log_path, true
# Load the application before forking workers. This shortens worker boot
# and lets workers share memory via copy-on-write. Note that preload_app!
# cannot be combined with Puma's phased restart.
preload_app!
# Allow Puma to be restarted by `rails restart` command.
plugin :tmp_restart
# Callbacks for deployment hooks
on_worker_boot do
# Worker specific setup for Rails.
ActiveRecord::Base.establish_connection if defined?(ActiveRecord::Base)
# e.g., Redis.current = Redis.new(...)
end
# rack-timeout is Rack middleware, not a Puma plugin. Configure it in the
# Rails application (e.g., an initializer) rather than here.
Explanation:
- `workers`: Set to 2 per application server instance. This means each Linode will run 2 Puma worker processes.
- `threads`: Configured to allow for concurrent requests within each worker.
- `bind`: Crucially, this binds to a TCP port. This allows Nginx on separate servers to communicate with Puma.
- `preload_app!`: Loads the application code before forking worker processes, which shortens worker boot and lets workers share memory through copy-on-write. It cannot be combined with Puma's phased restart, so zero-downtime deployments in this setup rely on restarting one application server at a time.
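Worker and thread counts also determine how many database connections the stack can open, which matters when choosing a managed database size. The following is a rough sizing sketch, not part of the Puma configuration: `ConnectionBudget` is a hypothetical helper, and the `max_connections` and `reserved` figures are assumptions to replace with the values of your own database plan.

```ruby
# Hypothetical sizing helper; the counts mirror the defaults used in this post.
ConnectionBudget = Struct.new(:app_servers, :workers, :threads, keyword_init: true) do
  # Each Puma thread may hold one ActiveRecord connection, so a worker's
  # connection pool should be at least as large as its thread count.
  def pool_per_worker
    threads
  end

  # Connections one application server can open at full load.
  def per_server
    workers * pool_per_worker
  end

  # Connections the whole application tier can open at full load.
  def total
    app_servers * per_server
  end

  # `reserved` leaves headroom for migrations, consoles, and admin tools.
  def fits?(max_connections, reserved: 5)
    total <= max_connections - reserved
  end
end

budget = ConnectionBudget.new(app_servers: 2, workers: 2, threads: 5)
puts "per server: #{budget.per_server}"
puts "total:      #{budget.total}"
puts "fits in 25: #{budget.fits?(25)}"
```

With two servers, two workers, and five threads, the application tier can open up to 20 connections. The `pool` value in `config/database.yml` should be at least `RAILS_MAX_THREADS`, and the budget should be rechecked whenever an application server is added.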
Systemd Service for Puma
To ensure Puma starts on boot and can be managed easily, we’ll use systemd. Create a service file (e.g., `/etc/systemd/system/myapp.service`):
[Unit]
Description=My App Puma Server
After=network.target
[Service]
Type=simple
User=deploy
Group=deploy
WorkingDirectory=/home/deploy/my_app
Environment="RAILS_ENV=production"
Environment="PORT=3000"
Environment="PIDFILE=/home/deploy/my_app/tmp/pids/server.pid"
Environment="LOGFILE=/home/deploy/my_app/log/puma.log"
# Adjust WEB_CONCURRENCY and RAILS_MAX_THREADS based on Linode instance size
Environment="WEB_CONCURRENCY=2"
Environment="RAILS_MAX_THREADS=5"
ExecStart=/usr/local/bin/bundle exec puma -C /home/deploy/my_app/config/puma.rb
ExecStop=/bin/kill -s TERM $MAINPID
Restart=on-failure
[Install]
WantedBy=multi-user.target
Commands:
sudo systemctl daemon-reload
sudo systemctl enable myapp.service
sudo systemctl start myapp.service
sudo systemctl status myapp.service
Cost Optimization: By running two application servers, we gain redundancy. If one fails, the other continues to serve traffic. We can also choose smaller, less expensive Linode instances for these servers and scale horizontally by adding more instances if needed, rather than paying for a single, massive, and potentially underutilized instance.
Web Servers: Nginx as a Reverse Proxy
Two Nginx instances will sit in front of the Puma application servers. They will handle SSL termination, serve static assets directly (reducing load on Puma), and proxy dynamic requests to the appropriate Puma instance.
Nginx Configuration
Create a configuration file for your application (e.g., `/etc/nginx/sites-available/myapp`):
upstream puma_backend {
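# Puma application servers. Open-source Nginx checks them passively: a
# server that fails to respond is skipped temporarily (max_fails / fail_timeout).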
server 192.168.1.10:3000; # Replace with actual IP of App Server 1
server 192.168.1.11:3000; # Replace with actual IP of App Server 2
# Add more app servers if you scale horizontally
}
server {
listen 80;
server_name your_domain.com www.your_domain.com;
# Redirect HTTP to HTTPS
location / {
return 301 https://$host$request_uri;
}
}
server {
listen 443 ssl http2;
server_name your_domain.com www.your_domain.com;
# SSL Configuration (using Let's Encrypt/Certbot)
ssl_certificate /etc/letsencrypt/live/your_domain.com/fullchain.pem;
ssl_certificate_key /etc/letsencrypt/live/your_domain.com/privkey.pem;
include /etc/letsencrypt/options-ssl-nginx.conf;
ssl_dhparam /etc/letsencrypt/ssl-dhparams.pem;
# Serve static assets directly
root /home/deploy/my_app/public;
location ~ ^/(assets|packs)/ {
gzip_static on;
expires max;
add_header Cache-Control public;
}
# Proxy dynamic requests to Puma
location / {
proxy_pass http://puma_backend;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;
proxy_read_timeout 300s; # Increase timeout for long-running requests
proxy_connect_timeout 75s;
}
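# Optional: WebSocket support (e.g., Action Cable). Add these lines inside
# the proxying "location /" block above; proxy_set_header is not allowed
# inside an "if" block.
# proxy_http_version 1.1;
# proxy_set_header Upgrade $http_upgrade;
# proxy_set_header Connection "upgrade";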
access_log /var/log/nginx/myapp.access.log;
error_log /var/log/nginx/myapp.error.log;
}
Explanation:
- `upstream puma_backend`: Defines a group of backend servers (your Puma instances). Nginx will round-robin requests to these servers.
- `proxy_pass http://puma_backend;`: Forwards requests to the upstream group.
- `proxy_set_header`: Passes important client information to the backend application.
- `ssl_certificate` and `ssl_certificate_key`: Paths to your SSL certificates. Certbot is recommended for automated Let’s Encrypt certificate management.
- `root` and `location ~ ^/(assets|packs)/`: Directs Nginx to serve static files from the application’s public directory, bypassing Puma.
Commands:
sudo ln -s /etc/nginx/sites-available/myapp /etc/nginx/sites-enabled/
sudo nginx -t # Test configuration
sudo systemctl restart nginx
Load Balancer: HAProxy for High Availability
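To spread traffic across the web tier and route around a failed web server, we need a load balancer. While Linode offers managed load balancers, deploying HAProxy on a dedicated, small Linode instance provides a cost-effective and highly configurable solution. This HAProxy instance will distribute traffic to the two Nginx web servers. Keep in mind that a lone HAProxy instance is itself a single point of failure; add a standby instance if the load balancer tier must tolerate an outage too.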
HAProxy Configuration (`/etc/haproxy/haproxy.cfg`)
global
log /dev/log local0
log /dev/log local1 notice
chroot /var/lib/haproxy
stats socket /run/haproxy/admin.sock mode 660 level admin expose-fd listeners
stats timeout 30s
user haproxy
group haproxy
daemon
defaults
log global
mode http
option httplog
option dontlognull
timeout connect 5000
timeout client 50000
timeout server 50000
errorfile 400 /etc/haproxy/errors/400.http
errorfile 403 /etc/haproxy/errors/403.http
errorfile 408 /etc/haproxy/errors/408.http
errorfile 500 /etc/haproxy/errors/500.http
errorfile 502 /etc/haproxy/errors/502.http
errorfile 503 /etc/haproxy/errors/503.http
errorfile 504 /etc/haproxy/errors/504.http
frontend http_frontend
bind *:80
mode http
# Redirect all HTTP traffic to HTTPS
redirect scheme https if !{ ssl_fc }
frontend https_frontend
bind *:443 ssl crt /etc/ssl/private/your_domain.com.pem # Combine your cert and key here
mode http
# Use stick-tables for session persistence if needed (e.g., for shopping carts)
# stick-table type ip size 100k expire 30m store gpc0
# stick on src
# Guard: true while at least one Nginx server is passing its health check
acl is_healthy nbsrv(nginx_backend) gt 0
http-request deny if !is_healthy # Optional: deny if all backends are down
default_backend nginx_backend
backend nginx_backend
mode http
balance roundrobin
option httpchk GET / HTTP/1.1\r\nHost:\ your_domain.com # Basic HTTP health check
http-check expect status 200 # Expect a 200 OK from health check
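# Replace with actual IPs of your Nginx servers. Connect on 443 over TLS:
# the Nginx port 80 server only issues redirects, which would fail the 200
# check and cause a redirect loop behind the HTTPS frontend. "verify none"
# skips certificate validation; use ca-file with "verify required" if needed.
server nginx1 192.168.1.20:443 check ssl verify none
server nginx2 192.168.1.21:443 check ssl verify none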
# Add more Nginx servers if you scale horizontally
Explanation:
- `frontend http_frontend` and `frontend https_frontend`: Define listeners for HTTP and HTTPS traffic. The HTTP frontend redirects to HTTPS.
- `bind *:443 ssl crt ...`: Configures SSL termination on HAProxy. You’ll need to combine your certificate and private key into a single file for HAProxy.
- `backend nginx_backend`: Defines the pool of Nginx servers.
- `balance roundrobin`: Distributes traffic evenly.
- `option httpchk`: Configures HAProxy to periodically send an HTTP request to the backend servers to check their health.
- `server ... check`: Marks the backend servers and enables health checking.
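Checking `/` exercises the full page render on every probe. A dedicated, cheap endpoint is a common alternative. The following is a minimal sketch of such an endpoint as Rack middleware; the class name `HealthCheck`, the `/healthz` path, and the `check` callable are illustrative assumptions, not part of the configuration above.

```ruby
# Minimal Rack middleware answering a dedicated health-check path.
# HEALTH_PATH and the dependency check are illustrative assumptions.
class HealthCheck
  HEALTH_PATH = '/healthz'

  # `check` is any callable that returns true when dependencies are reachable.
  def initialize(app, check: -> { true })
    @app = app
    @check = check
  end

  def call(env)
    # Everything except the health path passes through to the application.
    return @app.call(env) unless env['PATH_INFO'] == HEALTH_PATH

    # A raising check (e.g., database unreachable) counts as unhealthy.
    healthy = begin
      @check.call
    rescue StandardError
      false
    end

    status = healthy ? 200 : 503
    body = healthy ? 'ok' : 'unavailable'
    [status, { 'content-type' => 'text/plain' }, [body]]
  end
end
```

In a Rails application this could be registered with `config.middleware.insert_before 0, HealthCheck`, and the `option httpchk` request would then target `/healthz` instead of `/`. Recent Rails versions also generate a built-in `/up` route that may serve the same purpose.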
Commands:
sudo systemctl enable haproxy
sudo systemctl start haproxy
sudo systemctl status haproxy
Cost Optimization: A single, small HAProxy instance handles load balancing for many workloads and can cost less than a managed load balancer, at the price of operating it yourself and accepting it as a single point of failure unless you add a standby. The health checks ensure that traffic is only sent to healthy Nginx instances, maintaining availability of the web tier.
Deployment and Zero-Downtime Strategy
Achieving zero-downtime deployments is critical. The architecture supports this:
- Run database migrations once, before updating any application server (carefully, ensuring backward compatibility).
- Deploy to one App Server at a time:
  - Update the code on one application server.
  - Restart Puma on that server.
  - Monitor logs and application health.
  - Once confirmed healthy, repeat the process for the second application server.
- Load Balancer Health Checks: HAProxy will automatically stop sending traffic to an unhealthy Nginx server during the deployment process. Nginx, in turn, will stop sending traffic to an unhealthy Puma instance.
- Database Migrations: Always ensure your migrations are backward-compatible. This means new code can run with the old database schema, and old code can run with the new schema. This allows you to deploy code changes incrementally.
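The rollout order above can be sketched as a small script. This is an illustration of the control flow only, under stated assumptions: `RollingDeploy` is a hypothetical class, and the `deploy` and `healthy` callables stand in for whatever your tooling does to update a server and probe it (SSH commands, an HTTP check, and so on).

```ruby
# Rolling deploy sketch. `deploy` and `healthy` are injected callables so the
# control flow can be exercised without touching real servers.
class RollingDeploy
  DeployFailed = Class.new(StandardError)

  def initialize(servers, deploy:, healthy:, attempts: 5, wait: 2)
    @servers = servers
    @deploy = deploy
    @healthy = healthy
    @attempts = attempts
    @wait = wait
  end

  # Updates servers one at a time and returns the list of updated servers.
  # Raises on the first server that never becomes healthy, leaving the
  # remaining servers on the old release so they keep serving traffic.
  def run
    @servers.each_with_object([]) do |server, done|
      @deploy.call(server)
      unless wait_until_healthy(server)
        raise DeployFailed, "#{server} failed its health check; halting rollout"
      end
      done << server
    end
  end

  private

  # Polls the health probe up to @attempts times, pausing between tries.
  def wait_until_healthy(server)
    @attempts.times do |i|
      return true if @healthy.call(server)
      sleep(@wait) if i < @attempts - 1
    end
    false
  end
end
```

A caller might pass the two application server addresses, a `deploy` lambda that updates the code and restarts the `myapp` service, and a `healthy` lambda that requests a page from port 3000. Halting on the first failure is the key design choice: one server always stays on a known-good release.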
Monitoring and Alerting
Robust monitoring is key to maintaining high availability and identifying cost-saving opportunities. Key metrics to track include:
- CPU and Memory Usage: On all servers (HAProxy, Nginx, Puma, Database).
- Network Traffic: Especially on the HAProxy and Nginx instances.
- Request Latency: Tracked at the Nginx and HAProxy levels.
- Error Rates: Monitor Nginx error logs and application logs for exceptions.
- Database Performance: Query times, connection counts, and disk I/O.
Tools like Prometheus with Grafana, or Linode’s built-in monitoring, can be leveraged. Set up alerts for critical thresholds (e.g., high CPU, elevated error rates, unresponsive servers).
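As a lightweight illustration of the error-rate metric, the sketch below computes the share of 5xx responses from Nginx access log lines. It assumes Nginx's default "combined" log format; the `error_rate` helper, the sample lines, and the 5% `ALERT_THRESHOLD` are hypothetical and would normally be replaced by a proper metrics pipeline.

```ruby
# Matches the status field in Nginx's default "combined" log format:
#   ... "GET /path HTTP/1.1" 200 1234 "referer" "user-agent"
STATUS_PATTERN = /" (\d{3}) /
ALERT_THRESHOLD = 0.05 # assumed threshold: alert above 5% server errors

# Returns the fraction of parsed lines whose status is 5xx (0.0 if none parse).
def error_rate(lines)
  statuses = lines.filter_map { |line| line[STATUS_PATTERN, 1]&.to_i }
  return 0.0 if statuses.empty?

  statuses.count { |status| status >= 500 }.fdiv(statuses.size)
end

sample = [
  '203.0.113.5 - - [10/Oct/2024:13:55:36 +0000] "GET / HTTP/1.1" 200 612 "-" "curl/8.0"',
  '203.0.113.5 - - [10/Oct/2024:13:55:37 +0000] "GET /orders HTTP/1.1" 502 157 "-" "curl/8.0"',
  '203.0.113.9 - - [10/Oct/2024:13:55:38 +0000] "GET /assets/app.css HTTP/1.1" 200 4096 "-" "curl/8.0"',
  '203.0.113.9 - - [10/Oct/2024:13:55:39 +0000] "POST /login HTTP/1.1" 302 0 "-" "curl/8.0"'
]

rate = error_rate(sample)
puts format('5xx rate: %.1f%%', rate * 100)
puts 'ALERT: elevated error rate' if rate > ALERT_THRESHOLD
```

Run against a recent slice of `/var/log/nginx/myapp.access.log`, a check like this can feed a simple cron-based alert until Prometheus or a comparable system is in place.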
Cost Optimization Summary
This architecture optimizes costs by:
- Horizontal Scaling: Using multiple smaller, cheaper Linode instances instead of a few large ones.
- Managed Database: Offloading database administration to Linode’s managed service, reducing operational overhead.
- Efficient Web Serving: Nginx serving static assets directly, reducing load on application servers.
- Open-Source Tooling: Relying on battle-tested open-source software like Nginx, Puma, and HAProxy.
- Predictable Pricing: Linode’s transparent pricing model allows for accurate budget forecasting.
By carefully selecting instance sizes, leveraging redundancy, and implementing efficient configurations, you can build a highly available and cost-effective Ruby stack on Linode that scales with your application’s needs.