Building a High-Availability, Cost-Optimized Ruby Stack on OVH
Strategic OVH Instance Selection for Ruby Workloads
When architecting a high-availability, cost-optimized Ruby stack on OVH, the initial instance selection is paramount. OVH’s Public Cloud offerings provide a spectrum of instance types, each with distinct performance characteristics and pricing models. For Ruby applications, particularly those leveraging the Rails framework, I/O performance and predictable CPU allocation are critical. We’ll focus on the “General Purpose” instances, specifically the GRA1-SSD-XXX series, which offer a balance of compute, memory, and NVMe SSD storage at competitive price points. Avoid “High Performance” instances unless profiling definitively proves a bottleneck in raw CPU clock speed or specific instruction sets, which is rare for typical web applications. The “Storage Optimized” instances are generally overkill and prohibitively expensive for a standard Ruby web tier.
The key to cost optimization here is right-sizing. OVH’s billing is typically hourly, making it crucial to select an instance that meets peak demand without significant over-provisioning. For a typical medium-traffic Rails application, starting with a 2-vCPU, 4GB RAM instance (e.g., GRA1-SSD-2) is a sensible baseline. Monitor resource utilization closely (CPU, memory, I/O wait) and scale horizontally or vertically based on empirical data, not assumptions. OVH’s “Public Cloud Instances” page provides detailed specifications and pricing, which should be your primary reference.
Setting Up a Highly Available PostgreSQL Cluster with Patroni
A robust PostgreSQL cluster is the backbone of most Ruby applications. For high availability, we’ll deploy Patroni, a template for PostgreSQL HA. This setup uses three PostgreSQL nodes (one primary and two replicas) and a distributed configuration store (DCS), typically etcd or Consul, which holds the leader lock. Quorum is a property of the DCS, so it is the etcd or Consul cluster that needs an odd number of members, usually three. OVH’s “Bare Metal” servers are often more cost-effective for dedicated database nodes than Public Cloud instances, especially for sustained workloads. However, for simplicity and integration with the Public Cloud ecosystem, we’ll use Public Cloud instances for this example, focusing on the GRA1-SSD-XXX series again, but potentially with more RAM and faster SSDs for database roles.
Prerequisites:
- Three OVH Public Cloud instances (e.g., GRA1-SSD-4 or GRA1-SSD-8)
- A separate etcd cluster (can be deployed on dedicated instances or even as a managed service if available and cost-effective)
- SSH access to all instances
- Root or sudo privileges
1. Install PostgreSQL and Patroni on each database node:
We’ll use `apt` for Debian/Ubuntu. Ensure your PostgreSQL version is compatible with your Ruby application’s ORM (e.g., ActiveRecord). It’s generally recommended to use a recent, stable PostgreSQL version.
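sudo apt update
# python3-psycopg2 is the PostgreSQL driver Patroni needs; "patroni[etcd]" does not pull it in
sudo apt install -y postgresql postgresql-contrib python3-pip python3-venv python3-psycopg2
# On releases that enforce PEP 668, run the pip commands inside a virtualenv instead
sudo pip3 install --upgrade pip
sudo pip3 install "patroni[etcd]"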
2. Configure etcd:
Ensure your etcd cluster is accessible from the PostgreSQL nodes. For this example, we assume etcd is running on etcd-01.example.com:2379, etcd-02.example.com:2379, and etcd-03.example.com:2379.
3. Create Patroni configuration files:
On each PostgreSQL node, create a patroni.yml file. This configuration specifies the PostgreSQL data directory, the etcd endpoints, and the PostgreSQL superuser credentials.
# /etc/patroni/patroni.yml
scope: my_ruby_app_db
namespace: /service/
name: pg-node-1 # Placeholder; must be unique on each node
etcd:
  # "hosts" (plural) is the key for a multi-member etcd cluster
  hosts: etcd-01.example.com:2379,etcd-02.example.com:2379,etcd-03.example.com:2379
  protocol: http
postgresql:
  listen: 0.0.0.0:5432
  connect_address: <node_private_ip>:5432 # Address other nodes use to reach this node
  data_dir: /var/lib/postgresql/14/main # Adjust version as needed
  bin_dir: /usr/lib/postgresql/14/bin # Adjust version as needed
  config_dir: /etc/postgresql/14/main # Adjust version as needed
  pg_hba:
    - host replication replicator 0.0.0.0/0 md5 # Required for streaming replication
    - host all all 0.0.0.0/0 md5
  parameters:
    max_connections: 200
    shared_buffers: 1GB # Adjust based on instance RAM
    effective_cache_size: 3GB # Adjust based on instance RAM
    maintenance_work_mem: 256MB
    wal_level: replica
    wal_sync_method: fsync
    wal_writer_delay: 200ms
    checkpoint_completion_target: 0.9
    checkpoint_timeout: 5min
    max_wal_size: 1GB
    min_wal_size: 128MB
    random_page_cost: 1.1
    effective_io_concurrency: 200 # For SSDs
    work_mem: 16MB
    log_destination: stderr
    logging_collector: on
    log_directory: pg_log
    log_filename: postgresql-%Y-%m-%d_%H%M%S.log
    log_line_prefix: '%t [%p]: '
    log_min_duration_statement: 250ms
    log_checkpoints: on
    log_connections: on
    log_disconnections: on
    log_lock_waits: on
    log_temp_files: 0
    log_autovacuum_min_duration: 0
    autovacuum: on
    autovacuum_max_workers: 3
    autovacuum_naptime: 1min
    autovacuum_vacuum_threshold: 50
    autovacuum_analyze_threshold: 50
    vacuum_cost_delay: 10ms
    vacuum_cost_page_hit: 1
    vacuum_cost_page_miss: 10
    vacuum_cost_page_dirty: 20
    vacuum_cost_limit: 1000
  authentication:
    superuser:
      username: postgres
      # Placeholder; on an existing cluster the postgres role must already have this password
      password: your_superuser_password
    replication:
      username: replicator
      password: your_replication_password
  create_replica_methods:
    - basebackup # Built-in method that clones a replica with pg_basebackup
restapi:
  listen: 0.0.0.0:8008
  connect_address: <node_private_ip>:8008 # Patroni does not expand templates here
bootstrap:
  dcs:
    maximum_lag_on_failover: 1048576 # 1MB
Important Notes on Configuration:
- Replace `my_ruby_app_db` with a unique scope name for your cluster.
- Adjust `namespace` if you have other services using etcd.
- Ensure `data_dir`, `bin_dir`, and `config_dir` match your PostgreSQL installation.
- Set a strong `your_replication_password`.
- Tune PostgreSQL `parameters` based on your instance’s RAM and expected load. The provided values are a starting point.
- The `effective_io_concurrency` should be set to a value appropriate for your NVMe SSDs (often 200 or higher).
- `pg_hba` allows all hosts to connect; in production, restrict this to your application server IPs.
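As a rough illustration of the note on tuning `parameters`, the sketch below derives `shared_buffers` and `effective_cache_size` from instance RAM. The 25% and 75% ratios are a widely used rule of thumb and an assumption here, not something Patroni or OVH prescribes, and `pg_memory_settings` is a hypothetical helper name.

```ruby
# Rule-of-thumb PostgreSQL memory settings derived from instance RAM.
# The ratios are starting points to be validated against real workload metrics.
def pg_memory_settings(ram_mb)
  raise ArgumentError, "ram_mb must be positive" unless ram_mb.positive?

  {
    shared_buffers: "#{ram_mb / 4}MB",          # roughly 25% of RAM
    effective_cache_size: "#{ram_mb * 3 / 4}MB" # roughly 75% of RAM
  }
end

# A 4 GB instance yields the same values used in the configuration above
# (1024MB and 3072MB, i.e. 1GB and 3GB).
pg_memory_settings(4096).each { |name, value| puts "#{name}: #{value}" }
```

Dedicated database nodes can usually follow these ratios more closely than nodes that share RAM with other services.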
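4. Create the replication user:
On the node where you will start Patroni first (it becomes the initial primary, and replicas copy its roles when they are cloned), run the following commands as the postgres user. A replication role needs only the REPLICATION attribute, so no database or CREATEDB privilege is required. The second command gives the postgres superuser a password (a placeholder here), which the md5 rule in pg_hba requires for TCP connections such as the check in step 6.
sudo -u postgres psql -c "CREATE USER replicator WITH REPLICATION PASSWORD 'your_replication_password';"
sudo -u postgres psql -c "ALTER USER postgres PASSWORD 'your_superuser_password';"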
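5. Initialize Patroni:
Patroni must be the only process that starts and stops PostgreSQL, so stop and disable the distribution’s postgresql service on every node before continuing. On the nodes that will become replicas, also empty the data directory: the cluster created by the package has a different system identifier, and Patroni will not join such a node to the cluster. Then start Patroni on each node, beginning with the node that holds the replicator role. PostgreSQL refuses to run as root, so run Patroni as the postgres user. The first node to start takes the leader lock in etcd and becomes the primary. Subsequent nodes join as replicas.
sudo systemctl disable --now postgresql
sudo -u postgres patroni /etc/patroni/patroni.yml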
6. Verify Cluster Status:
You can check the status via the Patroni REST API or by connecting to PostgreSQL.
# Via the Patroni REST API (on any node)
curl http://localhost:8008/patroni

# List cluster members and their roles to find the current primary
patronictl -c /etc/patroni/patroni.yml list

# Via psql (connect to the current primary)
sudo -u postgres psql -h <primary_ip> -c "SELECT pg_is_in_recovery();"
The output of pg_is_in_recovery() should be f for the primary and t for replicas.
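The REST API check can also be scripted from Ruby. The following is a minimal sketch, assuming the `/patroni` endpoint returns a JSON object with `role` and `state` keys and that the leader role is reported as either `master` (older Patroni releases) or `primary` (newer ones). The helper names are hypothetical.

```ruby
require "json"
require "net/http"
require "uri"

# Role names treated as "this node accepts writes" (assumption: both spellings occur).
LEADER_ROLES = %w[master primary].freeze

# Summarizes the JSON body returned by GET /patroni.
def summarize_patroni_status(json_body)
  status = JSON.parse(json_body)
  role = status["role"]
  { role: role, state: status["state"], leader: LEADER_ROLES.include?(role) }
end

# Fetches and summarizes the status of one node (performs a network call).
def fetch_patroni_status(host, port = 8008)
  body = Net::HTTP.get(URI("http://#{host}:#{port}/patroni"))
  summarize_patroni_status(body)
end

# Offline example using a hand-written sample body
sample = '{"state": "running", "role": "replica"}'
puts summarize_patroni_status(sample)
```

Calling `fetch_patroni_status` against each node in turn gives a quick view of which one currently holds the leader role.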
Deploying Ruby Application Servers with HAProxy for Load Balancing
For the application tier, we’ll deploy multiple instances of our Ruby application (served by Puma or Unicorn, for example) and use HAProxy for load balancing and health checks. This provides both scalability and high availability for the application servers. OVH’s “Load Balancer” service is an option, but for greater control and potentially lower cost if you’re already managing instances, deploying HAProxy on a dedicated instance or one of the application servers is a common pattern. Keep in mind that a single HAProxy instance is itself a single point of failure, so a fully redundant setup needs a second HAProxy node or the managed service in front.
1. Set up Application Servers:
Deploy your Ruby application on at least two (preferably three or more for better HA) OVH Public Cloud instances (e.g., GRA1-SSD-2). Ensure your application servers are configured to listen on a specific port (e.g., 3000) and are accessible from the HAProxy instance.
2. Install and Configure HAProxy:
On a dedicated instance or one of your app servers (if you’re willing to co-locate), install HAProxy.
sudo apt update
sudo apt install -y haproxy
Edit the HAProxy configuration file /etc/haproxy/haproxy.cfg.
# /etc/haproxy/haproxy.cfg
global
    log /dev/log local0
    log /dev/log local1 notice
    chroot /var/lib/haproxy
    stats socket /run/haproxy/admin.sock mode 660 level admin expose-fd listeners
    stats timeout 30s
    user haproxy
    group haproxy
    daemon

defaults
    log global
    mode http
    option httplog
    option dontlognull
    timeout connect 5000
    timeout client 50000
    timeout server 50000
    errorfile 400 /etc/haproxy/errors/400.http
    errorfile 403 /etc/haproxy/errors/403.http
    errorfile 408 /etc/haproxy/errors/408.http
    errorfile 500 /etc/haproxy/errors/500.http
    errorfile 502 /etc/haproxy/errors/502.http
    errorfile 503 /etc/haproxy/errors/503.http
    errorfile 504 /etc/haproxy/errors/504.http

frontend http_frontend
    bind *:80
    mode http
    default_backend http_backend

backend http_backend
    mode http
    balance roundrobin
    option httpchk GET /health # Assuming your app has a /health endpoint
    http-check expect status 200
    # Replace with your actual application server IPs and ports
    server app1 10.0.0.1:3000 check
    server app2 10.0.0.2:3000 check
    server app3 10.0.0.3:3000 check

# Optional: HAProxy Stats Page
listen stats
    bind *:8404
    mode http
    stats enable
    stats uri /stats
    stats realm Haproxy\ Statistics
    stats auth admin:your_stats_password # Change this password!
Explanation:
- The `global` and `defaults` sections set up general HAProxy behavior.
- `frontend http_frontend` listens on port 80 for incoming HTTP requests.
- `backend http_backend` defines the pool of application servers.
- `balance roundrobin` distributes traffic evenly. Other options like `leastconn` can be beneficial.
- `option httpchk` and `http-check expect status 200` configure health checks. HAProxy will periodically send a GET request to `/health` on each app server and remove unhealthy servers from rotation. Ensure your Ruby app has a lightweight `/health` endpoint that returns a 200 OK.
- `server appX ... check` defines each application server. Replace `10.0.0.x` with the actual private IPs of your app servers.
- The `stats` section provides a web interface to monitor HAProxy’s status. Secure it with a strong password.
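The lightweight `/health` endpoint mentioned above could look like the following minimal sketch. It relies only on the Rack calling convention (an object that responds to `call(env)` and returns status, headers, and body), so no gem is required to define it. `HEALTH_APP` is a hypothetical name, and the JSON body is an arbitrary choice.

```ruby
require "json"

# A Rack-style callable: takes the request environment and returns
# [status, headers, body]. It answers 200 on /health and 404 elsewhere.
HEALTH_APP = lambda do |env|
  if env["PATH_INFO"] == "/health"
    body = JSON.generate({ status: "ok" })
    [200, { "content-type" => "application/json" }, [body]]
  else
    [404, { "content-type" => "text/plain" }, ["not found"]]
  end
end

# Exercising the callable directly, without a web server
status, _headers, body = HEALTH_APP.call("PATH_INFO" => "/health", "REQUEST_METHOD" => "GET")
puts "#{status} #{body.join}"
```

The endpoint deliberately avoids touching the database, so a slow query cannot take every application server out of rotation at once. In a Rails app, a route such as `get "/health", to: ->(env) { [200, {}, ["ok"]] }` is one way to get similar behavior.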
3. Enable and Start HAProxy:
sudo systemctl enable haproxy
sudo systemctl start haproxy
4. Firewall Configuration:
Ensure that port 80 is open on your HAProxy instance’s firewall and that your application servers allow incoming connections on port 3000 from the HAProxy instance’s IP address. OVH’s security groups or firewall rules need to be configured accordingly.
Cost Optimization Strategies and Monitoring
Achieving cost optimization on OVH requires continuous monitoring and iterative adjustments. The strategies outlined above – right-sizing instances, choosing appropriate services, and implementing HA – are foundational. Here are further steps:
1. Right-Sizing Instances:
Regularly review instance utilization. OVH’s control panel provides basic metrics. For deeper insights, deploy monitoring agents (e.g., Prometheus Node Exporter, Datadog agent) on your instances. If CPU utilization consistently hovers below 30% and memory is ample, consider downsizing. Conversely, if CPU is pegged at 90%+ or memory is constantly exhausted, scale up or out.
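The decision rule in the paragraph above can be written down explicitly. This is a sketch under stated assumptions: the 30% and 90% CPU thresholds come from the text, while the 60% cut-off for "ample" memory and the use of averages for "pegged" are illustrative choices. `sizing_recommendation` is a hypothetical helper that takes utilization samples in percent.

```ruby
# Returns a sizing hint from CPU and memory utilization samples (percent).
def sizing_recommendation(cpu_samples, mem_samples)
  raise ArgumentError, "samples required" if cpu_samples.empty? || mem_samples.empty?

  avg_cpu = cpu_samples.sum / cpu_samples.size.to_f
  avg_mem = mem_samples.sum / mem_samples.size.to_f

  if avg_cpu >= 90 || avg_mem >= 90
    :scale_up_or_out      # sustained saturation
  elsif cpu_samples.max < 30 && mem_samples.max < 60
    :consider_downsizing  # consistently idle with memory headroom
  else
    :keep
  end
end

puts sizing_recommendation([12, 18, 25], [40, 42, 45])
```

Feeding it samples from at least a full traffic cycle (a week is a common choice) avoids downsizing on the strength of one quiet afternoon.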
2. Database Connection Pooling:
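Rails pools database connections through ActiveRecord’s built-in connection pool, sized by the `pool` setting in config/database.yml. When many application processes are involved, an external pooler such as PgBouncer or Pgpool-II can sit in front of PostgreSQL. Over-allocating database connections is a common performance and cost sink. Each application process keeps its own pool, so set the pool size to at least the number of threads per process and check that the total across all processes stays below the database’s max_connections.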
# config/database.yml (Rails example)
production:
  adapter: postgresql
  encoding: unicode
  database: my_app_production
  pool: 5 # Adjust this value based on monitoring
  username: my_app_user
  password: <your_db_password>
  host: <your_primary_db_ip>
  port: 5432
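To see how the pool size interacts with the database’s connection limit, a back-of-the-envelope calculation helps. The sketch below assumes each application server runs a fixed number of worker processes, each with its own pool, and reserves a few connections for replication and administration. The helper names and the reserve of 10 are assumptions for illustration.

```ruby
# Worst-case number of connections the application tier can open.
def required_connections(app_servers:, workers_per_server:, pool_size:)
  app_servers * workers_per_server * pool_size
end

# True when the requirement fits under max_connections minus a reserve
# kept free for replication, monitoring, and manual access.
def fits_budget?(required, max_connections:, reserved: 10)
  required <= max_connections - reserved
end

needed = required_connections(app_servers: 3, workers_per_server: 2, pool_size: 5)
puts "app tier may open up to #{needed} connections"
puts "fits within max_connections=200: #{fits_budget?(needed, max_connections: 200)}"
```

If the requirement does not fit, lowering the pool size or adding an external pooler is usually cheaper than raising `max_connections`, since every PostgreSQL connection consumes memory on the database node.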
3. Caching Strategies:
Implement aggressive caching at various levels: HTTP caching (e.g., using Varnish or CDN), application-level caching (e.g., Redis, Memcached), and fragment caching within your views. This reduces database load and application server processing, allowing you to use smaller, cheaper instances.
4. Autoscaling (Consideration):
While OVH’s Public Cloud doesn’t offer native autoscaling groups like AWS or GCP, you can implement custom autoscaling. This typically involves a separate orchestration service (e.g., Kubernetes, or a custom script) that monitors metrics and automatically provisions/decommissions instances based on predefined thresholds. For cost optimization, this is crucial for handling variable traffic loads without paying for peak capacity 24/7. However, it adds significant operational complexity.
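The core of such a custom script is a small decision function. The sketch below shows only that part, assuming average CPU utilization as the single metric, one-instance steps, and a cooldown to prevent flapping. `ScalingPolicy`, `desired_instance_count`, and all threshold values are hypothetical. Provisioning through the OVH or OpenStack APIs is left out.

```ruby
# Hypothetical policy: bounds, CPU thresholds (percent), and a cooldown in seconds.
ScalingPolicy = Struct.new(:min, :max, :scale_out_cpu, :scale_in_cpu, :cooldown_s,
                           keyword_init: true)

# Returns how many instances should be running given the current state.
def desired_instance_count(current:, avg_cpu:, seconds_since_last_change:, policy:)
  # Do nothing while the previous change is still settling.
  return current if seconds_since_last_change < policy.cooldown_s

  target =
    if avg_cpu >= policy.scale_out_cpu
      current + 1
    elsif avg_cpu <= policy.scale_in_cpu
      current - 1
    else
      current
    end
  target.clamp(policy.min, policy.max)
end

policy = ScalingPolicy.new(min: 2, max: 6, scale_out_cpu: 75, scale_in_cpu: 25, cooldown_s: 300)
puts desired_instance_count(current: 3, avg_cpu: 80, seconds_since_last_change: 600, policy: policy)
```

Keeping a minimum of two instances preserves availability during quiet periods, and the gap between the two thresholds stops the script from oscillating around a single value.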
5. Monitoring and Alerting:
Set up comprehensive monitoring for CPU, memory, disk I/O, network traffic, and application-specific metrics (e.g., request latency, error rates). Configure alerts for critical thresholds. OVH’s built-in monitoring is a starting point, but integrating with tools like Prometheus/Grafana or commercial solutions provides more granular control and historical data for cost analysis.
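As an illustration of alerting on application-specific metrics, the sketch below computes a 95th-percentile latency and an error rate from raw samples and reports which thresholds were crossed. In practice a tool such as Prometheus would do this. The function names, the nearest-rank percentile method, and the limits (500 ms, 1% errors) are assumptions for the example.

```ruby
# Nearest-rank percentile over numeric samples (pct is an integer from 1 to 100).
def percentile(samples, pct)
  raise ArgumentError, "samples must not be empty" if samples.empty?

  sorted = samples.sort
  rank = (pct * sorted.size / 100.0).ceil
  sorted[[rank, 1].max - 1]
end

# Returns the list of alerts triggered by a window of request data.
def alerts(latencies_ms:, statuses:, p95_limit_ms: 500, error_rate_limit: 0.01)
  fired = []
  fired << :high_latency if percentile(latencies_ms, 95) > p95_limit_ms
  error_rate = statuses.count { |s| s >= 500 } / statuses.size.to_f
  fired << :high_error_rate if error_rate > error_rate_limit
  fired
end

puts alerts(latencies_ms: [120, 180, 950], statuses: [200, 200, 503]).inspect
```

Percentiles are used rather than averages because a handful of very slow requests can hide behind a healthy-looking mean.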
By combining strategic instance selection, robust HA patterns for databases and application servers, and diligent cost optimization practices, you can build a performant and resilient Ruby stack on OVH that aligns with your budget.