The Ultimate DevOps Playbook: Tuning Nginx, Gunicorn/FPM, and MySQL on Google Cloud for Ruby

Nginx as a High-Performance Frontend Proxy

When deploying Ruby applications on Google Cloud, Nginx serves as an indispensable frontend proxy. Its primary roles include SSL termination, static file serving, load balancing, and request buffering. Optimizing Nginx is crucial for minimizing latency and maximizing throughput.

In a typical setup, Nginx proxies requests to a backend application server. Gunicorn (Python) and PHP-FPM fill that role in other stacks; for Ruby the equivalents are Puma and Unicorn. This playbook focuses on Nginx forwarding to Puma or Unicorn via a Unix socket or TCP port.

Nginx Configuration Tuning

The core of Nginx performance tuning lies within its nginx.conf file, typically located at /etc/nginx/nginx.conf or within /etc/nginx/conf.d/. Key directives to scrutinize include:

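worker_processes: Set this to auto, which starts one worker per available CPU core, or to an explicit core count.
worker_connections: The maximum number of simultaneous connections each worker process can hold open. The count includes connections to the upstream application server, not only client connections. A common starting point is 1024, and it can be raised considerably as long as the per-process open-file limit (ulimit -n, or the worker_rlimit_nofile directive) is high enough.
keepalive_timeout: How long an idle keep-alive client connection stays open. A value between 60 and 75 seconds is usually a good balance; the default is 75 seconds.
client_max_body_size: The largest request body Nginx accepts, which matters for file uploads. The default is 1m, and larger requests are rejected with a 413 error, so set it to fit your uploads (e.g., 50M for 50 megabytes).
sendfile: Set to on so that files are copied to the socket inside the kernel, without passing through a user-space buffer.
tcp_nopush and tcp_nodelay: tcp_nopush (effective only when sendfile is on) sends the response headers and the start of a file in full packets. tcp_nodelay disables Nagle’s algorithm on keep-alive connections so that small writes are not held back. Enabling both is common.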
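
As a rough illustration of how worker_processes and worker_connections interact, the sketch below estimates an upper bound on concurrent clients for a reverse proxy. The helper name is hypothetical, and the halving assumes every client request is proxied to the application server, so each one holds a client-side and an upstream-side connection.

```ruby
# Hypothetical helper for illustration; not part of Nginx or any gem.
# Estimates how many clients Nginx can serve at once when it proxies
# every request to an upstream application server.
def max_proxied_clients(worker_processes:, worker_connections:)
  total = worker_processes * worker_connections
  # Assumption: each proxied request holds two connections
  # (one to the client, one to the upstream).
  total / 2
end

puts max_proxied_clients(worker_processes: 4, worker_connections: 4096)
# prints 8192
```

Treat the result as a ceiling rather than a target: open-file limits and the application server’s own concurrency usually bind first.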

Consider these settings in your nginx.conf:

Example `nginx.conf` Snippet

worker_processes auto;
events {
    use epoll;
    worker_connections 4096;
}

http {
    include       /etc/nginx/mime.types;
    default_type  application/octet-stream;

    sendfile        on;
    tcp_nopush      on;
    tcp_nodelay     on;

    keepalive_timeout  65;
    keepalive_requests 1000;

    client_max_body_size 50m;

    # Gzip compression for text-based assets
    gzip on;
    gzip_vary on;
    gzip_proxied any;
    gzip_comp_level 6;
    gzip_types text/plain text/css application/json application/javascript text/xml application/xml application/xml+rss text/javascript;

    # Proxy configuration for your Ruby app
    server {
        listen 80;
        server_name your_domain.com;

        location / {
            proxy_pass http://unix:/path/to/your/app.sock; # Or http://127.0.0.1:3000
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
            proxy_set_header X-Forwarded-Proto $scheme;
            proxy_read_timeout 300s; # Increase timeout for long-running requests
            proxy_connect_timeout 75s;
        }

        # Serve static assets directly from Nginx
        location ~ ^/(assets|images|javascripts|stylesheets)/ {
            root /path/to/your/rails/public;
            expires 1y;
            add_header Cache-Control public;
        }
    }
}

Optimizing Static File Serving

Nginx excels at serving static files. Ensure your Ruby framework (e.g., Rails) precompiles assets and that Nginx is configured to serve them directly from the public/ directory. This offloads significant work from your application server.

Key directives for static file serving include:

root: Specifies the document root for the location.
expires: Sets the Expires and Cache-Control headers for client-side caching.
add_header Cache-Control public;: Ensures that intermediate caches (like CDNs) can also cache the assets.

Gunicorn/Puma/Unicorn: The Application Server Layer

For Ruby applications, Puma and Unicorn are the de facto standard application servers. They manage the Ruby processes that handle incoming HTTP requests from Nginx. Tuning these servers is critical for request processing speed and concurrency.

Puma Configuration

Puma is a multi-threaded server. Its configuration is typically done via a config/puma.rb file in your application’s root directory.

Key tuning parameters for Puma:

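workers: The number of worker processes. A common starting point is one worker per CPU core available to the instance. The master process only supervises the workers and uses little CPU, so it rarely needs a core of its own; memory is often the tighter limit, because every worker holds its own copy of the application.
threads: The minimum and maximum number of threads per worker. A common starting point is 5 to 10 threads per worker. Under MRI only one thread per process runs Ruby code at a time, so extra threads help mainly while other threads wait on I/O. The optimal number therefore depends heavily on whether your application is I/O-bound or CPU-bound.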
environment: Set to production.
pidfile: Path to the PID file.
state_path: Path to the state file.
bind: The address and port Puma listens on. For Nginx proxying via TCP, this would be tcp://127.0.0.1:3000. For Unix sockets, it’s unix:///path/to/your/app.sock.
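
The workers and threads settings also determine how many database connections the application can open, since each Puma thread may check out its own connection. The following is a minimal sketch of that arithmetic; the helper name is hypothetical, and the limit of 200 is simply an assumed database max_connections value.

```ruby
# Hypothetical helper for illustration; not part of Puma.
# Returns the number of database connections a Puma deployment may open
# if every thread holds one connection at the same time.
def puma_db_connections(workers:, threads_per_worker:, instances: 1)
  workers * threads_per_worker * instances
end

needed = puma_db_connections(workers: 4, threads_per_worker: 8)
db_max_connections = 200 # assumed database-side limit

puts "needed=#{needed} headroom=#{db_max_connections - needed}"
# prints needed=32 headroom=168
```

If the headroom turns negative, reduce threads or workers, raise the database limit, or put a connection pooler in between.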

Example `config/puma.rb`

# config/puma.rb
workers Integer(ENV.fetch("WEB_CONCURRENCY") { 2 }) # Number of worker processes
threads_count = Integer(ENV.fetch("RAILS_MAX_THREADS") { 5 }) # Number of threads per worker
threads threads_count, threads_count

environment ENV.fetch("RAILS_ENV") { "production" }

pidfile ENV.fetch("PIDFILE") { "tmp/pids/puma.pid" }
state_path ENV.fetch("STATEPATH") { "tmp/pids/puma.state" }

activate_control_app

on_worker_boot do
  # Worker specific setup, e.g. database connection pooling
  ActiveRecord::Base.establish_connection if defined?(ActiveRecord::Base)
end

# Allow Puma to be restarted by `rails restart` command.
plugin :tmp_restart

# Bind to a TCP port or Unix socket
# For TCP:
bind "tcp://127.0.0.1:3000"
# For Unix Socket (recommended with Nginx):
# bind "unix:///path/to/your/app.sock"

# Configure logging. Paths are relative to the application root, like the
# pidfile above; the final `true` appends instead of truncating.
stdout_redirect "log/puma.stdout.log", "log/puma.stderr.log", true

To start Puma with these settings:

WEB_CONCURRENCY=4 RAILS_MAX_THREADS=8 bundle exec puma -C config/puma.rb

Unicorn Configuration

Unicorn is a pre-forking worker model server. It spawns multiple worker processes, each handling requests independently. Its configuration is typically done via a config/unicorn.rb file.

Key tuning parameters for Unicorn:

worker_processes: The number of worker processes. This should generally be set to the number of CPU cores available.
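preload_app: Set to true. This loads the application code once in the master before forking, so workers start faster and can share memory pages through copy-on-write.
timeout: The number of seconds a worker may spend on a single request before the master kills and replaces it. Keep it consistent with Nginx’s proxy_read_timeout so the two layers do not give up at very different times.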
listen: The address and port or Unix socket Unicorn listens on.
pid: Path to the PID file.
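
Because every Unicorn worker is a full process, the worker count is bounded by memory as well as by CPU cores. The sketch below takes the smaller of the two limits. The helper, the 512 MB reserve, and the per-worker memory figure are all assumptions for illustration; measure the real resident size of your workers before relying on such a number.

```ruby
# Hypothetical sizing helper; not part of Unicorn.
# Picks the smaller of the CPU-based and the memory-based worker limits.
def unicorn_worker_count(cpu_cores:, ram_mb:, worker_rss_mb:, reserved_mb: 512)
  # Workers that fit into RAM after reserving memory for the OS and Nginx.
  by_memory = (ram_mb - reserved_mb) / worker_rss_mb
  # Never go below one worker or above the number of cores.
  [cpu_cores, by_memory].min.clamp(1, cpu_cores)
end

puts unicorn_worker_count(cpu_cores: 4, ram_mb: 4096, worker_rss_mb: 400)
# prints 4
puts unicorn_worker_count(cpu_cores: 4, ram_mb: 2048, worker_rss_mb: 400)
# prints 3
```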

Example `config/unicorn.rb`

# config/unicorn.rb
worker_processes Integer(ENV["WEB_CONCURRENCY"] || 3) # Number of worker processes
preload_app true
timeout 60 # Timeout in seconds

# Listen on a Unix socket (recommended with Nginx)
listen "/path/to/your/app.sock", :backlog => 64
# Or listen on a TCP port
# listen "127.0.0.1:3000", :backlog => 64

pid "/path/to/your/app.pid"

# Logging
stderr_path "/path/to/your/log/unicorn.stderr.log"
stdout_path "/path/to/your/log/unicorn.stdout.log"

# Callbacks
before_fork do |server, worker|
  # Before forking, close existing DB connections
  defined?(ActiveRecord::Base) && ActiveRecord::Base.connection.disconnect!
end

after_fork do |server, worker|
  # After forking, establish new DB connections
  defined?(ActiveRecord::Base) && ActiveRecord::Base.establish_connection
end

To start Unicorn:

WEB_CONCURRENCY=4 bundle exec unicorn -c config/unicorn.rb

MySQL/PostgreSQL Tuning on Google Cloud

Database performance is often the bottleneck in web applications. Optimizing your database server, whether it’s MySQL or PostgreSQL, is paramount. On Google Cloud, this involves both instance-level tuning and database-specific configuration.

Instance Sizing and Storage

Choose an appropriate machine type for your database instance. For I/O-intensive workloads, consider instances with local SSDs for significantly lower latency and higher IOPS compared to network-attached storage. However, local SSDs are ephemeral, so ensure you have robust backup and replication strategies.

For managed services like Cloud SQL, select an instance size that provides sufficient CPU, memory, and IOPS for your workload. Monitor Cloud SQL metrics closely to identify under-provisioning.

MySQL Configuration Tuning

The primary configuration file for MySQL is my.cnf (or my.ini on Windows), often found at /etc/mysql/my.cnf or /etc/my.cnf.

Key `my.cnf` Directives

innodb_buffer_pool_size: This is arguably the most critical setting for InnoDB. It determines how much memory is allocated for caching data and indexes. A common recommendation is 50-75% of the instance’s available RAM.
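innodb_log_file_size: Larger redo log files can improve write performance by reducing how often checkpoints are forced, but they increase recovery time after a crash. A common starting point is 256MB or 512MB. From MySQL 8.0.30 this variable is deprecated in favor of innodb_redo_log_capacity.
innodb_flush_log_at_trx_commit: Setting this to 1 (default) writes and flushes the log to disk at every commit, which provides full ACID compliance but can impact write performance. Setting it to 2 writes the log to the OS cache on commit and flushes it to disk about once per second; a crash of mysqld alone loses nothing, but an OS crash or power loss can lose up to about a second of transactions. This is sufficient for many applications and significantly faster. Setting it to 0 is fastest, but even a mysqld crash can lose up to about a second of transactions.
max_connections: The maximum number of simultaneous client connections. Each connection consumes memory for its session buffers, so set this based on your application’s needs and available memory.
query_cache_size and query_cache_type: The query cache often causes more contention than benefit. It was deprecated in MySQL 5.7.20 and removed in MySQL 8.0. On 5.7 and earlier, it’s generally recommended to disable it (query_cache_size = 0, query_cache_type = 0) and rely on application-level caching or other mechanisms. On 8.0, leave both variables out of my.cnf, because the server refuses to start when it finds an unknown variable.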
tmp_table_size and max_heap_table_size: Control the maximum size of in-memory temporary tables. Increasing these can help complex queries that require temporary tables.
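
The 50-75% guideline for innodb_buffer_pool_size can be turned into concrete numbers with a few lines of Ruby. This is only a sketch: the helper name is hypothetical, and it assumes a host dedicated to MySQL.

```ruby
# Hypothetical helper for illustration; not part of MySQL or any gem.
# Returns the low and high end of the 50-75% guideline in megabytes.
def innodb_buffer_pool_range_mb(ram_mb)
  [(ram_mb * 0.50).floor, (ram_mb * 0.75).floor]
end

low, high = innodb_buffer_pool_range_mb(8192)
puts "innodb_buffer_pool_size between #{low}M and #{high}M"
# prints innodb_buffer_pool_size between 4096M and 6144M
```

Starting at the low end and raising the value while watching memory usage is the more cautious route.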

Example `my.cnf` Snippet

[mysqld]
# General Settings
user                    = mysql
pid-file                = /var/run/mysqld/mysqld.pid
socket                  = /var/run/mysqld/mysqld.sock
port                    = 3306
basedir                 = /usr
datadir                 = /var/lib/mysql
tmpdir                  = /tmp
lc_messages_dir         = /usr/share/mysql
lc_messages             = en_US

# InnoDB Settings
innodb_buffer_pool_size = 4G  # Adjust based on available RAM (e.g., 4GB for an 8GB RAM instance)
innodb_log_file_size    = 512M
innodb_flush_log_at_trx_commit = 2 # Good balance of performance and durability
innodb_flush_method     = O_DIRECT # Recommended for performance on Linux

# Connection Settings
max_connections         = 200
# thread_cache_size       = 16 # Adjust based on connection churn

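# Query Cache (MySQL 5.7 and earlier only; removed in 8.0, where these
# variables prevent the server from starting). Uncomment on 5.7 to disable it.
# query_cache_size        = 0
# query_cache_type        = 0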

# Temporary Tables
tmp_table_size          = 64M
max_heap_table_size     = 64M

# Logging
log_error = /var/log/mysql/error.log
slow_query_log = 1
slow_query_log_file = /var/log/mysql/mysql-slow.log
long_query_time = 2 # Log queries taking longer than 2 seconds

After modifying my.cnf, restart the MySQL service:

sudo systemctl restart mysql

PostgreSQL Configuration Tuning

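PostgreSQL’s primary configuration file is postgresql.conf. By default it lives in the data directory; Debian and Ubuntu packages place it under /etc/postgresql instead (e.g., /etc/postgresql/14/main/postgresql.conf). Running SHOW config_file; in psql prints the path in use.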

Key `postgresql.conf` Directives

shared_buffers: Similar to innodb_buffer_pool_size, this is the most important parameter. A common recommendation is 25% of the instance’s total RAM.
work_mem: The amount of memory to use for internal sort operations and hash tables before writing to temporary disk files. This is allocated per sort operation, so be cautious with high values.
maintenance_work_mem: The amount of memory to use for maintenance operations like VACUUM, CREATE INDEX, and ALTER TABLE. A larger value can significantly speed up these operations.
effective_cache_size: An estimate of how much memory is available for disk caching by the operating system and PostgreSQL itself. Setting this to 50-75% of total RAM helps the query planner make better decisions.
wal_buffers: Buffers for WAL (Write-Ahead Logging) data. A value of -1 (auto-tuning) or 16MB is often a good starting point.
max_connections: Maximum number of concurrent connections.
logging_collector, log_directory, log_filename, log_statement: Configure logging for performance analysis, especially for slow queries.
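
The percentages above, and the caution around work_mem, are easier to judge with numbers. The sketch below derives starting values from total RAM and estimates what work_mem costs if every connection runs one sort at the same time. The helper name is hypothetical, and the one-sort-per-connection figure is a simplification, since a single query can run several sorts or hashes.

```ruby
# Hypothetical helper for illustration; not part of PostgreSQL or any gem.
def pg_memory_plan(ram_mb:, max_connections:, work_mem_mb:)
  {
    # 25% of RAM, per the guideline above.
    shared_buffers_mb: ram_mb / 4,
    # 50-75% of RAM as a planner hint; no memory is actually allocated.
    effective_cache_size_mb: (ram_mb / 2)..(ram_mb * 3 / 4),
    # Simplified estimate: one sort or hash per connection at the same time.
    one_sort_per_connection_mb: max_connections * work_mem_mb
  }
end

plan = pg_memory_plan(ram_mb: 8192, max_connections: 150, work_mem_mb: 32)
puts plan[:shared_buffers_mb]          # prints 2048
puts plan[:effective_cache_size_mb]    # prints 4096..6144
puts plan[:one_sort_per_connection_mb] # prints 4800
```

With 8 GB of RAM, 4800 MB of potential sort memory on top of 2 GB of shared_buffers leaves little room, which is a hint to lower work_mem or max_connections.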

Example `postgresql.conf` Snippet

# postgresql.conf
# General Settings
listen_addresses = '*' # Or specific IPs
port = 5432

# Resource Usage
shared_buffers = 2GB       # Adjust based on RAM (e.g., 2GB for 8GB RAM instance)
work_mem = 32MB            # Per sort/hash operation
maintenance_work_mem = 256MB # For VACUUM, CREATE INDEX, etc.
effective_cache_size = 4GB # Estimate of OS + PG cache

# WAL Settings
wal_buffers = 16MB
wal_level = replica
synchronous_commit = on    # Or 'local' for higher performance with slight durability trade-off
fsync = on
full_page_writes = on

# Connection Settings
max_connections = 150

# Logging
logging_collector = on
log_directory = 'pg_log'
log_filename = 'postgresql-%Y-%m-%d_%H%M%S.log'
log_statement = 'none'     # 'ddl', 'mod', or 'all' log by statement type; slow queries are covered by log_min_duration_statement
log_min_duration_statement = 1000 # Log statements taking longer than 1000ms (1 second)

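After modifying postgresql.conf, reload the PostgreSQL service. A reload is enough for settings such as work_mem and log_min_duration_statement; shared_buffers, max_connections, wal_buffers, wal_level, listen_addresses, port, and logging_collector only take effect after a full restart (sudo systemctl restart postgresql).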

sudo systemctl reload postgresql

Monitoring and Iterative Tuning

Performance tuning is not a one-time event. Continuous monitoring and iterative adjustments are key to maintaining optimal performance. Utilize Google Cloud’s monitoring tools (Cloud Monitoring, Cloud Logging) and application-specific performance monitoring (APM) solutions.

Key Metrics to Monitor

Nginx: Request rate, response times (latency), error rates (4xx, 5xx), connection counts, worker connections.
Application Server (Puma/Unicorn): Request queue length, worker utilization, thread utilization (Puma), memory usage, CPU usage, garbage collection times.
Database (MySQL/PostgreSQL): Query latency, slow queries, connection counts, buffer pool hit ratio (MySQL), cache hit ratio (PostgreSQL), I/O wait times, CPU and memory utilization.
System Metrics: CPU utilization, memory usage, disk I/O, network traffic.
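
The two database hit ratios mentioned above are simple quotients of server counters. The sketch below shows the arithmetic; the helper names are hypothetical. The MySQL inputs correspond to the Innodb_buffer_pool_read_requests and Innodb_buffer_pool_reads status variables, and the PostgreSQL inputs to the blks_hit and blks_read columns of pg_stat_database. Fetching those counters from a live server is left out.

```ruby
# Hypothetical helpers for illustration; they only do the arithmetic.

# MySQL: share of logical reads served from the InnoDB buffer pool.
def innodb_hit_ratio(read_requests:, disk_reads:)
  return nil if read_requests.zero?
  1.0 - disk_reads.to_f / read_requests
end

# PostgreSQL: share of block requests served from shared_buffers.
def pg_cache_hit_ratio(blks_hit:, blks_read:)
  total = blks_hit + blks_read
  return nil if total.zero?
  blks_hit.to_f / total
end

mysql_ratio = innodb_hit_ratio(read_requests: 1_000_000, disk_reads: 5_000)
pg_ratio = pg_cache_hit_ratio(blks_hit: 990, blks_read: 10)

puts format("%.1f%%", mysql_ratio * 100) # prints 99.5%
puts format("%.1f%%", pg_ratio * 100)    # prints 99.0%
```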

Diagnostic Tools

Nginx: nginx -T (show config), access.log, error.log, ngx_http_stub_status_module.
Application Server: Logs, profiling tools (e.g., ruby-prof, stackprof), APM agents.
Database: EXPLAIN ANALYZE (PostgreSQL, and MySQL from 8.0.18), EXPLAIN (MySQL), slow query logs, performance schema (MySQL), pg_stat_statements (PostgreSQL).
System: top, htop, vmstat, iostat, netstat.

Start with small, incremental changes. Measure the impact of each change. Document your tuning process and the rationale behind each adjustment. This systematic approach ensures that your Ruby application on Google Cloud remains performant and scalable.
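
One way to measure the impact of a change is to compare latency percentiles before and after it. The following is a minimal sketch using the nearest-rank method; the helper is hypothetical, and the sample latencies are made-up values for illustration, not measurements.

```ruby
# Hypothetical helper: nearest-rank percentile of a list of samples.
def percentile(samples, pct)
  raise ArgumentError, "no samples" if samples.empty?
  sorted = samples.sort
  # Nearest-rank: the smallest value with at least pct% of samples at or below it.
  rank = (pct / 100.0 * sorted.length).ceil
  sorted[[rank, 1].max - 1]
end

# Illustrative request latencies in milliseconds (not real measurements).
before = [120, 135, 150, 180, 210, 240, 300, 320, 410, 900]
after  = [110, 115, 130, 140, 160, 170, 190, 220, 260, 480]

puts "p50: #{percentile(before, 50)} -> #{percentile(after, 50)}"
# prints p50: 210 -> 160
puts "p95: #{percentile(before, 95)} -> #{percentile(after, 95)}"
# prints p95: 900 -> 480
```

Percentiles are more informative than averages here, because a tuning change often affects the slowest requests most.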
