The Ultimate DevOps Playbook: Tuning Nginx, Gunicorn/FPM, and DynamoDB on Google Cloud for Shopify

Nginx as a High-Performance Frontend for Gunicorn/PHP-FPM

When deploying applications on Google Cloud, particularly those serving dynamic content via Python (Gunicorn) or PHP (PHP-FPM), Nginx serves as an indispensable frontend. Its strengths lie in efficient static file serving, SSL termination, request buffering, and load balancing. Properly tuning Nginx is critical for maximizing throughput and minimizing latency.

Nginx Configuration for Gunicorn (Python)

For Python applications managed by Gunicorn, Nginx acts as a reverse proxy. The key is to configure Nginx to efficiently pass requests to Gunicorn workers and handle responses. We’ll focus on connection management, buffering, and keep-alive settings.

Core Nginx Configuration (`nginx.conf`)

Start with essential global settings. The `worker_processes` directive should ideally match the number of CPU cores available to your instance; `auto` does this for you. `worker_connections` sets the maximum number of simultaneous connections each worker can hold open, and that count includes connections to upstream servers as well as connections to clients. A common starting point is 1024 or higher, depending on expected load.

worker_processes auto; # Or set to the number of CPU cores
pid /run/nginx.pid;
include /etc/nginx/modules-enabled/*.conf;

events {
    worker_connections 4096; # Adjust based on load and system limits
    multi_accept on;
}

http {
    sendfile on;
    tcp_nopush on;
    tcp_nodelay on;
    keepalive_timeout 65;
    types_hash_max_size 2048;

    include /etc/nginx/mime.types;
    default_type application/octet-stream;

    # Logging settings
    access_log /var/log/nginx/access.log;
    error_log /var/log/nginx/error.log warn;

    # Gzip compression for dynamic content
    gzip on;
    gzip_vary on;
    gzip_proxied any;
    gzip_comp_level 6;
    gzip_types text/plain text/css application/json application/javascript text/xml application/xml application/xml+rss text/javascript;

    # Include virtual host configurations
    include /etc/nginx/conf.d/*.conf;
    include /etc/nginx/sites-enabled/*;
}
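
The relationship between `worker_processes`, `worker_connections`, and client capacity can be sketched in a few lines of Python. This is a rough estimate, not an Nginx formula: it assumes every request is proxied (so each client occupies one client-side and one upstream connection) and ignores file descriptor limits. The function name is hypothetical.

```python
def max_proxied_clients(worker_processes: int, worker_connections: int) -> int:
    """Rough upper bound on concurrent clients when every request is proxied."""
    # Each proxied client holds two connections inside a worker:
    # one from the client and one to the upstream (Gunicorn or PHP-FPM).
    total_connections = worker_processes * worker_connections
    return total_connections // 2


if __name__ == "__main__":
    # Example: a 4-core instance with worker_connections 4096.
    print(max_proxied_clients(4, 4096))
```

Requests for static files use only the client-side connection, so real capacity for mixed traffic sits somewhere above this bound.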

Gunicorn Server Block (`/etc/nginx/sites-available/your_app`)

This block defines how Nginx interacts with your Gunicorn application. Key directives include `proxy_pass` to direct traffic, `proxy_set_header` to forward essential client information, and buffer settings to manage large requests/responses.

server {
    listen 80;
    server_name your_domain.com www.your_domain.com;

    # Serve static files directly
    location /static/ {
        alias /path/to/your/app/static/;
        expires 30d;
        access_log off;
        add_header Cache-Control "public"; # expires already emits max-age=2592000
    }

    location /media/ {
        alias /path/to/your/app/media/;
        expires 30d;
        access_log off;
        add_header Cache-Control "public"; # expires already emits max-age=2592000
    }

    # Proxy requests to Gunicorn
    location / {
        proxy_pass http://unix:/run/gunicorn.sock; # Or http://127.0.0.1:8000;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;

        # Timeouts
        proxy_connect_timeout 75s;
        proxy_send_timeout 75s;
        proxy_read_timeout 75s;

        # Buffering settings
        proxy_buffer_size 16k;
        proxy_buffers 4 32k;
        proxy_busy_buffers_size 64k;
        proxy_temp_file_write_size 64k;

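        # WebSocket support (if needed); ideally scope these three lines to the
        # WebSocket location rather than applying them to every proxied request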
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
    }

    # Optional: Error pages
    error_page 500 502 503 504 /500.html;
    location = /500.html {
        root /usr/share/nginx/html;
    }
}

Gunicorn Configuration (`gunicorn_config.py`)

Gunicorn’s worker type and count significantly impact performance. For I/O-bound applications, asynchronous `gevent` or `eventlet` workers are often preferred. For synchronous workers, a common starting point is `2 * num_cores + 1`. Asynchronous workers each multiplex many connections (capped by Gunicorn’s `worker_connections` setting), so they usually need no more processes than that. Either way, the worker count is bounded by available memory.

import multiprocessing

bind = "unix:/run/gunicorn.sock" # Or "127.0.0.1:8000" if not using a socket
workers = multiprocessing.cpu_count() * 2 + 1
worker_class = "gevent" # or "sync", "eventlet", "gthread"
threads = 2 # Only applies to the "gthread" worker class; ignored by gevent
timeout = 120 # Adjust based on expected request duration; keep Nginx's proxy_read_timeout consistent with it
keepalive = 2 # Seconds to wait for the next request on a keep-alive connection

# Logging
loglevel = "info"
accesslog = "-" # Log to stdout, which can be captured by systemd/journald
errorlog = "-"

# Other useful settings
# max_requests = 1000 # Restart worker after this many requests
# preload_app = True # Load app before forking workers

Nginx Configuration for PHP-FPM

When serving PHP applications, Nginx forwards requests to PHP-FPM over the FastCGI protocol rather than HTTP. The configuration focuses on passing PHP requests to the FPM pool and efficiently handling static assets.

PHP-FPM Server Block (`/etc/nginx/sites-available/your_php_app`)

This configuration is similar to the Gunicorn setup but directs requests to the PHP-FPM process manager.

server {
    listen 80;
    server_name your_php_domain.com www.your_php_domain.com;
    root /var/www/your_php_app/public; # Adjust to your document root
    index index.php index.html index.htm;

    # Serve static files directly
    location ~* \.(jpg|jpeg|gif|png|css|js|ico|woff|woff2|ttf|svg|eot)$ {
        expires 30d;
        access_log off;
        add_header Cache-Control "public"; # expires already emits max-age=2592000
    }

    # Pass PHP scripts to PHP-FPM
    location ~ \.php$ {
        try_files $uri =404;
        fastcgi_split_path_info ^(.+\.php)(/.+)$;
        fastcgi_pass unix:/var/run/php/php7.4-fpm.sock; # Adjust PHP version and socket path
        fastcgi_index index.php;
        fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
        include fastcgi_params;

        # FastCGI Buffering
        fastcgi_buffer_size 16k;
        fastcgi_buffers 4 32k;
        fastcgi_busy_buffers_size 64k;
        fastcgi_temp_file_write_size 64k;
    }

    # Deny access to .htaccess files, if Apache's document root
    # concurs with nginx's one
    location ~ /\.ht {
        deny all;
    }

    # Error pages
    error_page 404 /404.html;
    location = /404.html {
        internal;
    }
    error_page 500 502 503 504 /500.html;
    location = /500.html {
        internal;
    }
}

PHP-FPM Pool Configuration (`/etc/php/7.4/fpm/pool.d/www.conf`)

Tuning PHP-FPM pools is crucial. The `pm` (process manager) setting is key. `dynamic` is a good default, balancing resource usage. `pm.max_children` is the most critical parameter; set it based on available memory and the memory footprint of your PHP processes. `pm.start_servers`, `pm.min_spare_servers`, and `pm.max_spare_servers` help manage worker processes dynamically.

[www]
user = www-data
group = www-data
listen = /var/run/php/php7.4-fpm.sock ; Or a TCP socket like 127.0.0.1:9000
listen.owner = www-data
listen.group = www-data
listen.mode = 0660

pm = dynamic
pm.max_children = 50 ; Adjust based on RAM and typical PHP process size
pm.start_servers = 5
pm.min_spare_servers = 2
pm.max_spare_servers = 10
pm.process_idle_timeout = 10s ; Only takes effect when pm = ondemand
pm.max_requests = 500 ; Restart worker after this many requests to prevent memory leaks

request_terminate_timeout = 120s ; Max execution time for a script
; rlimit_files = 1024 ; Uncomment and adjust if you hit file descriptor limits

; Other settings
; catch_workers_output = yes ; Useful for debugging
; env[VARIABLE] = value
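
The memory-based sizing of `pm.max_children` described above can be sketched as a small calculation. This is an illustration only: `fpm_max_children` is a hypothetical helper, and the reserved and per-process memory figures are assumptions you would replace with measurements from your own instance.

```python
def fpm_max_children(total_ram_mb: int, reserved_mb: int, avg_process_mb: int) -> int:
    """Estimate pm.max_children from the memory left over for PHP-FPM."""
    # reserved_mb covers the OS, Nginx, and anything else on the box.
    usable_mb = total_ram_mb - reserved_mb
    # Integer division: a partially fitting process does not count.
    return max(1, usable_mb // avg_process_mb)


if __name__ == "__main__":
    # Assumed figures: 4 GB instance, 1 GB reserved, about 60 MB per PHP process.
    print(fpm_max_children(4096, 1024, 60))
```

Measuring the average resident size of the running `php-fpm` processes under realistic load gives a more trustworthy input than a guess.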

DynamoDB Performance Tuning for Shopify Applications

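DynamoDB is a popular choice for scalable NoSQL workloads, often used by Shopify applications for product catalogs, order data, or user sessions. Because DynamoDB is an AWS service, an application hosted on Google Cloud reaches it over the network with AWS credentials, so cross-cloud latency is part of every request. Optimizing DynamoDB involves understanding throughput provisioning, indexing strategies, and query patterns.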

Provisioned Throughput vs. On-Demand

For predictable workloads, Provisioned Throughput is generally more cost-effective. You define Read Capacity Units (RCUs) and Write Capacity Units (WCUs). For spiky or unpredictable traffic, On-Demand capacity is simpler to manage, as DynamoDB automatically scales throughput. However, it can be more expensive for consistently high traffic.

Understanding RCUs and WCUs

Read Capacity Units (RCUs):

A strongly consistent read consumes 1 RCU per 4KB of data (rounded up to the next 4KB).
An eventually consistent read consumes 0.5 RCU per 4KB of data.
Eventually consistent reads are the default and recommended for performance and cost savings when strong consistency is not strictly required (e.g., displaying product lists).

Write Capacity Units (WCUs):

Every standard write operation (PutItem, UpdateItem, DeleteItem, or each item in a BatchWriteItem) consumes 1 WCU per 1KB of item size, rounded up to the next 1KB.

Key Takeaway: Optimize your read consistency settings and minimize data transfer by projecting only necessary attributes.
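
The capacity arithmetic can be made concrete with a short sketch. It assumes the rates for standard (non-transactional) single-item operations: 1 RCU per 4KB for a strongly consistent read, 0.5 RCU per 4KB for an eventually consistent read, and 1 WCU per 1KB written. The function names are hypothetical.

```python
import math


def read_capacity_units(item_bytes: int, strongly_consistent: bool = False) -> float:
    """RCUs consumed by reading one item of the given size."""
    blocks = max(1, math.ceil(item_bytes / 4096))  # reads are billed in 4KB blocks
    return blocks * (1.0 if strongly_consistent else 0.5)


def write_capacity_units(item_bytes: int) -> int:
    """WCUs consumed by writing one item of the given size."""
    return max(1, math.ceil(item_bytes / 1024))    # writes are billed in 1KB blocks


if __name__ == "__main__":
    size = 6000  # bytes, an assumed product record size
    print(read_capacity_units(size, strongly_consistent=True))
    print(read_capacity_units(size))
    print(write_capacity_units(size))
```

For `Query` and `Scan`, DynamoDB sums the sizes of the items read before rounding, so many small items cost less than this per-item calculation would suggest.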

Indexing Strategies: Primary Keys and Secondary Indexes

The choice of Partition Key (PK) and Sort Key (SK) for your primary key is paramount. A good PK distributes data evenly across partitions to avoid hot spots. A good SK allows efficient querying within a partition.
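
As an illustration of a key design with these properties, the sketch below builds composite keys for a hypothetical orders table. The `PK`/`SK` attribute names, the `SHOP#`/`ORDER#` prefixes, and the table itself are assumptions for this example, not part of any Shopify or DynamoDB API.

```python
def order_key(shop_id: str, created_at_iso: str, order_id: str) -> dict:
    """Build a composite primary key for one order."""
    return {
        # High-cardinality partition key: one partition key value per shop.
        "PK": f"SHOP#{shop_id}",
        # ISO-8601 UTC timestamps sort lexicographically, so the sort key
        # keeps a shop's orders in chronological order.
        "SK": f"ORDER#{created_at_iso}#{order_id}",
    }


if __name__ == "__main__":
    key = order_key("42", "2024-01-15T10:00:00Z", "1001")
    print(key["PK"], key["SK"])
```

With this layout, a range condition on `SK` (for example a prefix of `ORDER#2024-01`) would select one shop's orders for a month without touching other partitions. A single very large shop could still become a hot partition, so the choice of partition key depends on how traffic is actually distributed.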

Global Secondary Indexes (GSIs)

GSIs are essential for querying data on attributes other than the primary key. When creating a GSI, consider:

Projected Attributes: Projecting only the attributes needed for your queries (KEYS_ONLY, INCLUDE, or ALL) significantly impacts performance and cost. ALL is the most expensive.
Throughput: GSIs have their own provisioned throughput. Ensure they are adequately provisioned, especially if they are frequently queried.
Key Structure: Design GSI PKs and SKs to support your common query patterns.
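
To show how projection, throughput, and key structure appear together, here is a sketch of a table definition with one GSI, written as the keyword arguments boto3's `create_table` accepts. The capacity numbers are placeholders, and the choice of `INCLUDE` with `product_name` and `price` is an assumption matching the `ProductsByVendor` query example in this section.

```python
def products_table_definition() -> dict:
    """Keyword arguments for create_table: a Products table with one GSI."""
    return {
        "TableName": "Products",
        "BillingMode": "PROVISIONED",
        # Only attributes used in a key schema are declared here.
        "AttributeDefinitions": [
            {"AttributeName": "product_id", "AttributeType": "S"},
            {"AttributeName": "vendor_name", "AttributeType": "S"},
            {"AttributeName": "created_at", "AttributeType": "S"},
        ],
        "KeySchema": [{"AttributeName": "product_id", "KeyType": "HASH"}],
        "ProvisionedThroughput": {"ReadCapacityUnits": 5, "WriteCapacityUnits": 5},
        "GlobalSecondaryIndexes": [
            {
                "IndexName": "ProductsByVendor",
                "KeySchema": [
                    {"AttributeName": "vendor_name", "KeyType": "HASH"},
                    {"AttributeName": "created_at", "KeyType": "RANGE"},
                ],
                # INCLUDE copies only the listed non-key attributes into the index.
                "Projection": {
                    "ProjectionType": "INCLUDE",
                    "NonKeyAttributes": ["product_name", "price"],
                },
                # The index is provisioned separately from the base table.
                "ProvisionedThroughput": {"ReadCapacityUnits": 5, "WriteCapacityUnits": 5},
            }
        ],
    }


# Usage (requires AWS credentials, so it is not executed here):
#   import boto3
#   boto3.client("dynamodb").create_table(**products_table_definition())
```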

Local Secondary Indexes (LSIs)

LSIs share the same partition key as the base table but have a different sort key. They are useful for querying different subsets of data within the same partition. However, they have limitations: they can only be defined when the table is created, they cannot be added or deleted afterwards, and they cap each item collection (all items sharing one partition key value) at 10GB. GSIs are generally more flexible.

Query Optimization Techniques

1. Use `Query` over `Scan` whenever possible:

Scan operations read every item in a table or index, which is inefficient and costly for large tables. Query operations use the partition key (and optionally the sort key) of the table or of a secondary index to read only the matching items.

Example: Fetching Shopify Products by Vendor (using a GSI)

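Assume a `Products` table with PK `product_id` and a GSI named `ProductsByVendor` with PK `vendor_name` and SK `created_at`. The GSI must project `product_name` and `price`, since a GSI query can only return attributes that are projected into the index.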

import boto3
from boto3.dynamodb.conditions import Key

dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table('Products')

vendor_name_to_query = 'Acme Corp'

response = table.query(
    IndexName='ProductsByVendor',
    KeyConditionExpression=Key('vendor_name').eq(vendor_name_to_query),
    ScanIndexForward=False, # Order by created_at descending
    ProjectionExpression="product_id, product_name, price" # Project only needed attributes
)

items = response['Items']
# Process items...

2. Efficiently Handle Large Datasets:

DynamoDB returns a maximum of 1MB of data per `Query` or `Scan` operation. Use the LastEvaluatedKey from the response to paginate and retrieve subsequent pages of results.

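def get_all_items(table, index_name=None, key_condition=None, projection=None):
    """Collect every page of a Query into one list.

    key_condition is required by Query in practice; it defaults to None
    only so that callers can pass all arguments by keyword.
    """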
    items = []
    last_evaluated_key = None
    while True:
        query_kwargs = {}
        if index_name:
            query_kwargs['IndexName'] = index_name
        if key_condition:
            query_kwargs['KeyConditionExpression'] = key_condition
        if projection:
            query_kwargs['ProjectionExpression'] = projection
        if last_evaluated_key:
            query_kwargs['ExclusiveStartKey'] = last_evaluated_key

        response = table.query(**query_kwargs)
        items.extend(response['Items'])

        last_evaluated_key = response.get('LastEvaluatedKey')
        if not last_evaluated_key:
            break
    return items

# Example usage:
# products_by_vendor = get_all_items(
#     table,
#     index_name='ProductsByVendor',
#     key_condition=Key('vendor_name').eq('Acme Corp'),
#     projection="product_id, product_name"
# )

3. Batch Operations:

Use BatchGetItem for retrieving multiple items from different tables or with different keys efficiently. Use BatchWriteItem for writing/deleting multiple items. Be mindful of the 25-item limit per BatchWriteItem request and the 100-item limit for BatchGetItem.

# Example BatchGetItem (a method of the service resource, not of a Table)
response = dynamodb.batch_get_item(
    RequestItems={
        'Products': {
            'Keys': [
                {'product_id': 'prod_123'},
                {'product_id': 'prod_456'}
            ],
            'ProjectionExpression': 'product_id, price'
        },
        'Orders': { # Assuming another table
            'Keys': [
                {'order_id': 'order_abc'},
                {'order_id': 'order_def'}
            ]
        }
    }
)
# Results are in response['Responses'], keyed by table name.
# Retry anything returned in response['UnprocessedKeys'].
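
The 25-item limit on BatchWriteItem means larger write sets must be split and any unprocessed items retried. The sketch below shows one way to do that. It takes the client as a parameter (for example a boto3 low-level `dynamodb` client), and the helper names, retry count, and backoff values are assumptions chosen for illustration.

```python
import time


def chunked(items, size=25):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def batch_write_all(client, table_name, write_requests, max_retries=5):
    """Send write requests in batches of 25, retrying unprocessed items.

    write_requests uses the low-level format, e.g.
    {"PutRequest": {"Item": {"product_id": {"S": "prod_123"}}}}.
    """
    for chunk in chunked(write_requests, 25):
        pending = {table_name: chunk}
        attempt = 0
        while pending:
            response = client.batch_write_item(RequestItems=pending)
            # Throttled or failed writes come back here in the same shape.
            pending = response.get("UnprocessedItems") or {}
            if pending:
                attempt += 1
                if attempt > max_retries:
                    raise RuntimeError("unprocessed items remain after retries")
                # Simple exponential backoff, capped at one second.
                time.sleep(min(0.05 * 2 ** attempt, 1.0))
```

If you work with the boto3 resource interface instead, `table.batch_writer()` handles batching and resending of unprocessed items for you, which is usually the simpler option.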

Monitoring and Alarming

Utilize CloudWatch metrics for DynamoDB, which publishes them automatically. Nginx, Gunicorn, and PHP-FPM metrics are not in CloudWatch by default, so collect them with an agent or exporter on your instances, or publish them as custom metrics. Key metrics to monitor include:

Nginx: `requests`, `connections` (from the `stub_status` module), `nginx_http_requests_total` (if using Prometheus exporter).
Gunicorn/PHP-FPM: Worker status, request latency, error rates.
DynamoDB: `ConsumedReadCapacityUnits`, `ConsumedWriteCapacityUnits`, `ThrottledRequests`, `SystemErrors`, `SuccessfulRequestLatency`.

Set up CloudWatch Alarms for critical DynamoDB thresholds, such as throttled requests, and equivalent alerts for high error rates on your application servers. This proactive monitoring is essential for maintaining a stable and performant Shopify infrastructure on Google Cloud.
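
A throttling alarm of the kind described above can be sketched as the keyword arguments for CloudWatch's `put_metric_alarm`. The helper name, alarm name format, period, and threshold are assumptions for illustration, and the SNS topic ARN is a value you would supply yourself.

```python
def throttle_alarm_kwargs(table_name: str, operation: str,
                          sns_topic_arn: str, threshold: int = 1) -> dict:
    """Keyword arguments for put_metric_alarm on DynamoDB throttling."""
    return {
        "AlarmName": f"dynamodb-{table_name}-{operation}-throttled",
        "Namespace": "AWS/DynamoDB",
        "MetricName": "ThrottledRequests",
        # ThrottledRequests is reported per table and per operation.
        "Dimensions": [
            {"Name": "TableName", "Value": table_name},
            {"Name": "Operation", "Value": operation},
        ],
        "Statistic": "Sum",
        "Period": 60,             # seconds per datapoint
        "EvaluationPeriods": 5,   # consecutive datapoints that must breach
        "Threshold": threshold,
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        # No datapoints means no throttling, so treat gaps as healthy.
        "TreatMissingData": "notBreaching",
        "AlarmActions": [sns_topic_arn],
    }


# Usage (requires AWS credentials, so it is not executed here):
#   import boto3
#   boto3.client("cloudwatch").put_metric_alarm(
#       **throttle_alarm_kwargs("Products", "Query", "<your SNS topic ARN>"))
```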