Scaling Shopify on OVH to Handle 50,000+ Concurrent Requests
Architectural Foundation: Decoupling and Caching Layers
Handling 50,000+ concurrent requests for a Shopify store, with the supporting infrastructure hosted on OVH, requires a multi-layered architecture. The core principle is aggressive decoupling and intelligent caching. Because Shopify is a SaaS platform, we can’t scale its application servers directly. Instead, we optimize the ingress path, externalize state, and offload as much work as possible from the Shopify platform itself.
Our strategy involves a dedicated reverse proxy/load balancer layer, a robust CDN, and a sophisticated caching mechanism for API calls and static assets. This forms the first line of defense, absorbing the bulk of the traffic and serving cached responses without ever hitting Shopify’s core infrastructure.
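A rough back-of-the-envelope sketch shows why the cache hit ratio dominates this design. The helper name `originConcurrency` and the hit ratios used below are illustrative assumptions, not measurements from a real store.

```javascript
// Hypothetical helper: estimate how many in-flight requests still travel past
// the caching layers to the origin for a given cache hit ratio.
function originConcurrency(concurrentRequests, cacheHitRatio) {
  if (cacheHitRatio < 0 || cacheHitRatio > 1) {
    throw new RangeError('cacheHitRatio must be between 0 and 1');
  }
  // Only cache misses reach the origin.
  return Math.round(concurrentRequests * (1 - cacheHitRatio));
}

console.log(originConcurrency(50000, 0.95)); // 2500
console.log(originConcurrency(50000, 0.5)); // 25000
```

Under these assumptions, moving from a 50% to a 95% hit ratio cuts the load that reaches the origin by a factor of ten, which is why the rest of this setup concentrates on the caching layers.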
OVH Infrastructure Setup: Load Balancer and CDN Configuration
For the OVH infrastructure, we’ll leverage a combination of their dedicated load balancing services and a robust Content Delivery Network (CDN). A common setup involves using HAProxy or Nginx as the edge load balancer, fronting multiple application instances (if any custom middleware is involved) or directly pointing to Shopify’s domain with advanced caching rules.
HAProxy Configuration for Shopify Traffic
While HAProxy is typically used for load balancing backend servers, we can adapt it for advanced caching and request routing for a SaaS platform like Shopify. The key is to use its powerful ACLs and caching directives.
Caching Static Assets and API Responses
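We’ll configure HAProxy to cache static assets (images, CSS, JS) and GET responses for frequently accessed, slowly changing data. HAProxy’s built-in cache stores only responses to GET requests, and only when the response carries explicit freshness information or a validator, so Storefront API GraphQL calls (which are sent as POST) have to be cached in the middleware layer described later. Admin API responses should be cached only when they are read-only and non-sensitive. All of this requires careful consideration of cache invalidation strategies.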
# /etc/haproxy/haproxy.cfg
global
    log /dev/log local0
    log /dev/log local1 notice
    daemon

defaults
    log global
    mode http
    option httplog
    option dontlognull
    timeout connect 5000ms
    timeout client 50000ms
    timeout server 50000ms
    errorfile 400 /etc/haproxy/errors/400.http
    errorfile 403 /etc/haproxy/errors/403.http
    errorfile 408 /etc/haproxy/errors/408.http
    errorfile 500 /etc/haproxy/errors/500.http
    errorfile 502 /etc/haproxy/errors/502.http
    errorfile 503 /etc/haproxy/errors/503.http
    errorfile 504 /etc/haproxy/errors/504.http
# Cache storage. HAProxy's cache lives in memory: total-max-size is in megabytes
# (maximum 4095) and max-object-size is in bytes (at most half of total-max-size).
# max-age (seconds) caps the TTL taken from the response's Cache-Control header,
# so static assets and API responses each get their own cache section.
cache static_cache
    total-max-size 4095
    max-object-size 10485760   # 10 MB
    max-age 3600               # at most 1 hour for static assets

cache api_cache
    total-max-size 512
    max-object-size 1048576    # 1 MB
    max-age 300                # at most 5 minutes for API responses (adjust as needed)

frontend shopify_frontend
    bind *:80
    bind *:443 ssl crt /etc/ssl/certs/your_domain.pem

    # ACLs for caching
    acl is_static_asset path_reg -i \.(css|js|jpg|jpeg|png|gif|ico|svg|woff|woff2|ttf|eot)$
    acl is_api_endpoint path_beg /api/2023-10/

    # Remember which cache a request belongs to so the response rules can use it
    http-request set-var(txn.cache_class) str(static) if is_static_asset
    http-request set-var(txn.cache_class) str(api) if is_api_endpoint

    # Serve from cache when possible. Only GET requests are eligible.
    # For the Admin API, be extremely cautious and only cache read-only, non-sensitive data.
    http-request cache-use static_cache if is_static_asset
    http-request cache-use api_cache if is_api_endpoint

    # Store cacheable responses on the way back
    http-response cache-store static_cache if { var(txn.cache_class) -m str static }
    http-response cache-store api_cache if { var(txn.cache_class) -m str api }

    # In this setup, HAProxy acts as a caching proxy, not a load balancer for multiple backends.
    # The 'backend shopify_origin' simply forwards uncached requests to Shopify.
    default_backend shopify_origin

backend shopify_origin
    mode http
    # Use your store's Shopify domain as the origin. Replace 'your-store.myshopify.com'
    # with your actual Shopify store URL and make sure HAProxy can resolve it via DNS.
    # The Host header and the TLS SNI value are crucial for Shopify to route the request correctly.
    http-request set-header Host your-store.myshopify.com
    http-request set-header X-Forwarded-Proto https
    http-request set-header X-Forwarded-Port 443
    # Keep certificate validation on ('verify required'); avoid 'verify none' outside of testing.
    server shopify_server your-store.myshopify.com:443 check check-sni your-store.myshopify.com ssl sni str(your-store.myshopify.com) verify required ca-file /etc/ssl/certs/ca-certificates.crt
    # Example of a custom error page
    # errorfile 503 /etc/haproxy/errors/503.http
Note on Cache Keys and Invalidation: For API endpoints, the cache key must be meticulously crafted. For the Storefront API, this typically involves hashing the GraphQL query and any relevant variables. For product pages, a cache key based on the product handle and potentially the selected variant is essential. Invalidation is the hardest part. Webhooks from Shopify can trigger cache purges for specific resources (e.g., product updates). For less critical data, short TTLs are a pragmatic approach. For static assets, standard HTTP cache headers (Cache-Control, Expires) should be respected, and the CDN will handle much of this.
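As a minimal sketch of webhook-driven invalidation: Shopify signs each webhook with a base64-encoded HMAC-SHA256 of the raw request body, sent in the `X-Shopify-Hmac-Sha256` header, so the purge endpoint should verify that signature before touching the cache. The key naming scheme in `cacheKeysToPurge` (`product:<handle>`, `product-id:<id>`, `collection:<handle>`) is a hypothetical convention for illustration and would need to match whatever keys your caches actually use.

```javascript
const crypto = require('crypto');

// Verify the webhook signature against the raw (unparsed) request body.
function verifyWebhookHmac(rawBody, hmacHeader, secret) {
  const expected = crypto.createHmac('sha256', secret).update(rawBody).digest();
  const received = Buffer.from(hmacHeader || '', 'base64');
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  return received.length === expected.length && crypto.timingSafeEqual(received, expected);
}

// Hypothetical key scheme: map a webhook topic and payload to cache keys to purge.
function cacheKeysToPurge(topic, payload) {
  const keys = [];
  if (topic.startsWith('products/')) {
    if (payload.handle) keys.push(`product:${payload.handle}`);
    if (payload.id) keys.push(`product-id:${payload.id}`);
  } else if (topic.startsWith('collections/')) {
    if (payload.handle) keys.push(`collection:${payload.handle}`);
  }
  return keys;
}

// Usage with a locally signed payload (the secret is a placeholder)
const body = JSON.stringify({ id: 42, handle: 'blue-shirt' });
const signature = crypto.createHmac('sha256', 'test-secret').update(body).digest('base64');
console.log(verifyWebhookHmac(body, signature, 'test-secret')); // true
console.log(cacheKeysToPurge('products/update', JSON.parse(body)));
```

The returned keys would then be deleted from Redis or passed to the CDN's purge API. Topics that map to no keys are simply ignored and left to expire through their TTL.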
CDN Integration: Cloudflare or OVH’s CDN
A robust CDN is non-negotiable. We’ll configure it to cache static assets aggressively and potentially cache certain API responses. The CDN acts as the first point of contact for most users, significantly reducing latency and offloading traffic from our OVH infrastructure.
Key CDN Configurations:
- Origin Shield: Configure the CDN to pull from our HAProxy instance (or directly from Shopify if HAProxy is only for specific API caching). This reduces the number of direct connections to the origin.
- Cache Rules: Set long TTLs for static assets (images, CSS, JS). For API responses, use shorter TTLs and leverage cache-control headers.
- Edge Caching: Ensure maximum cache hit ratio at the edge.
- Security Features: Utilize WAF, DDoS protection, and bot management.
If using Cloudflare, specific page rules or cache rules would be configured. For example:
# Cloudflare Page Rule Example (Conceptual)
# URL: *your-store.com/cdn/*
#   Settings: Cache Level: Cache Everything
#   Edge Cache TTL: 1 Year
# URL: *your-store.com/api/2023-10/*
#   Settings: Cache Level: Cache If Appears Static
#   Browser Cache TTL: 30 minutes
#   Edge Cache TTL: 5 minutes (adjust based on data volatility)
#   Forwarding URL: (if needed for specific routing)
If using OVH’s CDN service, similar configurations for origin, caching rules, and TTLs would be applied through their control panel.
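Whichever CDN is used, the TTL rules can also be expressed at the origin through `Cache-Control` headers, which most CDNs honor. The sketch below is one possible policy, not a recommendation from either vendor: the function name `cacheControlFor`, the path prefixes, and the specific TTL values are assumptions chosen for illustration. `max-age` applies to browsers, while `s-maxage` applies to shared caches such as the CDN edge.

```javascript
// Same static-asset extensions as the HAProxy ACL earlier in this article
const STATIC_ASSET = /\.(css|js|jpg|jpeg|png|gif|ico|svg|woff|woff2|ttf|eot)$/i;

// Hypothetical policy: choose a Cache-Control value from the request pathname.
function cacheControlFor(pathname) {
  if (STATIC_ASSET.test(pathname)) {
    // Long-lived and safe to reuse without revalidation (assumes versioned asset URLs)
    return 'public, max-age=31536000, immutable';
  }
  if (pathname.startsWith('/api/')) {
    // Short browser TTL, 5 minutes at the CDN edge
    return 'public, max-age=60, s-maxage=300';
  }
  // Everything else (cart, checkout, account pages) is never cached
  return 'no-store';
}

console.log(cacheControlFor('/cdn/shop/files/theme.css')); // public, max-age=31536000, immutable
console.log(cacheControlFor('/api/2023-10/products.json')); // public, max-age=60, s-maxage=300
console.log(cacheControlFor('/cart')); // no-store
```

Defaulting to `no-store` keeps personalized pages out of shared caches unless a path is explicitly opted in.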
Externalizing State and Logic: Middleware and Microservices
For highly dynamic functionalities or complex business logic that cannot be efficiently handled by Shopify’s platform or simple caching, we introduce a middleware layer. This layer sits between the CDN/load balancer and Shopify, acting as an intelligent API gateway or a set of microservices.
Building a Caching Proxy with Node.js/Express
A common pattern is to build a dedicated caching proxy using Node.js with Express, backed by a fast in-memory store such as Redis. The proxy intercepts API requests and checks the cache; on a miss, it forwards the request to Shopify’s API, caches the response, and returns it.
// Example: Node.js/Express middleware for caching Shopify Storefront API responses
const crypto = require('crypto');
const express = require('express');
const axios = require('axios');
const Redis = require('ioredis');

const app = express();

const redisClient = new Redis({
  host: process.env.REDIS_HOST || 'your-redis-host', // e.g., '10.0.0.1' or 'redis.internal'
  port: 6379,
  password: process.env.REDIS_PASSWORD,
  db: 0,
});

// The API version is part of the URL path
const SHOPIFY_API_VERSION = '2023-10';
const SHOPIFY_STOREFRONT_API_URL = `https://your-store.myshopify.com/api/${SHOPIFY_API_VERSION}/graphql.json`;
const SHOPIFY_STOREFRONT_ACCESS_TOKEN = process.env.SHOPIFY_STOREFRONT_ACCESS_TOKEN || 'your_storefront_access_token';
const CACHE_TTL_SECONDS = 300; // Cache for 5 minutes - adjust as needed

// Middleware to parse JSON bodies
app.use(express.json());

// Serialize a value with object keys sorted at every nesting level, so that
// logically equal variables always produce the same string
function stableStringify(value) {
  if (Array.isArray(value)) {
    return `[${value.map(stableStringify).join(',')}]`;
  }
  if (value && typeof value === 'object') {
    const entries = Object.keys(value).sort()
      .map((key) => `${JSON.stringify(key)}:${stableStringify(value[key])}`);
    return `{${entries.join(',')}}`;
  }
  return JSON.stringify(value);
}

// Helper function to generate a cache key by hashing the GraphQL query and its variables
function generateCacheKey(query, variables) {
  const hash = crypto.createHash('sha256')
    .update(query)
    .update('\n')
    .update(stableStringify(variables || {}))
    .digest('hex');
  return `shopify-api:${hash}`;
}

// Only read-only queries are cacheable; mutations (e.g., cart updates) must always
// reach Shopify. The check is deliberately conservative: any document that mentions
// 'mutation' is skipped.
function isCacheableQuery(query) {
  return typeof query === 'string' && !/\bmutation\b/i.test(query);
}

// Middleware to check the cache before hitting the Shopify API
app.use(async (req, res, next) => {
  if (req.method !== 'POST' || req.path !== '/graphql.json') {
    return next();
  }
  const { query, variables } = req.body || {};
  if (typeof query !== 'string') {
    return res.status(400).json({ error: 'Missing GraphQL query' });
  }
  if (isCacheableQuery(query)) {
    // Compute the key once and share it with the route handler
    req.cacheKey = generateCacheKey(query, variables);
    try {
      const cachedResponse = await redisClient.get(req.cacheKey);
      if (cachedResponse) {
        console.log(`Cache HIT for key: ${req.cacheKey}`);
        return res.status(200).json(JSON.parse(cachedResponse));
      }
    } catch (error) {
      console.error('Redis GET error:', error);
      // Continue to the API if Redis fails
    }
  }
  next(); // Proceed to the next middleware or route handler
});

// Route to proxy Shopify Storefront API requests
app.post('/graphql.json', async (req, res) => {
  const { query, variables } = req.body;
  try {
    const response = await axios.post(SHOPIFY_STOREFRONT_API_URL, { query, variables }, {
      headers: {
        'Content-Type': 'application/json',
        'X-Shopify-Storefront-Access-Token': SHOPIFY_STOREFRONT_ACCESS_TOKEN,
      },
    });
    // Cache successful, error-free responses to cacheable queries
    if (req.cacheKey && response.data && !response.data.errors) {
      try {
        await redisClient.set(req.cacheKey, JSON.stringify(response.data), 'EX', CACHE_TTL_SECONDS);
        console.log(`Cache SET for key: ${req.cacheKey} with TTL: ${CACHE_TTL_SECONDS}`);
      } catch (error) {
        console.error('Redis SET error:', error);
      }
    }
    res.status(response.status).json(response.data);
  } catch (error) {
    console.error('Shopify API request error:', error.response ? error.response.data : error.message);
    if (error.response) {
      res.status(error.response.status).json(error.response.data);
    } else {
      res.status(500).json({ error: 'Internal Server Error' });
    }
  }
});

// Basic health check endpoint
app.get('/health', (req, res) => {
  res.status(200).send('OK');
});

const PORT = process.env.PORT || 3000;
app.listen(PORT, () => {
  console.log(`Shopify API Caching Proxy running on port ${PORT}`);
});
This Node.js application would be deployed on OVH servers (e.g., using their Public Cloud instances or Kubernetes service) and exposed via the HAProxy load balancer. The HAProxy configuration would then point to this Node.js application for API requests instead of directly to Shopify.
Leveraging Redis for Session Management and Rate Limiting
Redis is invaluable for managing shared state across potentially multiple instances of our middleware or custom applications. This includes:
- Session Storage: If any custom user sessions are managed outside of Shopify’s cookies, Redis provides a fast, centralized store.
- Rate Limiting: Implement sophisticated rate limiting per user, IP, or API key to protect both our middleware and Shopify’s API from abuse.
- Distributed Locks: For critical operations that require atomicity across multiple instances (e.g., inventory updates before checkout), distributed locks in Redis can be used.
-- Example Redis rate-limiting script (Lua, executed atomically by Redis)
-- Increments a counter and sets an expiry if it's the first increment
-- KEYS[1]: The key for the counter (e.g., 'rate_limit:user_id:endpoint')
-- ARGV[1]: The maximum number of requests allowed
-- ARGV[2]: The time-to-live (TTL) in seconds for the counter
local key = KEYS[1]
local limit = tonumber(ARGV[1])
local ttl = tonumber(ARGV[2])
local current = redis.call('INCR', key)
if current == 1 then
  redis.call('EXPIRE', key, ttl)
end
if current > limit then
  return 0 -- Exceeded limit
else
  return 1 -- Within limit
end
This Lua script can be executed atomically on the Redis server using `EVAL` in various programming languages.
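As a sketch of how the middleware might call that script from Node.js: the helper below assumes a client object that exposes `eval(script, numKeys, ...keysAndArgs)`, which is how ioredis exposes `EVAL`. The names `isRequestAllowed` and `rateLimit`, the key format, and the fail-open behavior are illustrative assumptions rather than part of any library.

```javascript
// Compact form of the rate-limiting script, passed to EVAL as a string
const RATE_LIMIT_SCRIPT = `
local current = redis.call('INCR', KEYS[1])
if current == 1 then
  redis.call('EXPIRE', KEYS[1], tonumber(ARGV[2]))
end
if current > tonumber(ARGV[1]) then
  return 0
end
return 1
`;

// Returns true while the caller is within its limit for the current window.
// `client` is assumed to behave like an ioredis instance.
async function isRequestAllowed(client, key, limit, windowSeconds) {
  const result = await client.eval(RATE_LIMIT_SCRIPT, 1, key, limit, windowSeconds);
  return result === 1;
}

// Hypothetical Express-style middleware factory built on the helper above
function rateLimit(client, { limit, windowSeconds }) {
  return async (req, res, next) => {
    try {
      const key = `rate_limit:${req.ip}:${req.path}`;
      if (!(await isRequestAllowed(client, key, limit, windowSeconds))) {
        return res.status(429).json({ error: 'Too Many Requests' });
      }
    } catch (error) {
      // Fail open: a Redis outage should not take the storefront down
      console.error('Rate limiter error:', error);
    }
    next();
  };
}
```

It could be mounted with something like `app.use(rateLimit(redisClient, { limit: 100, windowSeconds: 60 }))`, where the limit and window are placeholder values to tune against observed traffic.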
Database Optimization and Offloading
While Shopify handles its own database, any custom applications or middleware interacting with Shopify’s APIs will likely have their own data stores. For high-throughput scenarios, optimizing these databases is critical.
Choosing the Right Database for Custom Logic
For read-heavy workloads with complex querying needs, a PostgreSQL or MySQL database on OVH’s managed database services can be highly performant. For extreme write throughput or caching needs, consider solutions like ScyllaDB or a managed Redis cluster.
Query Optimization and Indexing
Standard database best practices apply: ensure all frequently queried columns are indexed. Use `EXPLAIN` (or `EXPLAIN ANALYZE`) to identify slow queries and optimize them. For example, if a custom application frequently fetches product details by SKU:
-- Example PostgreSQL table and indexing
CREATE TABLE custom_products (
    id SERIAL PRIMARY KEY,
    sku VARCHAR(50) UNIQUE NOT NULL,
    name VARCHAR(255) NOT NULL,
    description TEXT,
    price DECIMAL(10, 2),
    created_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP,
    updated_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP
);
-- The UNIQUE constraint on sku already creates a B-tree index,
-- so no separate CREATE INDEX is needed for fast SKU lookups.
-- Query for a specific product by SKU
SELECT id, name, price FROM custom_products WHERE sku = 'YOUR_SKU_HERE';
In PostgreSQL, the `UNIQUE` constraint on the `sku` column is backed by a B-tree index, so equality lookups in the `WHERE` clause are already efficient; adding a second index on the same column would only slow down writes. For very large datasets, consider partitioning or sharding strategies if applicable to your custom data model.
Monitoring, Alerting, and Performance Tuning
A high-traffic environment demands continuous monitoring and proactive alerting. We need visibility into every layer of our stack.
Key Metrics to Monitor
- Load Balancer: Request rate, error rates (5xx, 4xx), backend connection status, response times.
- CDN: Cache hit ratio, bandwidth, request volume, origin fetch times.
- Middleware/API Proxy: Request rate, error rates, response times (internal and external API calls), CPU/memory usage, Redis latency.
- Databases: Query latency, connection count, CPU/memory usage, disk I/O, replication lag.
- Shopify API: Shopify’s own performance metrics (if available via their dashboard or status pages), and our own observed latency for API calls.
Alerting Strategy
Set up alerts for critical thresholds:
- High error rates (e.g., > 1% 5xx errors on the load balancer).
- Low CDN cache hit ratio (e.g., < 70% for static assets).
- High response times (e.g., p95 latency > 500ms for API calls).
- Resource exhaustion (CPU, memory, disk space) on middleware servers.
- Redis latency spikes.
- Shopify API rate limit warnings or errors.
Tools like Prometheus with Grafana for metrics visualization and alerting, or integrated solutions from OVH’s monitoring suite, are essential. For application-level tracing, tools like Jaeger or OpenTelemetry can provide deep insights into request flows across distributed services.
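To make the thresholds above concrete, here is a minimal sketch of how they could be evaluated over one monitoring window. In practice a tool such as Prometheus would compute these from its own metrics; the function names and the shape of the `window` object are hypothetical, and the percentile uses the simple nearest-rank method.

```javascript
// Nearest-rank percentile over a list of latency samples (milliseconds)
function percentile(samples, p) {
  if (samples.length === 0) return 0;
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[Math.max(rank, 1) - 1];
}

// Hypothetical window shape:
// { latenciesMs: number[], total: number, errors5xx: number, cacheHits: number, cacheLookups: number }
function evaluateAlerts(window) {
  const alerts = [];
  const errorRate = window.total > 0 ? window.errors5xx / window.total : 0;
  if (errorRate > 0.01) alerts.push('high_5xx_error_rate'); // > 1% 5xx errors
  if (percentile(window.latenciesMs, 95) > 500) alerts.push('high_p95_latency'); // p95 > 500ms
  const hitRatio = window.cacheLookups > 0 ? window.cacheHits / window.cacheLookups : 1;
  if (hitRatio < 0.7) alerts.push('low_cache_hit_ratio'); // < 70% hit ratio
  return alerts;
}

console.log(evaluateAlerts({
  latenciesMs: [100, 200, 600],
  total: 1000,
  errors5xx: 20,
  cacheHits: 900,
  cacheLookups: 1000,
}));
// [ 'high_5xx_error_rate', 'high_p95_latency' ]
```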
Conclusion: Iterative Scaling and Performance Testing
Scaling to 50,000+ concurrent requests is not a one-time setup but an ongoing process. Regularly perform load testing (using tools like k6, JMeter, or Locust) against your entire stack to identify bottlenecks before they impact users. Monitor Shopify’s API rate limits and adjust caching strategies or middleware logic accordingly. The combination of intelligent caching at the edge, a performant middleware layer, and robust monitoring provides the foundation for handling massive traffic volumes on platforms like OVH.