• Skip to secondary menu
  • Skip to main content
  • Skip to primary sidebar
  • Home
  • Projects
  • Products
  • Themes
  • Tools
  • Request for Quote

Vengala Vinay

Having 9+ Years of Experience in Software Development

  • Home
  • WordPress
  • PHP
    • Codeigniter
  • Django
  • Magento
  • Selenium
  • Server
Home » High-Throughput Caching Strategies: Scaling MySQL for Ruby Application APIs

High-Throughput Caching Strategies: Scaling MySQL for Ruby Application APIs

Benchmarking Baseline MySQL Performance

Before implementing any caching, a thorough understanding of the baseline performance is critical. This involves simulating realistic read loads against your MySQL database to identify bottlenecks. We’ll use sysbench for this purpose, focusing on its oltp-read-only workload.

Ensure you have sysbench installed on a separate machine from your database server to avoid resource contention. The test database should mirror your production schema and data volume as closely as possible. We’ll configure sysbench to use 16 threads, simulating a moderate concurrent read load.

Configuring Sysbench for Read-Only Load Testing

First, prepare the test database. This involves creating a schema and populating it with data. For a typical API scenario, a table with indexed columns that are frequently queried is representative.

Schema Preparation

Let’s assume a simple `products` table, common in e-commerce APIs.

SQL Schema Definition

CREATE TABLE products (
    id BIGINT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
    name VARCHAR(255) NOT NULL,
    description TEXT,
    price DECIMAL(10, 2) NOT NULL,
    category_id INT UNSIGNED,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
    KEY idx_category_id (category_id)
);

Sysbench OLTP-Read-Only Preparation

The following command prepares sysbench with 1 million rows in the `products` table. Adjust the `–tables` and `–table-size` parameters based on your production data scale.

Sysbench Preparation Command

sysbench oltp_read_only --db-driver=mysql \
--mysql-host=your_mysql_host \
--mysql-port=3306 \
--mysql-user=sysbench_user \
--mysql-password=sysbench_password \
--mysql-db=api_test_db \
--tables=1 \
--table-size=1000000 \
--threads=16 \
prepare

After preparation, run the read-only test for 60 seconds. Monitor the output for transactions per second (TPS) and query latency.

Sysbench Read-Only Test Command

sysbench oltp_read_only --db-driver=mysql \
--mysql-host=your_mysql_host \
--mysql-port=3306 \
--mysql-user=sysbench_user \
--mysql-password=sysbench_password \
--mysql-db=api_test_db \
--tables=1 \
--table-size=1000000 \
--threads=16 \
--time=60 \
run

Note down the average TPS and the 95th percentile latency. This is your baseline. For instance, if you observe low TPS and high latency, it indicates a database bottleneck under read load.

Implementing In-Memory Caching with Redis

Redis is an excellent choice for caching frequently accessed, relatively static data. For API endpoints that retrieve product details or lists, caching these responses can dramatically reduce database load.

Redis Setup and Configuration

A basic Redis setup is sufficient for caching. Ensure it’s accessible from your application servers. For production, consider Redis Sentinel for high availability or Redis Cluster for sharding.

Ruby Application Integration (Rails Example)

We’ll use the redis-rb gem to interact with Redis. The strategy involves checking Redis for cached data before querying MySQL. If data is found, return it; otherwise, fetch from MySQL, cache it in Redis, and then return it.

Gemfile Addition

# Gemfile
gem 'redis'

Redis Client Initialization

# config/initializers/redis.rb
$redis = Redis.new(host: 'your_redis_host', port: 6379, db: 0)

Caching Product Index Endpoint

# app/controllers/products_controller.rb
class ProductsController < ApplicationController
  def index
    cache_key = "products:all"
    cached_products = $redis.get(cache_key)

    if cached_products
      render json: JSON.parse(cached_products)
    else
      @products = Product.all # Your ActiveRecord query
      # Cache for 1 hour (3600 seconds)
      $redis.setex(cache_key, 3600, @products.to_json)
      render json: @products
    end
  end

  def show
    product_id = params[:id]
    cache_key = "products:#{product_id}"
    cached_product = $redis.get(cache_key)

    if cached_product
      render json: JSON.parse(cached_product)
    else
      @product = Product.find(product_id) # Your ActiveRecord query
      # Cache for 1 hour (3600 seconds)
      $redis.setex(cache_key, 3600, @product.to_json)
      render json: @product
    end
  end
end

Benchmarking with Redis Caching

Rerun the sysbench test against your MySQL instance. You should observe a significant increase in TPS and a decrease in latency, especially for queries that are now served by Redis. The key is that the sysbench test still hits MySQL, but in a real application, Redis would intercept these reads.

To accurately measure the impact of Redis, you’d ideally instrument your application to track cache hit/miss rates and measure end-to-end API response times. However, observing reduced MySQL CPU/IO utilization during peak loads is a strong indicator of caching effectiveness.

Advanced Caching: Query Result Caching with Memcached

While Redis excels at object caching, Memcached is often preferred for raw query result caching due to its simplicity and performance characteristics for this specific use case. It’s a pure in-memory key-value store optimized for speed.

Memcached Setup

Similar to Redis, Memcached requires a running instance accessible by your application servers. For high availability, consider running multiple Memcached instances and using a consistent hashing algorithm in your application to distribute keys.

Ruby Application Integration (Rails Example)

We’ll use the dalli gem for Memcached integration.

Gemfile Addition

# Gemfile
gem 'dalli'

Memcached Client Initialization

# config/initializers/memcached.rb
MEMCACHED_SERVERS = ['your_memcached_host:11211']
$memcached = Dalli::Client.new(MEMCACHED_SERVERS, {
  namespace: 'api_cache',
  compress: true # Enable compression for larger values
})

Caching Specific SQL Queries

# app/models/product.rb
class Product < ApplicationRecord
  # ... other model code

  def self.find_by_category_cached(category_id)
    cache_key = "products:category:#{category_id}"
    cached_data = $memcached.get(cache_key)

    if cached_data
      # Deserialize if necessary, assuming JSON for simplicity
      return JSON.parse(cached_data)
    else
      # Execute the actual query
      products = Product.where(category_id: category_id).to_a
      # Cache for 15 minutes (900 seconds)
      $memcached.set(cache_key, products.to_json, 900)
      return products
    end
  end

  def self.find_expensive_products_cached(limit: 10)
    cache_key = "products:expensive:#{limit}"
    cached_data = $memcached.get(cache_key)

    if cached_data
      return JSON.parse(cached_data)
    else
      # Example of a more complex query
      products = Product.order(price: :desc).limit(limit).to_a
      $memcached.set(cache_key, products.to_json, 900)
      return products
    end
  end
end

# In your controller:
# class ProductsController < ApplicationController
#   def by_category
#     category_id = params[:category_id]
#     @products = Product.find_by_category_cached(category_id)
#     render json: @products
#   end
# end

Cache Invalidation Strategies

The biggest challenge with caching is invalidation. When data in MySQL changes, the corresponding cache entries must be updated or removed. For the examples above:

  • Product Update/Delete: After updating or deleting a product in MySQL, explicitly delete the relevant cache keys from Memcached (e.g., products:ID, products:category:CATEGORY_ID).
  • Bulk Updates: For operations that affect multiple cached items (e.g., updating all products in a category), consider a “flush” strategy for that specific cache key prefix or a slightly longer cache TTL (Time To Live) to tolerate stale data for a short period.

Example: Cache Invalidation on Save

# app/models/product.rb
class Product < ApplicationRecord
  after_save :clear_product_caches
  after_destroy :clear_product_caches

  private

  def clear_product_caches
    # Invalidate specific product cache
    $memcached.delete("products:#{self.id}")
    # Invalidate category cache if category changed or product was deleted/updated
    $memcached.delete("products:category:#{self.category_id}")
    # Potentially invalidate other related caches (e.g., expensive products if price changed)
    # This can become complex and might require a more robust invalidation system.
  end
end

Database-Level Caching: Query Cache (Deprecated) and Buffer Pool

While application-level caching is generally more flexible and scalable, understanding MySQL’s internal caching mechanisms is crucial for holistic performance tuning.

MySQL Query Cache (Deprecated in 5.7, Removed in 8.0)

Historically, MySQL had a built-in query cache that stored the text of a SELECT statement together with the result set. It was effective for read-heavy workloads with identical queries but suffered from significant overhead due to invalidation on any table modification. Its performance was often a net negative in busy systems. If you are on an older version (pre-5.7), you might encounter it. It’s generally recommended to disable it.

Disabling Query Cache (if applicable)

[mysqld]
query_cache_type = 0
query_cache_size = 0

Restart MySQL after making these changes.

InnoDB Buffer Pool

The InnoDB Buffer Pool is MySQL’s most critical cache. It caches data and indexes for InnoDB tables in memory. A well-sized buffer pool is paramount for performance. The goal is to have as much of your active dataset as possible reside in the buffer pool.

Tuning `innodb_buffer_pool_size`

A common recommendation is to set innodb_buffer_pool_size to 70-80% of your available RAM on a dedicated database server. Monitor buffer pool hit rate using `SHOW ENGINE INNODB STATUS;`.

Monitoring Buffer Pool Hit Rate

SHOW ENGINE INNODB STATUS\G

Look for the BUFFER POOL AND MEMORY section. The Buffer pool hit rate should ideally be above 99%. If it’s lower, consider increasing innodb_buffer_pool_size (if you have RAM available) or optimizing queries to access data more efficiently.

Choosing the Right Caching Strategy

The optimal caching strategy depends on your application’s access patterns and data volatility:

  • Object Caching (Redis): Ideal for caching entire objects or collections of objects that are frequently read and infrequently modified. Think user profiles, product details, configuration settings.
  • Query Result Caching (Memcached): Suitable for caching the results of specific, often complex, SQL queries. Useful when direct object mapping is less straightforward or when you need to cache raw tabular data.
  • Database Buffer Pool (InnoDB): Essential for all InnoDB tables. Maximize its size to keep frequently accessed data and indexes in memory.

A multi-layered approach, combining application-level caching (Redis/Memcached) with a well-tuned InnoDB buffer pool, provides the most robust solution for scaling MySQL-backed APIs under high read loads.

Primary Sidebar

A little about the Author

Having 9+ Years of Experience in Software Development.
Expertised in Php Development, WordPress Custom Theme Development (From scratch using underscores or Genesis Framework or using any blank theme or Premium Theme), Custom Plugin Development. Hands on Experience on 3rd Party Php Extension like Chilkat, nSoftware.

Recent Posts

  • Step-by-Step: Diagnosing thread pools deadlock during concurrent ActiveRecord transaction processing on Linode Servers
  • Securing Your E-commerce APIs: Preventing SQL Injection (SQLi) in customized checkout queries in WooCommerce Implementations
  • Disaster Recovery 101: Architecting Auto-Failovers for MySQL and Ruby Deployments on Linode
  • High-Throughput Caching Strategies: Scaling MySQL for Perl Application APIs
  • Disaster Recovery 101: Architecting Auto-Failovers for DynamoDB and Laravel Deployments on DigitalOcean

Copyright © 2026 · Vinay Vengala