How We Audited a High-Traffic Ruby Enterprise Stack on OVH and Mitigated Unsafe YAML Loading That Allowed Remote Code Execution
Deep Dive: Auditing a High-Traffic Ruby Enterprise Stack on OVH
This post details a critical security audit performed on a high-traffic Ruby on Rails enterprise application hosted on OVH. The primary objective was to identify and mitigate vulnerabilities, with a specific focus on unsafe deserialization patterns that could lead to Remote Code Execution (RCE).
Initial Reconnaissance and Stack Identification
The target environment was a complex Ruby on Rails monolith with several microservices, running on a mix of dedicated servers and OVH’s public cloud instances. Key components identified included:
- Ruby on Rails (versions 4.x and 5.x)
- PostgreSQL (v9.x)
- Redis (v3.x)
- Nginx (as a reverse proxy)
- Sidekiq (for background job processing)
- Custom-built Ruby gems
- Various third-party gems with known vulnerabilities
The OVH infrastructure utilized a combination of bare-metal servers for core services and virtual machines for supporting components. Network segmentation was present but not strictly enforced at the application layer, posing a risk of lateral movement.
Focus Area: Unsafe YAML Deserialization
A common attack vector in Ruby applications is the unsafe loading of YAML data. When given untrusted input, `YAML.load` (on Psych versions before 4.0, where it is not safe by default) instantiates whatever Ruby objects the document names, and an attacker can chain those objects into arbitrary code execution. This is particularly dangerous when data is loaded from user-controlled sources, such as API requests, cookies, or files uploaded by users.
Our audit specifically looked for instances where `YAML.load` was being used without proper sanitization or validation of the input source. This often occurs in older codebases or when developers are not fully aware of the security implications.
Identifying Vulnerable Code Patterns
We employed a combination of static analysis tools and manual code review to pinpoint potential vulnerabilities. Tools like Brakeman were invaluable for an initial scan, but manual inspection was crucial for understanding context and identifying subtle flaws.
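As a rough complement to a Brakeman scan, the first pass of a manual review can be sketched as a small script that lists candidate call sites. The helper name `find_unsafe_yaml_calls` and the regular expression are assumptions made for this illustration, not part of the audit tooling, and a match is only a prompt for human review (on Psych 4 and later, plain `YAML.load` is already restricted).

```ruby
# Hypothetical audit helper: list lines that call YAML/Psych loaders able to
# build arbitrary objects. `safe_load` is deliberately not matched.
UNSAFE_YAML_CALL = /\b(?:YAML|Psych)\.(?:unsafe_)?load(?:_file|_stream)?\b/

def find_unsafe_yaml_calls(source)
  hits = []
  source.each_line.with_index(1) do |line, number|
    stripped = line.strip
    next if stripped.start_with?("#") # skip full-line comments
    hits << [number, stripped] if stripped.match?(UNSAFE_YAML_CALL)
  end
  hits
end

SAMPLE_SOURCE = <<~RUBY
  data = YAML.load(request.body.read)
  safe = YAML.safe_load(request.body.read)
  # YAML.load(commented_out)
  conf = Psych.load_file(path)
RUBY

# Print "line number: code" for each candidate call site in the sample.
find_unsafe_yaml_calls(SAMPLE_SOURCE).each do |number, code|
  puts "#{number}: #{code}"
end
```

A line-based match like this misses calls made through aliases or metaprogramming, which is one reason manual inspection remained necessary.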
A typical vulnerable pattern looks like this:
Example of Vulnerable Code
Consider a controller action that accepts a YAML payload:
# app/controllers/api/v1/data_controller.rb
class Api::V1::DataController < ApplicationController
  def create
    # WARNING: Unsafe YAML loading from request body
    data = YAML.load(request.body.read)
    process_data(data)
    render json: { status: "success" }
  end

  private

  def process_data(data)
    # ... business logic ...
  end
end
An attacker could craft a malicious YAML payload to achieve RCE. For instance, they could send a request with a `Content-Type: application/yaml` header and a body like this:
Malicious YAML Payload Example
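--- !ruby/object:Process
# Schematic payload: intended to run `ls -la /` on the server
args:
  - -c
  - "ls -la / > /tmp/rce_output.txt"
set_id: true
uid: 0
gid: 0
This payload is schematic rather than a working exploit: `Process` is a module, and `YAML.load` does not run a command simply because a document names it. What `YAML.load` does is instantiate the classes named in `!ruby/object` tags and populate their instance variables with attacker-chosen values. Working exploits chain such objects (a "gadget chain") so that code already present in Ruby, Rails, or a loaded gem runs the attacker's command during or shortly after deserialization. The shape of the request, a tagged object in an attacker-controlled body, is the same.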
Mitigation Strategy: Safe YAML Loading
The recommended and safest approach is to use `YAML.safe_load`, provided by Psych (Ruby's default YAML parser) and shipped with Ruby since the 2.1 series. `YAML.safe_load` restricts deserialization to a small set of plain types (hashes, arrays, strings, numbers, booleans, and nil) unless other classes are explicitly permitted, which prevents a document from instantiating arbitrary classes.
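The difference is easy to observe in isolation. The following is a minimal sketch; the helper name `parse_untrusted_yaml` and the sample documents are made up for illustration, and the tagged document is harmless (it only names `Object`).

```ruby
require "yaml"

# Hypothetical helper: parse untrusted YAML and report rejections
# instead of letting the exception propagate.
def parse_untrusted_yaml(text)
  [:ok, YAML.safe_load(text)]
rescue Psych::DisallowedClass, Psych::BadAlias, Psych::SyntaxError => e
  [:rejected, e.class]
end

PLAIN_DOC   = "user_id: 42\npayload: hello\n"
TAGGED_DOC  = "--- !ruby/object:Object\nattr: 1\n" # asks for object instantiation
ALIASED_DOC = "base: &b {a: 1}\ncopy: *b\n"        # uses a YAML alias

p parse_untrusted_yaml(PLAIN_DOC)
p parse_untrusted_yaml(TAGGED_DOC)
p parse_untrusted_yaml(ALIASED_DOC)
```

Plain data parses as usual, while the tagged document is refused before any object is built. Aliases are also refused by default, which guards against alias-expansion payloads.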
Implementing Safe Loading
The vulnerable code snippet above should be refactored to use `YAML.safe_load`:
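# app/controllers/api/v1/data_controller.rb
class Api::V1::DataController < ApplicationController
  def create
    # Safely load YAML data; symbolize_names gives consistent symbol keys
    data = YAML.safe_load(request.body.read, symbolize_names: true)
    process_data(data)
    render json: { status: "success" }
  rescue Psych::SyntaxError, Psych::DisallowedClass, Psych::BadAlias => e
    # Malformed YAML, or YAML requesting a class or alias that is not permitted
    render json: { error: "Invalid YAML format: #{e.message}" }, status: :bad_request
  rescue ArgumentError => e
    # Well-formed YAML whose structure is not what the endpoint expects
    render json: { error: e.message }, status: :unprocessable_entity
  rescue StandardError => e
    # Log the details server-side; do not echo internal messages to the client
    Rails.logger.error("DataController#create failed: #{e.class}: #{e.message}")
    render json: { error: "An unexpected error occurred" }, status: :internal_server_error
  end

  private

  def process_data(data)
    # Ensure data is of expected types after safe loading, before any business logic
    unless data.is_a?(Hash) && data.key?(:user_id) && data.key?(:payload)
      raise ArgumentError, "Invalid data structure received"
    end
    # ... business logic ...
  end
end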
Key improvements:
- Replaced `YAML.load` with `YAML.safe_load`.
- Added `symbolize_names: true` for consistent key handling (e.g., `:user_id` instead of `"user_id"`).
- Included error handling for `Psych::SyntaxError` to gracefully manage malformed YAML.
- Added a basic type check within `process_data` to ensure the deserialized data conforms to expected structures, further hardening against unexpected inputs.
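Some payloads legitimately need a type beyond the defaults, such as a date. Rather than falling back to `YAML.load`, `safe_load` accepts an explicit allowlist through `permitted_classes` (Ruby 2.6 and later). The sketch below assumes a hypothetical order document; the constant and method names are illustrative only.

```ruby
require "yaml"
require "date"

# Hypothetical document containing an unquoted date scalar.
ORDER_DOC = "order_id: 7\nshipped_on: 2024-01-15\n"

# With no allowlist, the date scalar is refused.
def rejected_by_default?(text)
  YAML.safe_load(text)
  false
rescue Psych::DisallowedClass
  true
end

# Permit exactly the extra class this endpoint needs, and nothing else.
def load_order(text)
  YAML.safe_load(text, permitted_classes: [Date])
end

puts "rejected without allowlist: #{rejected_by_default?(ORDER_DOC)}"
puts "shipped_on class with allowlist: #{load_order(ORDER_DOC)['shipped_on'].class}"
```

Keeping the allowlist as short as possible preserves most of the protection that `safe_load` provides.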
Beyond YAML: Broader Security Audit Findings
While the YAML vulnerability was critical, the audit uncovered other areas requiring attention:
1. Outdated Dependencies
A significant number of gems were outdated, some with known CVEs. We used `bundle outdated` and then cross-referenced with security advisories.
# Check for outdated gems
bundle outdated

# Example of a gem with a known vulnerability (hypothetical)
# gem 'nokogiri', '1.8.0' # Vulnerable to CVE-2020-XXXX

# Mitigation: Update to the latest secure version
# Gemfile
# gem 'nokogiri', '>= 1.10.9' # Or a specific secure version
A systematic update process, including thorough testing, was implemented. For critical applications, pinning to specific secure versions is often preferred over using broad version ranges.
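Cross-referencing installed versions against minimum secure versions can be sketched with `Gem::Version`, which compares version segments numerically rather than as strings. The `MINIMUM_SAFE` table below reuses the hypothetical nokogiri figure from the example above and is not real advisory data; `below_minimum` is an illustrative helper name.

```ruby
require "rubygems"

# Hypothetical minimum secure versions; real values come from security advisories.
MINIMUM_SAFE = { "nokogiri" => "1.10.9" }.freeze

# Return the names of installed gems that fall below their minimum secure version.
def below_minimum(installed, minimums = MINIMUM_SAFE)
  installed.select do |name, version|
    minimum = minimums[name]
    minimum && Gem::Version.new(version) < Gem::Version.new(minimum)
  end.keys
end

p below_minimum("nokogiri" => "1.8.0", "rails" => "5.2.0")
p below_minimum("nokogiri" => "1.10.10")
```

A plain string comparison would rank "1.10.10" below "1.10.9", which is why the sketch goes through `Gem::Version`.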
2. Insecure Direct Object References (IDOR)
Several API endpoints allowed access to resources without proper authorization checks. For example, one endpoint fetched user data by the user ID in the URL parameter without verifying that the currently logged-in user had permission to view that specific user’s data.
# app/controllers/api/v1/users_controller.rb (Vulnerable)
class Api::V1::UsersController < ApplicationController
  def show
    user = User.find(params[:id]) # No authorization check
    render json: user
  end
end

# app/controllers/api/v1/users_controller.rb (Mitigated)
class Api::V1::UsersController < ApplicationController
  before_action :authenticate_user!
  before_action :set_user, only: [:show, :edit, :update]
  after_action :verify_authorized, only: [:show, :edit, :update]

  def show
    render json: @user
  end

  private

  def set_user
    @user = User.find(params[:id])
    authorize @user # Using Pundit or similar gem for authorization
  end
end
Implementing a robust authorization framework (like Pundit or CanCanCan) and ensuring all sensitive actions are protected by `before_action` filters is crucial.
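The rule behind an `authorize @user` call lives in a policy object. The following plain-Ruby sketch follows Pundit's policy convention (a class named after the model, initialized with the current user and the record, exposing `show?`), but it runs without Rails or Pundit. The `User` struct is a stand-in for the real model, and the "admins or the user themselves" rule is an assumption for illustration, not the policy adopted in the audit.

```ruby
# Stand-in for the application's User model (assumption for this sketch).
User = Struct.new(:id, :admin)

# Pundit-style policy: answers whether current_user may act on record.
class UserPolicy
  attr_reader :current_user, :record

  def initialize(current_user, record)
    @current_user = current_user
    @record = record
  end

  # Assumed rule: admins may view anyone, other users only themselves.
  def show?
    current_user.admin || current_user.id == record.id
  end
end

alice = User.new(1, false)
bob   = User.new(2, false)
root  = User.new(3, true)

puts UserPolicy.new(alice, alice).show? # own record
puts UserPolicy.new(alice, bob).show?   # someone else's record
puts UserPolicy.new(root, bob).show?    # admin viewing another user
```

Keeping the rule in one policy class means each controller action only has to ask the question, which makes a missing check easier to spot in review.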
3. Insufficient Input Validation
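Beyond YAML, other forms of input were not sufficiently validated, leaving room for injection attacks (SQL, command). We enforced strong validation using Rails’ built-in Active Record validations and custom validators. Validation constrains what gets stored but does not by itself stop injection, which additionally requires parameterized queries and keeping user input out of interpolated shell commands.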
# app/models/post.rb
class Post < ApplicationRecord
  validates :title, presence: true, length: { maximum: 255 }
  validates :body, presence: true
  validates :author_email, format: { with: URI::MailTo::EMAIL_REGEXP } # Basic email format validation
end

# app/controllers/posts_controller.rb
class PostsController < ApplicationController
  def create
    @post = Post.new(post_params)
    if @post.save
      redirect_to @post, notice: 'Post was successfully created.'
    else
      render :new
    end
  end

  private

  def post_params
    params.require(:post).permit(:title, :body, :author_email) # Strong parameters
  end
end
Strong parameters are essential to prevent mass assignment vulnerabilities and ensure only permitted attributes are processed.
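Command injection, the other class named at the start of this section, is not addressed by model validations or strong parameters. A minimal sketch of the usual defence is to pass untrusted text to a child process as a separate argument so that no shell ever parses it. The helper name `echo_argument` is hypothetical, and the child process here is simply another Ruby interpreter that prints its first argument back.

```ruby
require "open3"
require "rbconfig"

# Hypothetical helper: hand untrusted text to a child process as one argv entry.
def echo_argument(untrusted)
  # With several separate arguments, Ruby executes the program directly and
  # no shell is involved, so characters like ";" or "|" are passed literally.
  # By contrast, interpolating the text into a single command string would
  # let a shell interpret them.
  output, status = Open3.capture2(RbConfig.ruby, "-e", "print ARGV[0]", untrusted)
  raise "child process failed" unless status.success?
  output
end

puts echo_argument("report.txt; echo INJECTED")
```

The child receives the whole string, semicolon included, as a single argument instead of as two commands.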
OVH Specific Considerations
The OVH environment presented specific challenges and opportunities:
1. Network Security Groups and Firewalls
Ensuring that OVH’s network security groups and firewall rules were correctly configured to restrict access to only necessary ports and IP ranges was a priority. For instance, PostgreSQL should only be accessible from application servers, not the public internet.
# Example: OVH Public Cloud Security Group Rule (Conceptual)
# Allow traffic on port 5432 (PostgreSQL) only from specific internal IP ranges
ACTION: ALLOW
PROTOCOL: TCP
DIRECTION: INBOUND
PORT: 5432
SOURCE_IP: 10.0.0.0/16 # Internal application subnet
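To make concrete which sources a rule like this admits, the CIDR match can be reproduced with Ruby's standard `ipaddr` library. This is only an illustration of the matching logic using the conceptual subnet above; `allowed_source?` is a hypothetical helper and not an OVH API.

```ruby
require "ipaddr"

# The internal application subnet from the conceptual rule.
APP_SUBNET = IPAddr.new("10.0.0.0/16")

# True when the address falls inside the permitted subnet.
def allowed_source?(address)
  APP_SUBNET.include?(IPAddr.new(address))
rescue IPAddr::InvalidAddressError
  false # unparseable addresses are never allowed
end

puts allowed_source?("10.0.12.7")
puts allowed_source?("10.1.0.1")
puts allowed_source?("203.0.113.5")
```

A /16 fixes the first two octets, so `10.0.x.x` matches while `10.1.0.1` and any public address do not.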
2. Server Hardening
Dedicated servers require manual hardening. This included disabling unnecessary services, configuring `sshd` securely, and ensuring regular security patching.
# Example: Secure SSH configuration
# /etc/ssh/sshd_config
PermitRootLogin no
PasswordAuthentication no
PubkeyAuthentication yes
AllowUsers your_user admin_user
MaxAuthTries 3
LoginGraceTime 30s
ClientAliveInterval 300
ClientAliveCountMax 2
UsePAM yes
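Because hardening is manual on dedicated servers, it helps to check the settings mechanically across hosts. The sketch below compares the text of an `sshd_config` against a few of the values above. The helper name, the chosen subset of keywords, and the decision to treat a missing keyword as a violation are assumptions, and the parser ignores `Match` blocks and `Include` directives.

```ruby
# Subset of the hardening settings to enforce (assumption for this sketch).
EXPECTED_SSHD = {
  "PermitRootLogin"        => "no",
  "PasswordAuthentication" => "no",
  "PubkeyAuthentication"   => "yes"
}.freeze

# Return the keywords whose effective value differs from the expected one.
def sshd_violations(config_text, expected = EXPECTED_SSHD)
  actual = {}
  config_text.each_line do |line|
    keyword, value = line.sub(/#.*/, "").strip.split(/\s+/, 2)
    next if keyword.nil? || value.nil?
    # sshd keywords are case-insensitive and the first occurrence wins.
    actual[keyword.downcase] ||= value.strip
  end
  expected.reject { |keyword, value| actual[keyword.downcase] == value }.keys
end

SAMPLE_SSHD = <<~CONF
  PermitRootLogin yes
  PasswordAuthentication no
  # PubkeyAuthentication yes
CONF

p sshd_violations(SAMPLE_SSHD)
```

Run against each host's configuration, an empty result indicates that the checked keywords match the baseline.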
3. Logging and Monitoring
Centralized logging (e.g., using ELK stack or OVH’s logging services) and robust monitoring were implemented. This is crucial for detecting suspicious activity, especially after mitigating RCE vulnerabilities.
# Example: Nginx access log format to capture relevant details
# /etc/nginx/nginx.conf
log_format main '$remote_addr - $remote_user [$time_local] "$request" '
                '$status $body_bytes_sent "$http_referer" '
                '"$http_user_agent" "$http_x_forwarded_for" '
                '"$request_body"'; # Including request body for deeper analysis (use with caution for PII)
access_log /var/log/nginx/access.log main;
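Care must be taken when logging request bodies due to potential PII and performance implications. Note also that nginx only populates `$request_body` when it has read the body into a memory buffer, as it does for proxied requests. Often, logging only specific headers or a sanitized version of the body is more practical.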
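A sanitized version of the body can be produced at the application layer before anything is written to a log. The sketch below assumes JSON bodies; the list of sensitive keys and the helper names `redact` and `loggable_body` are hypothetical and would need to match the application's actual fields.

```ruby
require "json"

# Hypothetical list of field names that must never reach the logs.
SENSITIVE_KEYS = %w[password token email].freeze

# Recursively replace the values of sensitive keys.
def redact(value)
  case value
  when Hash
    value.each_with_object({}) do |(key, inner), out|
      out[key] = SENSITIVE_KEYS.include?(key.to_s.downcase) ? "[REDACTED]" : redact(inner)
    end
  when Array
    value.map { |inner| redact(inner) }
  else
    value
  end
end

# Return a log-safe rendering of a JSON request body.
def loggable_body(raw_json)
  JSON.generate(redact(JSON.parse(raw_json)))
rescue JSON::ParserError
  "[unparseable body omitted]"
end

puts loggable_body('{"user":{"email":"a@example.com","name":"Ann"},"password":"hunter2"}')
```

Bodies that fail to parse are dropped from the log entirely rather than written raw, since an unparseable body cannot be checked for sensitive fields.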
Conclusion and Ongoing Security Posture
The audit successfully identified and mitigated a critical RCE vulnerability stemming from unsafe YAML loading, along with other significant security weaknesses. The mitigation involved replacing `YAML.load` with `YAML.safe_load`, updating dependencies, enforcing authorization, and strengthening input validation.
Maintaining a strong security posture requires continuous effort. This includes regular audits, automated vulnerability scanning, dependency management, and ongoing security training for development teams. The OVH infrastructure, while powerful, demands diligent configuration and maintenance to ensure security at all layers.