Tuning RHEL 9 Network Interface Card (NIC) Rings and Transmit Queues (txqueuelen) for Ultra-Low Latency APIs

Understanding NIC Ring Buffers and Transmit Queues

For enterprise applications demanding ultra-low latency, particularly those serving high-frequency trading platforms, real-time analytics, or critical API endpoints, the performance of the network interface card (NIC) is paramount. Two often-overlooked tuning parameters that directly affect throughput and latency are the NIC’s receive (RX) and transmit (TX) ring buffer sizes and the interface’s transmit queue length (txqueuelen). Ring sizes set how many packets the NIC can hold in its descriptor rings while received packets wait for the kernel or outgoing packets wait for the hardware to send them. The txqueuelen value bounds how many outgoing packets the kernel queues ahead of the TX ring. Undersized rings can lead to packet drops (RX drops) during bursts, while oversized queues add latency because packets wait longer for a transmission slot.

Assessing Current NIC Configuration

Before tuning, it’s crucial to understand the current state of your network interfaces. We’ll use the ethtool utility, a standard Linux tool for configuring and displaying information about network drivers and hardware settings.

Displaying NIC Information

To view driver information, including the driver name, driver version, firmware version, and bus address, use:

sudo ethtool -i eth0

To inspect the current ring buffer settings (RX and TX descriptors):

sudo ethtool -g eth0

The output has two sections:

Pre-set maximums: The largest number of RX and TX descriptors that the hardware and driver support. Values passed to ethtool -G cannot exceed these.
Current hardware settings: The number of RX and TX descriptors currently in use. Each descriptor is essentially a pointer to a memory buffer where a packet is stored.
RX Mini and RX Jumbo: Separate rings that some NICs provide for small and jumbo frames. Many drivers report n/a for these.

Interrupt coalescing values such as rx-usecs and rx-frames are not part of this output. They are shown by ethtool -c, which is covered later.
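To compare the current ring sizes against the hardware maximums at a glance, the ethtool -g output can be condensed with a small filter. The following is a minimal sketch, not a definitive tool: the function name ring_summary is made up for illustration, and it assumes the usual "Pre-set maximums" and "Current hardware settings" layout, which may differ slightly between ethtool versions.

```shell
# ring_summary: read `ethtool -g` output on stdin and print one line per ring
# in the form "<ring> current=<n> max=<n>". Hypothetical helper for illustration.
ring_summary() {
    awk '
        /^Pre-set maximums/          { section = "max" }
        /^Current hardware settings/ { section = "cur" }
        # Match only the plain "RX:" and "TX:" rows, not "RX Mini:" or "RX Jumbo:"
        $1 == "RX:" || $1 == "TX:" {
            ring = substr($1, 1, 2)
            value[ring, section] = $2
        }
        END {
            printf "RX current=%s max=%s\n", value["RX", "cur"], value["RX", "max"]
            printf "TX current=%s max=%s\n", value["TX", "cur"], value["TX", "max"]
        }
    '
}

# Usage on a real system (eth0 is the placeholder interface used in this article):
#   ethtool -g eth0 | ring_summary
```

If the current value already equals the maximum, there is no headroom left to gain from ethtool -G on that ring.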

To view the transmit queue length (txqueuelen) for an interface:

ip link show eth0

Look for the qlen value in the output (for example, qlen 1000). This is the interface’s transmit queue length: the maximum number of packets that a queueing discipline honoring this value will hold for the interface before it starts dropping packets.

Tuning Ring Buffers with `ethtool`

Increasing the number of RX and TX descriptors can significantly reduce packet drops under heavy load, as it provides more buffer space for incoming and outgoing packets. The optimal values depend heavily on the NIC hardware, driver, and workload. A common starting point for high-performance scenarios is to raise the values toward the pre-set maximums reported by ethtool -g and then measure the effect on drops and latency.

Setting Ring Buffer Sizes

Use the -G option with ethtool to set the ring parameters. For example, to set RX and TX descriptors to 4096 each:

sudo ethtool -G eth0 rx 4096 tx 4096

Important Considerations:

Memory Consumption: Each descriptor consumes memory. Very large values can lead to excessive RAM usage. Monitor system memory after changes.
Driver Support: Not all drivers support arbitrary values; some round the request and others reject it. Consult your NIC vendor’s documentation.
Hardware Limitations: Each ring has a maximum number of descriptors, shown under Pre-set maximums in the ethtool -g output. Requests above that limit are rejected.
Workload Specificity: For very high packet rates (e.g., millions of packets per second), you might need larger values. For latency-sensitive applications with moderate throughput, excessively large rings can add delay, both because packets can sit in a deeper queue and because a larger buffer working set is less cache-friendly.
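The memory cost mentioned above can be estimated with simple arithmetic: descriptors per ring, times the buffer size behind each descriptor, times the number of queues. The sketch below is only a rough model. The function name ring_mem_bytes is hypothetical, and the 2048-byte buffer size and queue count of 8 in the example are assumptions; the real figures depend on the driver, the MTU, and how many queues the NIC exposes.

```shell
# ring_mem_bytes DESCRIPTORS BUFFER_BYTES QUEUES
# Rough estimate of the memory backing the receive rings of one interface.
# All three inputs are assumptions you supply; nothing is read from the NIC.
ring_mem_bytes() {
    echo $(( $1 * $2 * $3 ))
}

# Example: 4096 descriptors, 2048-byte buffers (assumed), 8 RX queues (assumed)
#   ring_mem_bytes 4096 2048 8    # prints 67108864 (64 MiB)
```

Even with generous assumptions the total is usually modest on a server, but it scales linearly with the queue count, so NICs with many queues are where it is worth checking.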

Making Ring Buffer Changes Persistent

Changes made with ethtool are not persistent across reboots. RHEL 9 manages networking with NetworkManager by default, and a connection profile can store ring sizes through its ethtool.ring-rx and ethtool.ring-tx properties. If you prefer to keep the setting outside the connection profile, two other options are a systemd .link file or a custom systemd service that runs ethtool at boot.

Using a systemd `.link` File

.link files are read by systemd-udevd when the interface is detected, so they take effect even when systemd-networkd itself is not running. Ring sizes are set with the RxBufferSize and TxBufferSize options in the [Link] section. MTUBytes and WakeOnLan, if needed, are set in the same section.

Create a file named /etc/systemd/network/10-eth0.link. Only the first matching .link file, in lexical order, is applied to a device, so the file name must sort before the default 99-default.link. Note that .link files match on OriginalName (or a more stable property such as MACAddress or Path), not on Name:

[Match]
OriginalName=eth0

[Link]
# Set RX/TX descriptors to 4096
RxBufferSize=4096
TxBufferSize=4096
# Example for MTU if needed:
# MTUBytes=1500

Because this file takes the place of 99-default.link for the matched device, copy over any policies from the default file that you rely on.

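An alternative that works for any ethtool setting is a systemd service that runs the ethtool commands once the network is up.

Create a service file, e.g., /etc/systemd/system/ethtool-eth0.service:

[Unit]
Description=Set ethtool options for eth0
Wants=network-online.target
After=network-online.target

[Service]
Type=oneshot
# Keep the unit active after ExecStart exits, so ExecStop runs only on stop
RemainAfterExit=yes
ExecStart=/usr/sbin/ethtool -G eth0 rx 4096 tx 4096
# Optional: revert on stop (1024 is an example; use your NIC's default)
ExecStop=/usr/sbin/ethtool -G eth0 rx 1024 tx 1024

[Install]
WantedBy=multi-user.target

Two details matter here. Without RemainAfterExit=yes, a oneshot unit is considered stopped as soon as ExecStart finishes, so ExecStop would immediately revert the ring sizes. systemd also does not support trailing comments on a directive line, so comments must go on their own lines.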

Then reload systemd so it picks up the new unit, and enable and start the service:

sudo systemctl daemon-reload
sudo systemctl enable --now ethtool-eth0.service

Verify the settings with ethtool -g eth0 after starting the service.

Tuning Transmit Queue Length (`txqueuelen`)

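The txqueuelen parameter sets the interface’s transmit queue length, which queueing disciplines such as pfifo_fast use as the maximum number of packets they will hold. A small txqueuelen can lead to packet drops if the application sends data faster than the network can transmit it, even if the NIC’s TX ring buffers are not full. Conversely, a very large txqueuelen can increase latency by holding packets in the kernel queue longer than necessary. Queueing disciplines that carry their own limit, such as fq_codel, do not rely on txqueuelen, so check which one is attached with tc qdisc show dev eth0 before expecting this setting to have an effect.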

Setting `txqueuelen`

You can adjust txqueuelen using the ip command:

sudo ip link set dev eth0 txqueuelen 1000

The default value is often 1000. For ultra-low latency, you might consider a value that balances throughput and latency. For some high-frequency trading scenarios, a smaller value (e.g., 250-500) might be preferred to minimize queuing delays, assuming the NIC ring buffers and CPU can keep up. For general high-throughput, increasing it slightly might be beneficial.
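The latency side of this trade-off can be put into numbers: a full queue of N packets takes roughly N × packet size × 8 / link rate to drain. The sketch below computes that worst case. The function name queue_drain_us is hypothetical, and the model is deliberately simplified: it assumes every packet has the same size and ignores Ethernet framing overhead and anything already sitting in the TX ring.

```shell
# queue_drain_us PACKETS PACKET_BYTES LINK_MBPS
# Worst-case time, in microseconds, to drain a full transmit queue.
# 1 Mbit/s equals 1 bit per microsecond, so bits / Mbit/s gives microseconds.
queue_drain_us() {
    echo $(( $1 * $2 * 8 / $3 ))
}

# Examples on a 10 Gbit/s link with 1500-byte packets:
#   queue_drain_us 1000 1500 10000   # prints 1200 (1.2 ms behind a full default queue)
#   queue_drain_us 250 1500 10000    # prints 300
```

For an API with a latency budget measured in tens of microseconds, this is the reasoning behind preferring a shorter queue, provided the sender is not routinely filling it.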

Making `txqueuelen` Changes Persistent

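Similar to ring buffers, txqueuelen changes made with ip link are not persistent. A systemd .link file can set the value when the device is set up.

Using a systemd `.link` File

Add the TransmitQueueLength option to the [Link] section of the interface’s .link file (e.g., /etc/systemd/network/10-eth0.link). This is a .link option handled by systemd-udevd; the [Link] section of a .network file does not accept it. Only the first matching .link file is applied to a device, so keep all link-level settings for the interface in one file, and give it a name that sorts before 99-default.link:

[Match]
OriginalName=eth0

[Link]
# Set txqueuelen to 1000
TransmitQueueLength=1000

.link files are applied when the device is detected, so a reboot is the most reliable way to apply the change. Reloading udev and re-triggering the device may also apply it without a reboot:

sudo udevadm control --reload
sudo udevadm trigger --action=add /sys/class/net/eth0

Verify the change with ip link show eth0.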

Advanced Considerations: Interrupt Coalescing and CPU Affinity

While ring buffers and txqueuelen are critical, achieving ultra-low latency often requires a holistic approach. Two other key areas are interrupt coalescing and CPU affinity.

Interrupt Coalescing

Interrupt coalescing is a mechanism where the NIC delays sending an interrupt to the CPU until a certain number of packets have arrived or a specific time interval has passed. This reduces the CPU overhead per packet but can increase latency. For low-latency applications, you often want to disable or significantly reduce interrupt coalescing.

Tuning Interrupt Coalescing

Use ethtool -c (lowercase) to view interrupt coalescing settings and ethtool -C (uppercase) to modify them:

sudo ethtool -c eth0

To disable coalescing (set to 0):

sudo ethtool -C eth0 adaptive-rx off adaptive-tx off rx-usecs 0 tx-usecs 0

Note: The exact parameters and their availability depend on the NIC driver. adaptive-rx/tx and rx/tx-usecs are common. Setting rx-usecs and tx-usecs to 0 effectively disables coalescing. Persistence for these settings also requires a systemd service as described earlier.

CPU Affinity

To minimize cache misses and context switching overhead, it’s highly beneficial to bind network interrupt handlers and application threads to specific CPU cores. This is known as CPU affinity.

Setting CPU Affinity

This is typically managed through kernel boot parameters (e.g., isolcpus) and application-specific configurations or systemd service files. For network interrupts, you can often map IRQs to specific CPUs. First, find the IRQ for your NIC:

grep eth0 /proc/interrupts

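Then, you can manually assign the IRQ to a CPU core (e.g., core 4) by writing a hexadecimal CPU bitmask to /proc/irq/<IRQ_NUMBER>/smp_affinity, or a plain CPU list to /proc/irq/<IRQ_NUMBER>/smp_affinity_list. For persistence, configure irqbalance so that it does not overwrite the assignment, or apply it from a systemd service.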
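The value written to smp_affinity is a hexadecimal bitmask in which bit N stands for CPU N, which is easy to get wrong by hand. The sketch below converts a single CPU number into that mask. The function name cpu_to_mask is hypothetical, and it only handles CPUs 0 to 31; on larger systems the mask is written as comma-separated 32-bit groups, and smp_affinity_list is usually the simpler interface.

```shell
# cpu_to_mask CPU
# Print the hexadecimal smp_affinity mask that selects exactly one CPU (0-31).
cpu_to_mask() {
    if [ "$1" -lt 0 ] || [ "$1" -gt 31 ]; then
        echo "cpu_to_mask: only CPUs 0-31 are handled by this sketch" >&2
        return 1
    fi
    printf '%x\n' $(( 1 << $1 ))
}

# Examples:
#   cpu_to_mask 0    # prints 1
#   cpu_to_mask 4    # prints 10
# Applying it (replace <IRQ_NUMBER> with the value found in /proc/interrupts):
#   cpu_to_mask 4 | sudo tee /proc/irq/<IRQ_NUMBER>/smp_affinity
```

After writing the mask, reading the same file back confirms whether the kernel accepted it; some IRQs cannot be moved.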

For application threads, use the taskset command or libraries within your application framework to pin threads to specific cores. For example, to run a process on CPU core 4:

taskset -c 4 your_application_command

Monitoring and Validation

After applying these tuning parameters, continuous monitoring is essential. Key metrics to watch include:

Packet Drops: Use ip -s link show eth0 to identify interface-level drops. Protocol-level counters are available from nstat, or from netstat -s if the net-tools package is installed.
Latency: Employ tools like ping (for basic RTT), sockperf, or application-level latency measurements.
Throughput: Use iperf3 or application-specific metrics.
CPU Usage: Monitor CPU load, especially on cores handling network traffic, using top, htop, or sar.
NIC Statistics: ethtool -S eth0 provides detailed hardware-level statistics, including various types of errors and drops.
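The ethtool -S output can run to hundreds of counters, so it helps to filter it down to the ones that indicate loss. The following is a minimal sketch, assuming the usual "name: value" layout. The function name nonzero_drops is hypothetical, and the keywords it matches (drop, discard, miss) are a guess at common naming; counter names are driver-specific, so adjust the pattern to your NIC.

```shell
# nonzero_drops: read `ethtool -S` output on stdin and print, as name=value,
# every counter whose name contains drop, discard, or miss and whose value is
# greater than zero. Hypothetical helper; counter names vary by driver.
nonzero_drops() {
    awk -F: '
        tolower($1) ~ /drop|discard|miss/ && ($2 + 0) > 0 {
            name = $1
            count = $2
            gsub(/^[ \t]+|[ \t]+$/, "", name)   # trim padding around the name
            gsub(/[ \t]/, "", count)            # strip whitespace from the value
            print name "=" count
        }
    '
}

# Usage on a real system:
#   ethtool -S eth0 | nonzero_drops
```

Running it before and after a load test, and comparing the two outputs, shows which counters actually moved under the tuned settings.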

Iteratively adjust the ring buffer sizes, txqueuelen, and interrupt coalescing settings based on observed performance and application requirements. The goal is to find the sweet spot that minimizes packet loss and latency without introducing excessive CPU overhead or memory consumption.
