Vengala Vinay

Having 12+ Years of Experience in Software Development

Step-by-Step: Diagnosing thread exhaustion and asyncio event loop delays under heavy IO loads on DigitalOcean Servers

Identifying Thread Exhaustion with `top` and `htop`

Under heavy I/O loads, especially on DigitalOcean droplets, applications can suffer from thread exhaustion. This manifests as slow response times, increased latency, and potentially application unresponsiveness. The first step in diagnosing this is to get a real-time view of system processes and their resource utilization. Tools like `top` and `htop` are invaluable here.

Start by SSHing into your DigitalOcean server and running `top`. Pay close attention to the `%CPU` and `%MEM` columns for your application’s processes. By default `top` lists processes rather than threads; run `top -H` (or press `H`) to show individual threads. In the summary area, the `wa` figure on the `%Cpu(s)` line is the share of CPU time spent waiting on I/O, so a persistently high value points to an I/O bottleneck. Then check the `S` (state) column: a significant number of tasks in the `D` state (uninterruptible sleep, usually waiting on I/O) is a strong indicator that threads are blocking on I/O.

For a more user-friendly and detailed view, `htop` is often preferred. If it’s not installed, you can typically install it with `sudo apt update && sudo apt install htop` (for Debian/Ubuntu based systems) or `sudo yum install htop` (for CentOS/RHEL based systems, where the package comes from the EPEL repository, which may need to be enabled first).

Once `htop` is running, you can sort by different columns. Press `F6` and select `TIME+` to see processes that have consumed the most CPU time. Crucially, press `F5` to toggle the tree view. This shows parent-child relationships between processes, which helps if your application spawns many worker threads or processes. If threads are not listed, press `H` to toggle the display of userland threads. Look for your application’s main process and its associated threads. If a single process has a large number of threads and many of them are in the `D` state (shown in the `S` column), that is a strong signal that the threads are tied up waiting on I/O.
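The same thread-state check can be scripted. The following is a minimal sketch, assuming a Linux host where `/proc/<pid>/task/<tid>/stat` is readable; the function names `parse_state` and `count_thread_states` are illustrative and not part of any tool mentioned above. It tallies a process’s threads by state so that a growing `D` count can be logged over time.

```python
import os
from collections import Counter


def parse_state(stat_line: str) -> str:
    """Return the one-letter state from a /proc/<pid>/task/<tid>/stat line."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or ')', so split on the LAST ')' rather than on whitespace.
    after_comm = stat_line[stat_line.rindex(")") + 1:]
    return after_comm.split()[0]


def count_thread_states(pid: int, proc_root: str = "/proc") -> Counter:
    """Count the threads of `pid` by scheduler state (R, S, D, ...)."""
    states = Counter()
    task_dir = os.path.join(proc_root, str(pid), "task")
    for tid in os.listdir(task_dir):
        try:
            with open(os.path.join(task_dir, tid, "stat")) as fh:
                states[parse_state(fh.read())] += 1
        except (FileNotFoundError, ProcessLookupError):
            continue  # the thread exited while we were scanning
    return states
```

Calling `count_thread_states(pid)` with your application’s PID returns a `Counter` keyed by state letter; sampling it every few seconds gives a rough trend without keeping `htop` open.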

Analyzing `asyncio` Event Loop Delays with `uvloop` and `aiomonitor`

For Python applications built on `asyncio`, thread exhaustion usually isn’t the primary concern; event loop blocking is. When a coroutine runs a long synchronous operation or makes a blocking (non-async) I/O call, the loop cannot switch to any other task, and the entire application stalls until that call returns. Heavy I/O load makes this worse, because each blocking call takes longer to complete.
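One way to see this effect is to measure how late a periodic timer fires. The sketch below is an illustration rather than a production monitor: `measure_loop_lag` and `blocking_handler` are made-up names, and `time.sleep(0.3)` stands in for whatever synchronous call is blocking your loop.

```python
import asyncio
import time


async def measure_loop_lag(interval: float, samples: int) -> list[float]:
    """Sleep `interval` seconds repeatedly and record how late each wake-up is."""
    lags = []
    for _ in range(samples):
        start = time.perf_counter()
        await asyncio.sleep(interval)
        lags.append(max(0.0, time.perf_counter() - start - interval))
    return lags


async def blocking_handler() -> None:
    # Stand-in for a synchronous call made by mistake inside a coroutine
    # (for example a blocking HTTP client or a large file read).
    time.sleep(0.3)


async def main() -> float:
    probe = asyncio.create_task(measure_loop_lag(interval=0.05, samples=10))
    await asyncio.sleep(0.1)      # let the probe take a few clean samples
    await blocking_handler()      # the loop is frozen while this runs
    return max(await probe)


if __name__ == "__main__":
    print(f"worst loop lag: {asyncio.run(main()):.3f}s")
```

On a healthy loop the recorded lag stays close to zero; the sample taken during the blocking call overshoots by roughly the time the loop was frozen.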

If you’re not already using `uvloop`, consider it. It’s a drop-in replacement for the default `asyncio` event loop, implemented in Cython, and it generally offers significant performance improvements, especially under high concurrency. It does not make blocking calls non-blocking, so it complements the diagnosis described here rather than replacing it. Installation only needs pip:

pip install uvloop

To enable it in your application, you typically do:

import asyncio
import uvloop

asyncio.set_event_loop_policy(uvloop.EventLoopPolicy())

# ... your asyncio application code ...
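If `uvloop` might not be installed everywhere the code runs (it is not available on Windows, for example), a guarded import keeps the application working on the default loop. This is a sketch under that assumption; `use_fastest_available_loop` is a hypothetical helper name, and because recent Python releases are moving away from the policy API, check uvloop’s documentation for the form recommended for your version.

```python
import asyncio


def use_fastest_available_loop() -> str:
    """Switch to uvloop when it is importable; otherwise keep asyncio's default."""
    try:
        import uvloop
    except ImportError:
        return "asyncio"
    # Same call as in the snippet above, just guarded by the import check.
    asyncio.set_event_loop_policy(uvloop.EventLoopPolicy())
    return "uvloop"


async def main() -> str:
    # Report which implementation is actually running the coroutine.
    return type(asyncio.get_running_loop()).__module__


if __name__ == "__main__":
    chosen = use_fastest_available_loop()
    print(chosen, asyncio.run(main()))
```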

To diagnose event loop blocking, `aiomonitor` is an excellent tool. It provides a way to inspect the `asyncio` event loop remotely or locally. First, install it:

pip install aiomonitor

Then, integrate it into your application. You don’t need to create a thread yourself: `aiomonitor.start_monitor()` takes the event loop to inspect and serves its console separately from your coroutines, listening on a specific port. A common pattern is to use it as a context manager around your main coroutine.

import asyncio
import uvloop
import aiomonitor

asyncio.set_event_loop_policy(uvloop.EventLoopPolicy())

async def slow_io_task():
    print("Starting slow I/O task...")
    # Simulate slow but non-blocking I/O: awaiting yields control to the loop.
    # A blocking call here (e.g. time.sleep(10)) would stall every other task.
    await asyncio.sleep(10)
    print("Slow I/O task finished.")

async def main():
    loop = asyncio.get_running_loop()
    # start_monitor() needs the loop to inspect; used as a context manager,
    # it shuts the monitor down when the block exits.
    with aiomonitor.start_monitor(loop=loop, port=5000):
        print("aiomonitor started on port 5000")
        await asyncio.gather(
            slow_io_task(),
            # Add other tasks here
        )

if __name__ == "__main__":
    asyncio.run(main())

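With the application running, connect to the monitor port from another terminal using `telnet` or `nc`. The example binds port 5000 explicitly; aiomonitor’s built-in default port is different and has changed between releases, so check the documentation for your version if you omit the argument: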

telnet localhost 5000

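Once connected, you’ll see the monitor’s command prompt. `help` lists the available commands; `ps` shows the tasks on the loop, and `where <task_id>` prints a task’s stack, which is usually the quickest way to find a coroutine stuck in a synchronous call. The `console` command switches to a Python REPL, where you can run checks such as `loop.get_debug()`. When asyncio’s debug mode is enabled, the standard event loop logs a warning for any callback that runs longer than `loop.slow_callback_duration` (100 ms by default). If the event loop is blocked, tasks that should be making progress stay parked at the same line. You can also run a small coroutine from the console to see whether the loop still responds.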
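The slow-callback warnings mentioned above come from asyncio’s debug mode, which can be switched on without any extra tooling. Here is a minimal sketch using the standard event loop; `handler_with_blocking_call` is an invented name, and `time.sleep(0.25)` stands in for real synchronous work.

```python
import asyncio
import logging
import time


async def handler_with_blocking_call() -> None:
    # Stand-in for synchronous work done inside a coroutine.
    time.sleep(0.25)


async def main() -> None:
    loop = asyncio.get_running_loop()
    # Report any single step that holds the loop for 100 ms or longer
    # (this is also asyncio's default threshold).
    loop.slow_callback_duration = 0.1
    await handler_with_blocking_call()


if __name__ == "__main__":
    logging.basicConfig(level=logging.WARNING)
    # debug=True turns on asyncio's debug mode for this run.
    asyncio.run(main(), debug=True)
```

The warning is emitted on the `asyncio` logger and names the task that held the loop, so it can be routed into whatever log aggregation the application already uses. Setting the `PYTHONASYNCIODEBUG=1` environment variable enables the same mode without a code change.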

Profiling I/O Operations with `strace` and `iotop`

When `top` or `htop` indicate processes are stuck in uninterruptible sleep (`D` state), or when `aiomonitor` reveals a sluggish event loop, the next step is to pinpoint the exact I/O operations causing the bottleneck. `strace` and `iotop` are essential for this.

`strace` lets you trace system calls and signals. To use it, you need the process ID (PID) of your application, which you can find with `pgrep <process_name>` or by looking at `top`/`htop`. Tracing slows the target noticeably, so keep sessions short on production servers. Attach `strace` to the running process:

sudo strace -p <PID> -s 1024 -f -tt -T

Explanation of flags:

  • -p <PID>: Attach to the specified process ID.
  • -s 1024: Set the maximum string size to display (useful for seeing file paths or network addresses).
  • -f: Trace child processes and threads. Crucial for multi-threaded applications.
  • -tt: Prefix each line with a wall-clock timestamp at microsecond resolution.
  • -T: Append the time spent inside each system call, which shows slow calls directly.

Observe the output. Look for system calls that take unusually long to return, especially those related to file I/O (e.g., `read`, `write`, `fsync`) or network I/O (e.g., `recvfrom`, `sendto`, `connect`). If the same I/O call is repeatedly slow, that’s your culprit: for example, a long-running `read()` on a slow disk, or one on a network socket whose peer isn’t responding.
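Scanning a long trace by eye is tedious, so the slow calls can be filtered with a short script. This sketch assumes the trace was captured with strace’s `-T` flag, which appends the time spent in each call in angle brackets; the sample lines are made up to show the shape of the output and are not real measurements, and `slowest_syscalls` is an illustrative name.

```python
import re
from collections import defaultdict

LINE_RE = re.compile(
    r"^(?:\[pid\s+\d+\]\s+|\d+\s+)?"   # optional thread prefix added by -f
    r"(?:[\d:.]+\s+)?"                   # optional timestamp added by -tt
    r"(?P<name>\w+)\(.*"                 # syscall name and arguments
    r"<(?P<secs>\d+\.\d+)>\s*$"          # elapsed time added by -T
)


def slowest_syscalls(lines, threshold=0.1):
    """Return {syscall: [durations]} for calls that took at least `threshold` seconds."""
    slow = defaultdict(list)
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # signals, exit notices, unfinished/resumed fragments
        secs = float(match.group("secs"))
        if secs >= threshold:
            slow[match.group("name")].append(secs)
    return dict(slow)


# Made-up sample lines in the shape strace prints; not real measurements.
SAMPLE = """\
[pid   812] 10:15:30.100000 read(7, "payload"..., 4096) = 4096 <0.250113>
[pid   812] 10:15:30.350200 write(9, "payload"..., 512) = 512 <0.000041>
[pid   813] 10:15:30.360000 fsync(9) = 0 <0.412007>
[pid   813] 10:15:30.772100 --- SIGCHLD {si_signo=SIGCHLD} ---
"""

if __name__ == "__main__":
    for name, times in slowest_syscalls(SAMPLE.splitlines()).items():
        print(f"{name}: {len(times)} slow call(s), worst {max(times):.3f}s")
```

To use it on a real trace, write strace’s output to a file with `-o` and pass the file’s lines to `slowest_syscalls`. Calls split across "unfinished" and "resumed" lines are skipped by this simple pattern.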

`iotop` provides a real-time view of disk I/O usage by processes, similar to how `top` shows CPU usage. Install it if necessary:

sudo apt update && sudo apt install iotop

Run it with root privileges:

sudo iotop

`iotop` will show you which processes are performing the most disk reads and writes. Look for your application’s process. If it’s consistently at the top with high I/O rates, and this correlates with performance degradation, it confirms a disk I/O bottleneck. You can also see if other processes are unexpectedly consuming significant disk bandwidth, potentially starving your application.
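For logging disk activity over time rather than watching it interactively, similar per-process figures can be read from `/proc/<pid>/io`. This is a rough sketch assuming a Linux kernel that exposes that file; the helper names are illustrative, and reading another user’s process normally requires root.

```python
import time


def parse_proc_io(text: str) -> dict[str, int]:
    """Parse the `key: value` lines of /proc/<pid>/io into a dict of ints."""
    counters = {}
    for line in text.splitlines():
        key, _, value = line.partition(":")
        if value.strip().isdigit():
            counters[key.strip()] = int(value)
    return counters


def disk_rates(pid: int, interval: float = 1.0) -> tuple[float, float]:
    """Return (read, write) bytes per second that `pid` pushed to storage."""
    path = f"/proc/{pid}/io"

    def snapshot() -> dict[str, int]:
        with open(path) as fh:
            return parse_proc_io(fh.read())

    before = snapshot()
    time.sleep(interval)
    after = snapshot()
    # read_bytes/write_bytes count traffic that reached the storage layer,
    # unlike rchar/wchar, which also include reads served from page cache.
    return (
        (after["read_bytes"] - before["read_bytes"]) / interval,
        (after["write_bytes"] - before["write_bytes"]) / interval,
    )
```

Calling `disk_rates(pid)` in a loop and logging the result alongside request latency makes it easier to correlate slow responses with bursts of disk traffic.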

System-Level Tuning and DigitalOcean Specifics

Once the bottleneck is identified (e.g., slow disk I/O, network saturation, or excessive thread creation), system-level tuning might be necessary. On DigitalOcean, consider the following:

Disk I/O: If `iotop` shows your application waiting heavily on disk and your droplet uses standard SSD storage, consider moving to a droplet plan with NVMe SSDs for significantly better I/O performance. DigitalOcean’s Block Storage volumes are network-attached, so they add latency; they may still offer more predictable performance for some I/O-intensive workloads, so benchmark both options before committing.

Network: Ensure your application isn’t saturating the network interface. Use tools like `iftop` or `nload` to monitor bandwidth usage. If your application is making a very large number of small network requests, consider batching them or using connection pooling. For outbound traffic, check if your droplet plan has bandwidth limits.

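File Descriptors: Running out of file descriptors is a common cause of failures that look like thread exhaustion, especially in older or poorly managed applications. Every socket and open file consumes one, and once the limit is reached new connections fail with "Too many open files". Check the current limit with `ulimit -n`. If it’s low (e.g., 1024), you might need to increase it. Edit `/etc/security/limits.conf` and add lines like: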

* soft nofile 65536
* hard nofile 65536

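You’ll need to log out and log back in for these changes to take effect; verify the new limit in the new session with `ulimit -n`. Note that `limits.conf` is applied by PAM at login, so services started by systemd don’t pick it up. For those, set `LimitNOFILE=` in the service’s unit file instead.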
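An application can also report how close it is to its own limit, which is more useful under load than checking `ulimit -n` in a shell. The sketch below assumes Linux (it counts entries in `/proc/self/fd`); `fd_usage`, `near_limit`, and the 80% threshold are illustrative choices.

```python
import os
import resource


def fd_usage() -> dict[str, int]:
    """Report this process's open descriptors and its RLIMIT_NOFILE limits."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    # Each entry in /proc/self/fd is one open descriptor (Linux only).
    open_fds = len(os.listdir("/proc/self/fd"))
    return {"open": open_fds, "soft": soft, "hard": hard}


def near_limit(usage: dict[str, int], ratio: float = 0.8) -> bool:
    """True when open descriptors have reached `ratio` of the soft limit."""
    soft = usage["soft"]
    # A non-positive soft limit means "unlimited", so never warn in that case.
    return soft > 0 and usage["open"] >= ratio * soft


if __name__ == "__main__":
    usage = fd_usage()
    print(usage, "near limit" if near_limit(usage) else "ok")
```

Logging this periodically from inside the service shows whether descriptor usage climbs with load, and it reflects the limit the process actually received rather than the one configured for your login shell.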

Kernel Tuning (sysctl): For network-bound applications, tuning kernel parameters can sometimes help, for example by increasing TCP buffer sizes or adjusting connection tracking limits. Be cautious with these changes, as incorrect settings can degrade performance. A common parameter to check is `net.core.somaxconn`, which caps the backlog of pending connections on a listening socket. Increase it if you’re seeing dropped or refused connections under load:

sudo sysctl -w net.core.somaxconn=4096

To make this persistent across reboots, add `net.core.somaxconn = 4096` to `/etc/sysctl.conf` and reload it with `sudo sysctl -p`.

Conclusion: A Systematic Approach

Diagnosing thread exhaustion and `asyncio` event loop delays under heavy I/O loads requires a systematic approach. Start with high-level system monitoring tools like `top` and `htop` to identify the symptoms. For `asyncio` applications, `aiomonitor` provides deep insights into event loop behavior. Then, drill down into specific I/O operations using `strace` and `iotop`. Finally, consider system-level configurations and DigitalOcean-specific droplet types or storage options to address the root cause. Remember that performance tuning is an iterative process; make one change at a time and measure its impact.

A little about the Author

Having 12+ Years of Experience in Software Development, Vinay is a principal software architect, senior systems engineer, and elite technical consultant. He specializes in bespoke PHP/WordPress development, high-performance Magento 2 & Shopify architectures, custom plugin/theme development from scratch, and legacy code modernization (including VB6, VB.NET, PyQt, and Crystal Reports). Known for solving complex database bottlenecks, speed optimization (Core Web Vitals), and advanced security code auditing, Vinay engineers production-ready systems designed to scale under heavy concurrent load conditions.



Copyright © 2026 · Vinay Vengala