If your Next.js application struggles under load, crashes with 200+ concurrent users, or shows uneven CPU usage across PM2 instances, you’re likely experiencing event loop blocking. This guide explains what event loops are, why they matter, and how to implement monitoring to diagnose and fix performance bottlenecks.

Table of Contents

  1. Understanding Event Loops
  2. Why Event Loops Matter for Next.js
  3. Implementing Event Loop Monitoring
  4. Interpreting Monitoring Data
  5. Common Blocking Patterns and Fixes
  6. Advanced Monitoring Setup

Understanding Event Loops

What is an Event Loop?

Node.js (and by extension, Next.js) runs on a single thread. Yet it can handle thousands of concurrent connections efficiently. The event loop makes this possible. Think of a restaurant kitchen with one chef. Instead of preparing one complete order before starting the next, the chef:
  1. Puts burger #1 on the grill
  2. While it cooks, starts burger #2
  3. While both cook, prepares fries
  4. Checks which items are done
  5. Serves completed orders
  6. Repeats
This is how Node.js works. The event loop continuously cycles through tasks, checking what’s ready to execute next.
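The same interleaving is easy to see in a few lines of Node.js — synchronous code always runs to completion before the event loop picks up queued callbacks:

```javascript
const order = [];

// Queued for a later turn of the event loop, even with a 0ms delay
setTimeout(() => {
  order.push('timer callback');
  console.log(order.join(' → ')); // the sync entries always come first
}, 0);

// Runs immediately, in the current turn
order.push('sync work 1');
order.push('sync work 2');
```

The timer callback can only run once the synchronous code finishes the current turn — exactly the chef checking the grill between tasks.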

The Event Loop Cycle

Key Point: The event loop can only advance once the current callback returns. If a callback runs too long, every other queued task — timers, I/O responses, incoming requests — waits behind it.
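For reference, one turn of the loop steps through a fixed sequence of phases (simplified from the Node.js documentation; this is pseudocode, not real API names):

```text
while the process has pending work:
  run expired timer callbacks       (setTimeout / setInterval)
  run deferred I/O callbacks
  poll for new I/O events           (most idle time is spent here)
  run setImmediate callbacks        ("check" phase)
  run close callbacks               (e.g. socket.on('close'))
```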

Non-Blocking vs Blocking Code

Non-Blocking (Good):
// Request 1
app.get('/api/users', async (req, res) => {
  const users = await db.query('SELECT * FROM users');
  res.json(users);
});

// Timeline:
// 0ms:  Request 1 starts → db.query (async) → Event loop free
// 1ms:  Request 2 arrives → handled immediately ✓
// 2ms:  Request 3 arrives → handled immediately ✓
// 10ms: Request 1 db done → response sent ✓
Blocking (Bad):
// Request 1
app.get('/api/users', (req, res) => {
  let result = [];
  for (let i = 0; i < 10000000; i++) {
    result.push(heavyCalculation(i));
  }
  res.json(result);
});

// Timeline:
// 0ms:   Request 1 starts → heavy loop → Event loop BLOCKED
// 100ms: Request 2 arrives → WAITING ⏳
// 200ms: Request 3 arrives → WAITING ⏳
// 500ms: Request 1 finally done → Request 2 can now start
The blocking code prevents the event loop from processing other requests, creating a bottleneck even with available CPU capacity. The symptom: a degraded experience for users while overall resource utilization stays well below 100%.

Why Event Loops Matter for Next.js

The PM2 Cluster Scenario

When running Next.js with PM2 in cluster mode, you typically have multiple worker processes. If one worker’s event loop is blocked by expensive synchronous operations, that worker can’t handle new requests. PM2 keeps routing requests to it via round-robin distribution, so they queue up, causing timeouts and poor performance.
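For context, a typical PM2 cluster configuration looks like this (a sketch — the app name, script path, and instance count are illustrative and should be adapted to your setup):

```javascript
// ecosystem.config.js — illustrative PM2 cluster setup for Next.js
module.exports = {
  apps: [{
    name: 'next-app',
    script: 'node_modules/next/dist/bin/next',
    args: 'start',
    exec_mode: 'cluster', // round-robin requests across workers
    instances: 'max',     // one worker per CPU core
    env: { NODE_ENV: 'production' },
  }],
};
```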

Symptoms of Event Loop Blocking

  • One PM2 instance at 100% CPU, others underutilized
  • Requests timing out despite available server resources
  • Uneven request distribution across workers
  • Application crashes under moderate load (200+ concurrent users)
  • Response times increase dramatically under load

Implementing Event Loop Monitoring

Step 1: Create the Monitor

Create a monitoring module that uses Node.js’s built-in perf_hooks API to track event loop delay with high precision. Save it as lib/monitoring/advancedEventLoopMonitor.js:
const { PerformanceObserver, monitorEventLoopDelay } = require('perf_hooks');

class AdvancedEventLoopMonitor {
  constructor(options = {}) {
    this.resolution = options.resolution || 10;
    this.warningThreshold = options.warningThreshold || 50;
    this.criticalThreshold = options.criticalThreshold || 100;
    this.logInterval = options.logInterval || 30000;
    
    this.histogram = monitorEventLoopDelay({ resolution: this.resolution });
    this.histogram.enable();
    
    this.startTime = Date.now();
    this.requestCount = 0;
    this.slowRequests = [];
    
    this.startLogging();
    this.setupProcessMetrics();
    
    console.log('🔬 Advanced Event Loop Monitor initialized');
  }

  startLogging() {
    this.logIntervalId = setInterval(() => {
      this.logDetailedStats();
    }, this.logInterval);
  }

  setupProcessMetrics() {
    try {
      // GC entries are observable without any special flags
      // (--expose-gc is only needed to call global.gc() manually)
      const gcStats = { count: 0, totalDuration: 0 };

      const obs = new PerformanceObserver((list) => {
        list.getEntries().forEach((entry) => {
          if (entry.entryType === 'gc') {
            gcStats.count++;
            gcStats.totalDuration += entry.duration;

            if (entry.duration > 100) {
              console.warn(`⚠️  Long GC pause: ${entry.duration.toFixed(2)}ms`);
            }
          }
        });
      });

      obs.observe({ entryTypes: ['gc'] });
      this.gcStats = gcStats;
    } catch (e) {
      console.log('GC monitoring not available in this Node.js version');
    }
  }

  logDetailedStats() {
    const stats = this.getDetailedStats();
    
    const status = stats.p99 > this.criticalThreshold ? '🔴 CRITICAL' :
                   stats.p95 > this.warningThreshold ? '🟡 WARNING' :
                   '🟢 HEALTHY';
    
    console.log('\n' + '='.repeat(60));
    console.log(`${status} Event Loop Health Report`);
    console.log('='.repeat(60));
    console.log(`Instance: ${process.env.INSTANCE_ID || process.pid}`);
    console.log(`Uptime: ${this.formatUptime(Date.now() - this.startTime)}`);
    console.log(`\nEvent Loop Delay (ms):`);
    console.log(`  Min:    ${stats.min.toFixed(2)}ms`);
    console.log(`  Mean:   ${stats.mean.toFixed(2)}ms`);
    console.log(`  Max:    ${stats.max.toFixed(2)}ms`);
    console.log(`  P50:    ${stats.p50.toFixed(2)}ms`);
    console.log(`  P95:    ${stats.p95.toFixed(2)}ms`);
    console.log(`  P99:    ${stats.p99.toFixed(2)}ms`);
    console.log(`  StdDev: ${stats.stddev.toFixed(2)}ms`);
    
    console.log(`\nMemory:`);
    const mem = process.memoryUsage();
    console.log(`  RSS:        ${(mem.rss / 1024 / 1024).toFixed(2)} MB`);
    console.log(`  Heap Used:  ${(mem.heapUsed / 1024 / 1024).toFixed(2)} MB`);
    console.log(`  Heap Total: ${(mem.heapTotal / 1024 / 1024).toFixed(2)} MB`);
    console.log(`  External:   ${(mem.external / 1024 / 1024).toFixed(2)} MB`);
    
    if (this.gcStats) {
      console.log(`\nGarbage Collection:`);
      console.log(`  Count:        ${this.gcStats.count}`);
      console.log(`  Total Time:   ${this.gcStats.totalDuration.toFixed(2)}ms`);
      console.log(`  Avg Per GC:   ${(this.gcStats.totalDuration / this.gcStats.count || 0).toFixed(2)}ms`);
    }
    
    console.log(`\nRequests Processed: ${this.requestCount}`);
    
    if (this.slowRequests.length > 0) {
      console.log(`\n⚠️  Slow Requests (showing up to 5 of ${this.slowRequests.length}):`);
      this.slowRequests.slice(-5).forEach(req => {
        console.log(`  ${req.method} ${req.url} - ${req.duration.toFixed(2)}ms - ${req.timestamp}`);
      });
    }
    
    console.log('='.repeat(60) + '\n');
    
    this.histogram.reset();
    
    if (this.slowRequests.length > 100) {
      this.slowRequests = this.slowRequests.slice(-50);
    }
  }

  getDetailedStats() {
    return {
      min: this.histogram.min / 1e6,
      max: this.histogram.max / 1e6,
      mean: this.histogram.mean / 1e6,
      stddev: this.histogram.stddev / 1e6,
      p50: this.histogram.percentile(50) / 1e6,
      p95: this.histogram.percentile(95) / 1e6,
      p99: this.histogram.percentile(99) / 1e6,
      p999: this.histogram.percentile(99.9) / 1e6
    };
  }

  formatUptime(ms) {
    const seconds = Math.floor(ms / 1000);
    const minutes = Math.floor(seconds / 60);
    const hours = Math.floor(minutes / 60);
    const days = Math.floor(hours / 24);
    
    if (days > 0) return `${days}d ${hours % 24}h`;
    if (hours > 0) return `${hours}h ${minutes % 60}m`;
    if (minutes > 0) return `${minutes}m ${seconds % 60}s`;
    return `${seconds}s`;
  }

  trackRequest(method, url, duration) {
    this.requestCount++;
    
    if (duration > 1000) {
      this.slowRequests.push({
        method,
        url,
        duration,
        timestamp: new Date().toISOString()
      });
    }
  }

  getStats() {
    return this.getDetailedStats();
  }

  stop() {
    if (this.logIntervalId) {
      clearInterval(this.logIntervalId);
    }
    this.histogram.disable();
    console.log('🔬 Event Loop Monitor stopped');
  }
}

module.exports = AdvancedEventLoopMonitor;

Step 2: Initialize on Server Start

Next.js 13+ uses the instrumentation hook for server initialization. Create instrumentation.js in your project root:
export async function register() {
  if (process.env.NEXT_RUNTIME === 'nodejs') {
    const AdvancedEventLoopMonitor = require('./lib/monitoring/advancedEventLoopMonitor');
    
    global.eventLoopMonitor = new AdvancedEventLoopMonitor({
      resolution: 10,
      warningThreshold: 50,
      criticalThreshold: 100,
      logInterval: 30000
    });
    
    console.log('✅ Event loop monitoring initialized');
  }
}
Enable instrumentation in next.config.js (required on Next.js 13/14; in Next.js 15+ the instrumentation file is stable and this experimental flag is no longer needed):
/** @type {import('next').NextConfig} */
const nextConfig = {
  experimental: {
    instrumentationHook: true,
  },
};

module.exports = nextConfig;

Step 3: Create Health Check Endpoint

Create app/api/monitoring/health/route.js:
export async function GET(request) {
  try {
    const monitor = global.eventLoopMonitor;
    
    if (!monitor) {
      return Response.json(
        { error: 'Monitor not initialized' },
        { status: 503 }
      );
    }

    const stats = monitor.getStats();
    const memory = process.memoryUsage();
    
    const health = {
      status: stats.p99 > 100 ? 'critical' : 
              stats.p95 > 50 ? 'warning' : 'healthy',
      instance: process.env.INSTANCE_ID || process.pid,
      uptime: process.uptime(),
      eventLoop: {
        min: parseFloat(stats.min.toFixed(2)),
        mean: parseFloat(stats.mean.toFixed(2)),
        max: parseFloat(stats.max.toFixed(2)),
        p50: parseFloat(stats.p50.toFixed(2)),
        p95: parseFloat(stats.p95.toFixed(2)),
        p99: parseFloat(stats.p99.toFixed(2)),
      },
      memory: {
        rss: Math.round(memory.rss / 1024 / 1024),
        heapUsed: Math.round(memory.heapUsed / 1024 / 1024),
        heapTotal: Math.round(memory.heapTotal / 1024 / 1024),
        external: Math.round(memory.external / 1024 / 1024),
      },
      timestamp: new Date().toISOString()
    };

    return Response.json(health);
  } catch (error) {
    return Response.json(
      { error: error.message },
      { status: 500 }
    );
  }
}

Step 4: Create Request Tracking Middleware

Create middleware.js in your project root:
import { NextResponse } from 'next/server';

export function middleware(request) {
  const start = Date.now();
  
  const response = NextResponse.next();
  
  response.headers.set('X-Request-Start', start.toString());
  
  return response;
}

export const config = {
  matcher: '/api/:path*',
};
Create app/api/[...route]/route.js wrapper to track completion:
export async function GET(request) {
  const start = Date.now();
  
  try {
    // Your API logic here
    const response = await yourApiHandler(request);
    
    const duration = Date.now() - start;
    
    if (global.eventLoopMonitor) {
      global.eventLoopMonitor.trackRequest(
        'GET',
        request.url,
        duration
      );
    }
    
    return response;
  } catch (error) {
    const duration = Date.now() - start;
    
    if (global.eventLoopMonitor) {
      global.eventLoopMonitor.trackRequest(
        'GET',
        request.url,
        duration
      );
    }
    
    throw error;
  }
}
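Rather than duplicating the timing logic in every route, you could factor it into a small helper (withTracking is not from the article — it is a sketch of the same pattern):

```javascript
// Wraps a route handler and reports its duration to the global monitor
function withTracking(method, handler) {
  return async (request) => {
    const start = Date.now();
    try {
      return await handler(request);
    } finally {
      // Runs whether the handler resolved or threw
      if (global.eventLoopMonitor) {
        global.eventLoopMonitor.trackRequest(method, request.url, Date.now() - start);
      }
    }
  };
}

// Usage in a route file:
// export const GET = withTracking('GET', yourApiHandler);
```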

Interpreting Monitoring Data

Understanding the Metrics

The monitor tracks several key metrics.

Event Loop Delay Percentiles:
  • P50 (Median): Half of all event loop iterations complete faster than this value
  • P95: 95% of iterations complete faster than this value
  • P99: 99% of iterations complete faster than this value
Target Values:
  • P50: < 10ms (excellent), 10-25ms (good), > 25ms (investigate)
  • P95: < 50ms (excellent), 50-100ms (acceptable), > 100ms (warning)
  • P99: < 100ms (excellent), 100-250ms (warning), > 250ms (critical)
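Those target values can be folded into a small helper for dashboards or alerts (thresholds in milliseconds, matching the table above):

```javascript
// Classify an event loop delay snapshot per the target values above
function classifyEventLoop({ p50, p95, p99 }) {
  if (p99 > 250) return 'critical';
  if (p99 > 100 || p95 > 100 || p50 > 25) return 'warning';
  return 'healthy';
}
```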

Reading the Health Report

🟢 HEALTHY Event Loop Health Report
============================================================
Instance: worker-1 (PID: 12345)
Uptime: 2h 34m

Event Loop Delay (ms):
  Min:    0.05ms     ← Best case scenario
  Mean:   8.23ms     ← Average delay (good)
  Max:    156.42ms   ← Worst case (occasional spikes OK)
  P50:    4.12ms     ← 50% of loops finish this fast
  P95:    32.45ms    ← 95% of loops finish this fast
  P99:    78.91ms    ← 99% of loops finish this fast ✓
  StdDev: 12.34ms    ← Consistency (lower is better)
Health Status Interpretation:

🟢 HEALTHY: P99 < 100ms
  • Application responding well
  • Event loop processing efficiently
  • No immediate action needed
🟡 WARNING: P95 > 50ms or P99 100-250ms
  • Event loop experiencing delays
  • Investigate recent code changes
  • Review slow requests log
  • Consider optimization
🔴 CRITICAL: P99 > 250ms
  • Event loop heavily blocked
  • User experience degraded
  • Immediate action required
  • Check for CPU-intensive operations

Real-World Example Analysis

Good Performance:
Event Loop Delay (ms):
  P50:    3.21ms
  P95:    18.45ms
  P99:    42.33ms
This shows consistent, fast event loop processing. The application handles load well.

Warning Signs:
Event Loop Delay (ms):
  P50:    12.45ms
  P95:    156.78ms
  P99:    342.11ms
High variance between P50 and P99 indicates sporadic blocking operations. Investigate slow requests.

Critical Issues:
Event Loop Delay (ms):
  P50:    45.23ms
  P95:    523.45ms
  P99:    1234.56ms
Consistently high delays across all percentiles indicate systemic blocking issues. Check for synchronous database operations or heavy computation.

Common Blocking Patterns and Fixes

1. Large JSON Parsing

❌ Blocking:
export default function handler(req, res) {
  const data = JSON.parse(largeJsonString); // Blocks event loop
  res.json(data);
}
✅ Non-Blocking:
import { Worker } from 'worker_threads';

export default async function handler(req, res) {
  // Note: spawning a worker per request is expensive; reuse a worker pool in production
  const worker = new Worker('./workers/json-parser.js');

  const data = await new Promise((resolve, reject) => {
    worker.once('message', resolve);
    worker.once('error', reject);
    worker.postMessage(largeJsonString);
  });

  await worker.terminate(); // Don't leak worker threads

  res.json(data);
}

2. Synchronous File Operations

❌ Blocking:
import fs from 'fs';

export default function handler(req, res) {
  const data = fs.readFileSync('./large-file.json', 'utf8');
  res.send(data);
}
✅ Non-Blocking:
import { readFile } from 'fs/promises';

export default async function handler(req, res) {
  const data = await readFile('./large-file.json', 'utf8');
  res.send(data);
}

3. Complex Array Operations

❌ Blocking:
export default function handler(req, res) {
  const results = largeArray.map(item => {
    return expensiveOperation(item);
  });
  res.json(results);
}
✅ Non-Blocking:
export default async function handler(req, res) {
  const results = await Promise.all(
    largeArray.map(async item => {
      return await expensiveOperationAsync(item);
    })
  );
  res.json(results);
}
Note: this only helps if expensiveOperationAsync genuinely yields to the event loop (I/O, a worker thread, etc.) — wrapping synchronous CPU work in async functions does not unblock anything. For CPU-bound work, batch and yield explicitly:
export default async function handler(req, res) {
  const BATCH_SIZE = 100;
  const results = [];
  
  for (let i = 0; i < largeArray.length; i += BATCH_SIZE) {
    const batch = largeArray.slice(i, i + BATCH_SIZE);
    const batchResults = await Promise.all(
      batch.map(item => expensiveOperationAsync(item))
    );
    results.push(...batchResults);
    
    // Allow event loop to process other requests
    await new Promise(resolve => setImmediate(resolve));
  }
  
  res.json(results);
}

4. Database Queries Without Connection Pooling

❌ Inefficient (connection setup is async, so it doesn’t block the event loop outright, but a new connection per request adds latency and can exhaust the database under load):
// Creating a new connection each time
export default async function handler(req, res) {
  const client = await createConnection();
  const result = await client.query('SELECT * FROM users');
  await client.close();
  res.json(result);
}
✅ Efficient (reuse a connection pool):
// Use connection pool
import { pool } from '@/lib/db';

export default async function handler(req, res) {
  const result = await pool.query('SELECT * FROM users');
  res.json(result);
}
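The idea behind a pool can be sketched generically — a fixed set of pre-created resources handed out and returned, instead of created per request (in production, use your driver’s real pool, e.g. pg.Pool or mysql2’s createPool):

```javascript
// A toy resource pool illustrating the pattern (not production code)
class SimplePool {
  constructor(factory, size = 5) {
    this.idle = Array.from({ length: size }, factory); // created once, reused
    this.waiters = [];
  }

  async acquire() {
    if (this.idle.length > 0) return this.idle.pop();
    // Nothing free: wait (without blocking the event loop) for a release
    return new Promise((resolve) => this.waiters.push(resolve));
  }

  release(resource) {
    const waiter = this.waiters.shift();
    if (waiter) waiter(resource);  // hand straight to a waiting caller
    else this.idle.push(resource); // or back to the idle set
  }
}
```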

Advanced Monitoring Setup

Real-Time Dashboard Script

Create scripts/monitor-dashboard.sh. Adapt the PM2 application name (next-app below) to your setup; you’ll also need the jq utility installed.
#!/bin/bash

while true; do
  clear
  echo "=== Next.js Event Loop Monitoring ==="
  echo "Updated: $(date)"
  echo ""
  
  # PM2 Status
  echo "PM2 Instances:"
  pm2 list | grep next-app
  echo ""
  
  # Health check: repeated requests are spread across workers by PM2's round-robin
  echo "Health Status:"
  for i in {1..10}; do
    curl -s http://localhost:3000/api/monitoring/health 2>/dev/null | \
      jq -r '"\(.instance): P99=\(.eventLoop.p99)ms \(.status)"' || \
      echo "Instance not responding"
  done
  
  echo ""
  echo "Press Ctrl+C to exit"
  sleep 5
done
Make it executable:
chmod +x scripts/monitor-dashboard.sh
./scripts/monitor-dashboard.sh

Load Test Monitoring

Create scripts/load-test-monitor.sh:
#!/bin/bash

OUTPUT_DIR="./load-test-results/$(date +%Y%m%d_%H%M%S)"
mkdir -p "$OUTPUT_DIR"

echo "timestamp,instance,p95,p99,heap_mb" > "$OUTPUT_DIR/metrics.csv"

echo "Starting load test monitoring..."
echo "Press Ctrl+C when test is complete"

while true; do
  timestamp=$(date +%s)
  
  # Collect metrics from health endpoint
  curl -s http://localhost:3000/api/monitoring/health 2>/dev/null | \
    jq -r --arg ts "$timestamp" \
    '[$ts, .instance, .eventLoop.p95, .eventLoop.p99, .memory.heapUsed] | @csv' \
    >> "$OUTPUT_DIR/metrics.csv"
  
  sleep 2
done
Run alongside your load test:
# Terminal 1: Start monitoring
./scripts/load-test-monitor.sh

# Terminal 2: Run your load test (locust example but use whatever you prefer)
locust -f loadtest.py --host=http://localhost:3000

Analyzing Results

After your load test, analyze the collected data:
# View summary statistics
cat load-test-results/*/metrics.csv | \
  awk -F',' 'NR>1 {sum+=$4; count++; if($4>max) max=$4} 
             END {print "Avg P99:", sum/count, "ms\nMax P99:", max, "ms"}'

# Find instances with high P99
cat load-test-results/*/metrics.csv | \
  awk -F',' 'NR>1 && $4>100 {print $2, $4}' | \
  sort -k2 -rn | \
  head -10

Best Practices

1. Set Appropriate Thresholds

Adjust thresholds based on your application:
const monitor = new AdvancedEventLoopMonitor({
  warningThreshold: 30,   // Stricter for high-performance apps
  criticalThreshold: 75,
  logInterval: 60000      // Less frequent for production
});

2. Monitor in Staging First

Test your monitoring setup in a staging environment before production deployment to:
  • Verify thresholds are appropriate
  • Ensure logging doesn’t impact performance
  • Validate alerting mechanisms

3. Combine with APM Tools

Event loop monitoring complements Application Performance Monitoring tools like Blackfire.io:
  • Use event loop monitoring to identify blocking operations
  • Use APM for distributed tracing and end-to-end monitoring
  • Correlate event loop delays with external service latencies

4. Regular Performance Audits

Schedule monthly performance reviews:
  • Analyze P99 trends over time
  • Identify endpoints with degrading performance
  • Review and optimize slow requests
  • Update monitoring thresholds as needed

Conclusion

Event loop monitoring is essential for building performant Next.js applications at scale. By implementing the monitoring system described in this guide, you can:
  • Identify bottlenecks before they impact users
  • Optimize critical paths with data-driven insights
  • Scale confidently knowing your application’s limits
  • Diagnose issues quickly with detailed metrics
Remember: The event loop is the heartbeat of your Node.js application. Keep it healthy, and your application will scale smoothly.

Additional Resources

Ready to deploy your optimized Next.js application? Create a free Upsun account to get instant preview environments, Git-driven infrastructure, and built-in observability tools for production-ready deployments.
Last modified on April 14, 2026