NGINX Sysguard: Automatic Protection Against Server Overload

Danila Vershinin

📅 Updated: June 7, 2026 (Originally published: February 7, 2026)

When your server hits its limits, bad things happen. Response times spike, connections queue up, and eventually the entire system grinds to a halt. The typical scenario: a traffic spike or a misbehaving backend pushes CPU load through the roof, memory fills up, and NGINX keeps accepting requests it cannot serve.

The NGINX sysguard module solves this with automatic overload protection. It monitors system load, memory usage, and request response times in real time. When any metric crosses your defined threshold, sysguard rejects new requests gracefully — returning a 503 status or redirecting to a custom page — instead of letting the server collapse under pressure.

This is not rate limiting, which restricts requests per client. The NGINX sysguard module protects the server itself by watching system-level metrics. Think of it as a circuit breaker for your web server.

How NGINX Sysguard Works

The sysguard module operates in NGINX’s preaccess phase. Before each request reaches your content handler, it checks the current system state against your configured thresholds:

CPU load average — read via getloadavg() or sysinfo() system calls
Memory usage — parsed from /proc/meminfo (swap ratio and free memory)
Request response time — calculated from recent request processing durations

If any threshold is exceeded (in the default `or` mode), the request is immediately redirected to the fallback action, typically a 503 response. In `and` mode, all configured thresholds must be exceeded before protection triggers.

To minimize overhead, sysguard caches system metrics for a configurable interval (default: 1 second). This means it does not call getloadavg() or read /proc/meminfo on every single request.
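The caching behaviour can be pictured with a short Python sketch. This is not the module's C implementation; the `CachedMetric` class, its `reads` counter, and the injected clock are hypothetical names used only to illustrate the idea of refreshing a metric at most once per interval.

```python
import time
from typing import Callable


class CachedMetric:
    """Re-read an expensive metric at most once per `interval` seconds."""

    def __init__(self, read: Callable[[], float], interval: float = 1.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._read = read          # expensive call, e.g. a load-average lookup
        self._interval = interval  # plays the role of sysguard_interval
        self._clock = clock        # injectable so the sketch can be simulated
        self._value = 0.0
        self._expires = None       # None means "never read yet"
        self.reads = 0             # how many times the expensive call ran

    def get(self) -> float:
        now = self._clock()
        if self._expires is None or now >= self._expires:
            self._value = self._read()
            self._expires = now + self._interval
            self.reads += 1
        return self._value


# Simulate 10,000 requests arriving within the same second.
fake_now = [0.0]
metric = CachedMetric(read=lambda: 0.42, interval=1.0, clock=lambda: fake_now[0])
for i in range(10_000):
    fake_now[0] = i / 10_000   # every timestamp falls inside [0, 1)
    metric.get()
print(metric.reads)  # 1
```

Every request inside the interval is served from the cached value, so the cost of reading the metric does not grow with traffic.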

Installation

RHEL, CentOS, Rocky Linux, AlmaLinux, Fedora, Amazon Linux

Install the NGINX sysguard module from the GetPageSpeed extras repository:

sudo dnf install https://extras.getpagespeed.com/release-latest.rpm
sudo dnf install nginx-module-sysguard

Then load the module in /etc/nginx/nginx.conf:

load_module modules/ngx_http_sysguard_module.so;

Debian, Ubuntu

Install from the GetPageSpeed APT repository:

sudo apt-get update
sudo apt-get install nginx-module-sysguard

On Debian and Ubuntu, the module is enabled automatically upon installation — no load_module directive is needed.

Configuration Reference

`sysguard`

Enables or disables the module.

Syntax: sysguard on | off;
Default: off
Context: http, server, location

server {
    sysguard on;
}

`sysguard_load`

Sets the CPU load average threshold. When the 1-minute load average exceeds this value, sysguard triggers the specified action.

Syntax: sysguard_load load=<number> [action=<uri>];
Default: not set
Context: http, server, location

The load parameter accepts a decimal number or the special ncpu multiplier:

# Fixed threshold
sysguard_load load=10.5 action=/overloaded;

# CPU-aware threshold (scales with core count)
sysguard_load load=ncpu*1.5 action=/overloaded;

The ncpu multiplier is essential for portable configurations. On a 4-core server, ncpu*1.5 translates to a threshold of 6.0. On an 8-core server, it becomes 12.0. This means you can deploy the same configuration across servers with different CPU counts.
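As a worked example of that arithmetic, here is a small Python sketch that resolves a `load=` value to a number. The function name `load_threshold` is hypothetical, and the sketch only handles the two forms shown in this article (a plain decimal and `ncpu*<factor>`); the module's own parser may differ.

```python
import os
from typing import Optional


def load_threshold(spec: str, ncpu: Optional[int] = None) -> float:
    """Resolve a `load=` value such as "10.5" or "ncpu*1.5" to a number."""
    if ncpu is None:
        ncpu = os.cpu_count() or 1  # cpu_count() may return None
    spec = spec.strip()
    if spec.startswith("ncpu*"):
        return ncpu * float(spec[len("ncpu*"):])
    return float(spec)


print(load_threshold("ncpu*1.5", ncpu=4))  # 6.0
print(load_threshold("ncpu*1.5", ncpu=8))  # 12.0
print(load_threshold("10.5"))              # 10.5
```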

If no action is specified, sysguard returns HTTP 503 directly.

`sysguard_mem`

Monitors memory usage. You can set thresholds for swap usage ratio, free memory, or both.

Syntax: sysguard_mem swapratio=<percent>% [action=<uri>];
Syntax: sysguard_mem free=<size> [action=<uri>];
Default: not set
Context: http, server, location

# Trigger when swap usage exceeds 20%
sysguard_mem swapratio=20% action=/low-memory;

# Trigger when free memory drops below 100MB
sysguard_mem free=100M action=/low-memory;

You can use both directives together. Free memory is calculated as MemFree + Buffers + Cached from /proc/meminfo. This represents memory that the kernel can reclaim when needed — a more accurate measure than raw MemFree alone.
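To make the calculation concrete, the following Python sketch parses a made-up /proc/meminfo excerpt and derives both figures. The sample values and helper names are hypothetical, and the swap formula (used swap divided by total swap) is an assumption about how the ratio is computed, not something taken from the module source.

```python
# Hypothetical excerpt; real /proc/meminfo files contain many more fields.
SAMPLE_MEMINFO = """\
MemTotal:        8000000 kB
MemFree:          500000 kB
Buffers:          100000 kB
Cached:          1400000 kB
SwapTotal:       2000000 kB
SwapFree:        1500000 kB
"""


def parse_meminfo(text: str) -> dict:
    """Return /proc/meminfo fields as a dict of integer values (kB)."""
    fields = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            fields[name.strip()] = int(parts[0])
    return fields


def reclaimable_free_kb(info: dict) -> int:
    """MemFree + Buffers + Cached, as described above."""
    return info["MemFree"] + info["Buffers"] + info["Cached"]


def swap_used_percent(info: dict) -> float:
    """Assumed formula: share of total swap currently in use."""
    total = info["SwapTotal"]
    if total == 0:
        return 0.0  # no swap configured
    return 100.0 * (total - info["SwapFree"]) / total


info = parse_meminfo(SAMPLE_MEMINFO)
print(reclaimable_free_kb(info))  # 2000000
print(swap_used_percent(info))    # 25.0
```

With these sample numbers, swapratio=20% would be exceeded (25% of swap is in use), while free=100M would not (roughly 1.9 GB is reclaimable).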

`sysguard_rt`

Monitors the average request processing time. This directive protects against backend slowdowns that have not yet affected CPU load or memory.

Syntax: sysguard_rt rt=<seconds> period=<time> [method=AMM|WMA[:<number>]] [action=<uri>];
Default: not set
Context: http, server, location

# Trigger when average response time exceeds 500ms over 10 seconds
sysguard_rt rt=0.5 period=10s method=AMM:20 action=/slow-response;

Two averaging methods are available:

AMM (Arithmetic Mean Method) — simple average of all requests within the time window. Best for stable traffic patterns.
WMA (Weighted Moving Average) — gives more weight to recent requests. Better for detecting sudden slowdowns because older samples have less influence.

The optional number after the method (e.g., AMM:20) sets the sample buffer size. A larger buffer provides smoother averages but responds slower to changes.
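The difference between the two methods is easiest to see on a tiny sample buffer. The Python sketch below is an illustration only: the `ResponseTimeWindow` class is hypothetical, and the linear weighting (oldest sample weight 1, newest weight n) is an assumed scheme that may not match the module's exact weights.

```python
from collections import deque


class ResponseTimeWindow:
    """Fixed-size buffer of recent response times, in seconds."""

    def __init__(self, size: int) -> None:
        self._samples = deque(maxlen=size)  # oldest sample drops out when full

    def add(self, seconds: float) -> None:
        self._samples.append(seconds)

    def amm(self) -> float:
        """Arithmetic mean: every sample counts equally."""
        if not self._samples:
            return 0.0
        return sum(self._samples) / len(self._samples)

    def wma(self) -> float:
        """Linearly weighted mean: oldest sample has weight 1, newest has weight n."""
        if not self._samples:
            return 0.0
        weights = range(1, len(self._samples) + 1)
        total = sum(w * s for w, s in zip(weights, self._samples))
        return total / sum(weights)


window = ResponseTimeWindow(size=4)
for rt in (0.1, 0.1, 0.1, 0.9):  # the last request is suddenly slow
    window.add(rt)
print(round(window.amm(), 3))  # 0.3
print(round(window.wma(), 3))  # 0.42
```

The same four samples give a higher weighted average because the slow request is the most recent one, which is why a weighted method reacts sooner to a sudden slowdown.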

`sysguard_mode`

Controls how multiple thresholds interact.

Syntax: sysguard_mode or | and;
Default: or
Context: http, server, location

# OR mode (default): trigger if ANY threshold is exceeded
sysguard_mode or;

# AND mode: trigger only if ALL thresholds are exceeded
sysguard_mode and;

Use `or` mode for maximum protection: any single metric breaching its limit triggers the overload response. Use `and` mode to avoid false positives when you want multiple conditions confirmed before rejecting traffic.
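The two modes boil down to "any" versus "all". A minimal Python sketch of that decision, with a hypothetical `should_reject` helper and check names chosen for illustration:

```python
def should_reject(exceeded: dict, mode: str = "or") -> bool:
    """`exceeded` maps each configured check to whether it is over its limit."""
    if not exceeded:
        return False  # nothing configured, nothing to trip
    if mode == "or":
        return any(exceeded.values())
    if mode == "and":
        return all(exceeded.values())
    raise ValueError(f"unknown mode: {mode!r}")


state = {"load": True, "free_memory": False}
print(should_reject(state, "or"))   # True
print(should_reject(state, "and"))  # False
```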

`sysguard_interval`

Sets how often sysguard refreshes its cached system metrics.

Syntax: sysguard_interval <time>;
Default: 1s
Context: http, server, location

sysguard_interval 5s;

A longer interval reduces system call overhead but makes the module less responsive to rapid changes. For most production servers, 1–5 seconds is appropriate.

`sysguard_log_level`

Controls the log severity level when protection triggers.

Syntax: sysguard_log_level info | notice | warn | error;
Default: error
Context: http, server, location

sysguard_log_level warn;

When sysguard rejects a request, it logs a message like:

sysguard load limited, current:2.450 conf:1.500

Set this to `warn` in production to keep overload events visible without flooding your error log.
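If you want to count or alert on these events, the message can be extracted from the error log with a regular expression. The Python sketch below assumes only the load message format shown above; messages for memory or response-time limits are not shown in this article and are not handled. The `parse_load_limited` name is hypothetical.

```python
import re

LOAD_LIMITED = re.compile(
    r"sysguard load limited, current:(?P<current>\d+\.\d+) conf:(?P<conf>\d+\.\d+)"
)


def parse_load_limited(line: str):
    """Return (current, configured) as floats, or None if the line does not match."""
    match = LOAD_LIMITED.search(line)
    if match is None:
        return None
    return float(match["current"]), float(match["conf"])


print(parse_load_limited("sysguard load limited, current:2.450 conf:1.500"))
# (2.45, 1.5)
```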

Practical Configuration Examples

Basic NGINX Sysguard Setup

The simplest useful configuration protects against CPU overload:

server {
    listen 80;
    server_name example.com;

    sysguard on;
    sysguard_load load=ncpu*1.5 action=@guard;
    sysguard_interval 5s;

    location / {
        proxy_pass http://backend;
    }

    location @guard {
        add_header Retry-After 30 always;
        return 503 "Service temporarily unavailable. Please retry after 30 seconds.";
    }
}

The Retry-After header tells well-behaved clients (and search engine crawlers) when to try again. This prevents them from hammering your server during an overload event.
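On the client side, honouring the header might look like the Python sketch below. The `retry_delay` helper is hypothetical; it accepts both forms that HTTP allows for Retry-After (a number of seconds or an HTTP date) and, as a simplification, expects the header name in exactly this capitalization.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Mapping, Optional


def retry_delay(status: int, headers: Mapping[str, str],
                now: Optional[datetime] = None) -> Optional[float]:
    """Seconds to wait before retrying, or None if no retry hint applies."""
    value = headers.get("Retry-After")
    if status != 503 or value is None:
        return None
    value = value.strip()
    if value.isdigit():                      # delta-seconds form, e.g. "30"
        return float(value)
    try:
        when = parsedate_to_datetime(value)  # HTTP-date form
    except (TypeError, ValueError):
        return None                          # unparseable hint: ignore it
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return max(0.0, (when - now).total_seconds())


print(retry_delay(503, {"Retry-After": "30"}))  # 30.0
print(retry_delay(200, {"Retry-After": "30"}))  # None
```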

Comprehensive Production Setup

For production servers, monitor all three metrics and use a custom log format to track system health:

log_format sysguard_log '$remote_addr [$time_local] "$request" $status '
    'load=$sysguard_load free=$sysguard_free '
    'swap=$sysguard_swapstat rt=$sysguard_rt';

server {
    listen 80;
    server_name example.com;

    access_log /var/log/nginx/sysguard.log sysguard_log;

    sysguard on;
    sysguard_mode or;
    sysguard_load load=ncpu*1.5 action=@overloaded;
    sysguard_mem swapratio=30% action=@overloaded;
    sysguard_mem free=100M action=@overloaded;
    sysguard_rt rt=2.0 period=30s method=WMA:50 action=@overloaded;
    sysguard_interval 5s;
    sysguard_log_level warn;

    location / {
        proxy_pass http://backend;
    }

    location @overloaded {
        add_header Retry-After 30 always;
        return 503 "Service temporarily unavailable. Please retry after 30 seconds.";
    }
}

This NGINX sysguard configuration triggers overload protection when any of these conditions occurs:

CPU load exceeds 1.5x the number of cores
Swap usage exceeds 30%
Free memory drops below 100 MB
Average response time exceeds 2 seconds (weighted moving average over 30s)

Per-Location Thresholds

Different parts of your application may have different tolerances. API endpoints typically need stricter protection than static content:

server {
    listen 80;
    server_name example.com;

    # Global: generous threshold for static content
    sysguard on;
    sysguard_load load=ncpu*2 action=@guard;

    location / {
        root /var/www/html;
        index index.html;
    }

    # Stricter thresholds for API endpoints
    location /api/ {
        sysguard on;
        sysguard_load load=ncpu*1.0 action=@guard;
        sysguard_rt rt=1.0 period=15s method=AMM action=@guard;
        proxy_pass http://api_backend;
    }

    location @guard {
        default_type application/json;
        return 503 '{"error": "service_unavailable", "retry_after": 30}';
    }
}

This protects your API at a lower load threshold (1x core count) than the rest of the site (2x core count). Additionally, it adds response time monitoring specifically for API requests.

AND Mode: Avoiding False Positives

Sometimes a single metric can spike temporarily without meaning the server is truly overloaded. AND mode requires all conditions to be met:

server {
    listen 80;
    server_name example.com;

    sysguard on;
    sysguard_mode and;
    sysguard_load load=ncpu*2 action=@guard;
    sysguard_mem free=50M action=@guard;
    sysguard_interval 2s;

    location / {
        proxy_pass http://backend;
    }

    location @guard {
        return 503;
    }
}

Protection only triggers when both load exceeds 2x core count and free memory drops below 50 MB. A temporary load spike alone will not reject requests.

Exported Variables

The sysguard module exports variables you can use in logging, headers, and conditional logic. These variables are only populated when their corresponding directives are configured.

Variable	Description	Units	Requires
`$sysguard_load`	Current 1-minute load average	Thousandths (219 = 0.219)	`sysguard_load`
`$sysguard_free`	Available memory (Free + Buffers + Cached)	Bytes	`sysguard_mem`
`$sysguard_swapstat`	Swap usage percentage	0–100	`sysguard_mem`
`$sysguard_rt`	Average request processing time	Milliseconds	`sysguard_rt`
`$sysguard_meminfo_totalram`	Total physical memory	Bytes	`sysguard_mem`
`$sysguard_meminfo_freeram`	Free physical memory	Bytes	`sysguard_mem`
`$sysguard_meminfo_bufferram`	Buffer memory	Bytes	`sysguard_mem`
`$sysguard_meminfo_cachedram`	Cached memory	Bytes	`sysguard_mem`
`$sysguard_meminfo_totalswap`	Total swap space	Bytes	`sysguard_mem`
`$sysguard_meminfo_freeswap`	Free swap space	Bytes	`sysguard_mem`

Important: The $sysguard_load value is in thousandths. A value of 1500 means a load average of 1.5. Memory variables report values in bytes.

Health Monitoring Endpoint

Use the exported variables to build a lightweight health check endpoint for your NGINX sysguard setup:

server {
    listen 80;
    server_name example.com;

    sysguard on;
    sysguard_load load=ncpu*1.5 action=@guard;
    sysguard_mem free=100M action=@guard;
    sysguard_interval 5s;

    location / {
        proxy_pass http://backend;
    }

    location /health {
        default_type application/json;
        return 200 '{"load": "$sysguard_load", "free_memory": "$sysguard_free", "swap": "$sysguard_swapstat"}';
    }

    location @guard {
        return 503;
    }
}

This /health endpoint can be polled by load balancers or external monitoring tools to observe system metrics without additional monitoring agents. It complements the traffic statistics provided by modules such as NGINX VTS.
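A poller consuming this endpoint has to remember that the template quotes every value, so the fields arrive as JSON strings, and that the load is in thousandths. A minimal Python sketch, assuming the response body has exactly the shape configured above; the `is_healthy` helper and its limits are hypothetical:

```python
import json


def is_healthy(body: str, max_load: float, min_free_bytes: int) -> bool:
    """Decide health from the /health response body."""
    data = json.loads(body)
    load = int(data["load"]) / 1000  # quoted string in thousandths -> load average
    free = int(data["free_memory"])  # quoted string -> bytes
    return load <= max_load and free >= min_free_bytes


sample = '{"load": "219", "free_memory": "2402320384", "swap": "0"}'
print(is_healthy(sample, max_load=3.0, min_free_bytes=100 * 1024 * 1024))  # True
```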

Custom Log Format for Monitoring

Track sysguard metrics alongside standard access logs:

log_format sysguard_log '$remote_addr - $status '
    'load=$sysguard_load free=$sysguard_free '
    'swap=$sysguard_swapstat rt=$sysguard_rt';

server {
    listen 80;
    server_name example.com;

    access_log /var/log/nginx/sysguard.log sysguard_log;

    sysguard on;
    sysguard_load load=ncpu*1.5 action=@guard;
    sysguard_mem free=100M action=@guard;
    sysguard_interval 5s;

    location / {
        proxy_pass http://backend;
    }

    location @guard {
        return 503;
    }
}

Sample log output:

192.168.1.100 - 200 load=219 free=2402320384 swap=0 rt=0
192.168.1.101 - 503 load=4500 free=52428800 swap=45 rt=1250

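In this example, the second request was rejected: the load (4.5) exceeded the threshold on a 2-core machine (ncpu*1.5 = 3.0), and free memory (52428800 bytes, or 50 MB) had also dropped below the 100M limit. In `or` mode, either condition alone is enough.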
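Lines in this format are straightforward to post-process. The Python sketch below parses the two sample lines and converts the load back to its familiar decimal form. The regular expression matches only the sysguard_log format defined in this section, and `parse_line` is a hypothetical helper.

```python
import re

LINE = re.compile(
    r"(?P<addr>\S+) - (?P<status>\d{3}) "
    r"load=(?P<load>\d+) free=(?P<free>\d+) "
    r"swap=(?P<swap>\d+) rt=(?P<rt>\d+)"
)


def parse_line(line: str) -> dict:
    """Parse one sysguard_log line into typed fields."""
    match = LINE.fullmatch(line.strip())
    if match is None:
        raise ValueError(f"unrecognized line: {line!r}")
    return {
        "addr": match["addr"],
        "status": int(match["status"]),
        "load": int(match["load"]) / 1000,  # thousandths -> load average
        "free_bytes": int(match["free"]),
        "swap_percent": int(match["swap"]),
        "rt_ms": int(match["rt"]),
    }


lines = [
    "192.168.1.100 - 200 load=219 free=2402320384 swap=0 rt=0",
    "192.168.1.101 - 503 load=4500 free=52428800 swap=45 rt=1250",
]
threshold = 2 * 1.5  # ncpu*1.5 on the 2-core machine from the example
for entry in map(parse_line, lines):
    print(entry["addr"], entry["status"], entry["load"], entry["load"] > threshold)
# 192.168.1.100 200 0.219 False
# 192.168.1.101 503 4.5 True
```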

NGINX Sysguard vs. Rate Limiting

The sysguard module and rate limiting serve different purposes and work best together:

Feature	Sysguard	Rate Limiting
Protects	The server itself	Per-client fairness
Monitors	CPU, memory, response time	Request count per client
Triggers	System-level thresholds	Per-IP request rate
Use case	Prevent server collapse	Prevent abuse from individual clients

Use rate limiting to prevent individual clients from consuming too many resources. Use the sysguard module as a safety net that catches any overload condition — regardless of its source.

Performance Considerations

The sysguard module adds minimal overhead to request processing:

System calls are cached. The getloadavg() and /proc/meminfo reads happen once per sysguard_interval, not per request. With the default 1-second interval and 10,000 requests per second, only 1 in 10,000 requests triggers a system call.
Decision logic is simple. The preaccess handler performs integer comparisons against cached values — negligible CPU cost.
Response time tracking uses a ring buffer. The circular buffer for RT calculations has O(1) insert and fixed memory usage regardless of traffic volume.

For high-traffic servers, increase sysguard_interval to 5s or 10s to further reduce system call frequency. The trade-off is slower response to sudden load changes.

Troubleshooting

Module Does Not Trigger

Verify the module is loaded:

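# Statically compiled modules appear in the configure arguments:
nginx -V 2>&1 | grep sysguard
# Separately packaged dynamic modules may not; search the full configuration dump for load_module and sysguard directives:
nginx -T 2>&1 | grep sysguard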

Confirm `sysguard on` is set in the correct context (http, server, or location).
Check the configured threshold against actual system load:

# Current load average
uptime

# Memory status
free -m

Decrease sysguard_interval to ensure metrics are fresh.
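For a quick comparison of the live load against an ncpu-based threshold, a few lines of Python are enough. The `load_headroom` helper is hypothetical; it reads the 1-minute load average with `os.getloadavg()` (available on Unix-like systems only) unless you pass values explicitly.

```python
import os


def load_headroom(multiplier: float, load1=None, ncpu=None) -> float:
    """Threshold minus current 1-minute load; negative means over the limit."""
    if load1 is None:
        load1 = os.getloadavg()[0]  # Unix-only
    if ncpu is None:
        ncpu = os.cpu_count() or 1
    return ncpu * multiplier - load1


# Explicit values: 2 cores, ncpu*1.5 threshold, current load 4.5
print(load_headroom(1.5, load1=4.5, ncpu=2))  # -1.5
```

Calling `load_headroom(1.5)` with no other arguments uses the current machine's values. A result that stays positive under load suggests the threshold is simply never reached.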

Memory Variables Show Zero

Memory-related variables ($sysguard_free, $sysguard_swapstat, etc.) are only populated when sysguard_mem is configured. If you only use sysguard_load, the memory variables will return empty or zero values.

Similarly, $sysguard_load requires sysguard_load to be configured, and $sysguard_rt requires sysguard_rt.

Understanding the Load Value

The $sysguard_load variable reports load in thousandths. To convert to the familiar decimal format:

$sysguard_load = 1500 → actual load = 1.5
$sysguard_load = 219 → actual load = 0.219

The error log, however, uses decimal format:

sysguard load limited, current:2.450 conf:1.500

Overload Protection Triggers Too Often

If sysguard is rejecting too many requests:

Increase the load threshold. A value of ncpu*1.5 is a good starting point, but some workloads tolerate higher load.
Switch from `or` mode to `and` mode so that multiple conditions must be met.
Increase sysguard_interval to smooth out transient spikes.
Consider using WMA instead of AMM for response time — it adapts faster to recovery.

Security Best Practices

The NGINX sysguard module is a defense-in-depth layer. Combine it with other security measures for comprehensive server protection:

Rate limiting prevents individual clients from causing overload
ModSecurity WAF blocks malicious requests before they consume resources
Connection limits (limit_conn) cap concurrent connections per client
Caching reduces backend load by serving cached responses

The sysguard module acts as the last line of defense — when everything else fails to prevent overload, it ensures the server degrades gracefully rather than crashing.

Conclusion

The NGINX sysguard module provides automatic, system-aware overload protection that keeps your server responsive under pressure. Instead of letting traffic spikes bring down your entire application, sysguard rejects excess requests gracefully with proper HTTP 503 responses and Retry-After headers.

Key takeaways for configuring NGINX sysguard:

Use ncpu* multipliers for portable load thresholds across different servers
Monitor all three metrics (load, memory, response time) for comprehensive protection
Start with `or` mode and tune thresholds before switching to `and` mode
Use the exported variables in log formats and health endpoints for observability
Combine sysguard with rate limiting and caching for defense in depth

The module is available from the GetPageSpeed repository for RHEL-based distributions and from the APT repository for Debian/Ubuntu. The source code is on GitHub.
