The Problem: Rate Limits That Don’t Scale
You have deployed NGINX’s built-in limit_req module to protect your API from abuse. It works — until you scale to multiple NGINX instances behind a load balancer. Suddenly, each server maintains its own separate counters in local shared memory. An attacker sending 100 requests per second can spread them across 5 servers, and each instance sees only 20 — well under your configured limit. Without an NGINX Redis rate limit approach, your rate limiting is effectively bypassed.
This is not a hypothetical scenario. Every multi-server deployment using local-only rate limiting has this blind spot. Additionally, NGINX’s native limit_req does not provide X-RateLimit-* response headers, so API clients have no way to know how close they are to being throttled.
The Solution: NGINX Redis Rate Limit Module
The NGINX Redis rate limit module solves both problems by storing all counters in a centralized Redis instance. Every NGINX server checks the same counters, so an attacker cannot exploit per-instance gaps. The module also implements the Generic Cell Rate Algorithm (GCRA), a sophisticated token bucket variant that provides smoother, fairer request throttling than the leaky bucket approach in native limit_req. Moreover, it includes standard X-RateLimit-* response headers that API clients can use to self-throttle.
When to Choose This Module Over Native limit_req
For comparison with another Redis-backed approach, see our article on the NGINX Dynamic Limit Req Module. The table below compares this module with native limit_req:
| Feature | Native limit_req | Redis Rate Limit Module |
|---|---|---|
| State storage | Local shared memory | Centralized Redis |
| Multi-instance | Each instance independent | Shared counters across all instances |
| Algorithm | Leaky bucket | GCRA (smoother, fairer) |
| Rate limit headers | Not built-in | X-RateLimit-* headers included |
| Quota peeking | Not supported | Check remaining quota without consuming |
| Key prefixing | Via zone name | Dynamic prefix per location |
| Persistence | Lost on restart | Survives NGINX restarts |
Use native limit_req when you run a single NGINX instance and need simple, zero-dependency rate limiting.
Use the Redis Rate Limit module when you need distributed rate limiting across multiple servers. Additionally, choose it when you require standard rate limit headers for API clients or want precise burst control.
How the GCRA Algorithm Works
The GCRA treats rate limiting as a token bucket that refills at a constant rate. Unlike simple implementations, it calculates token availability dynamically based on elapsed time. Therefore, no background “drip” process is needed.
Here is how it processes each request:
- The algorithm checks whether enough tokens are available
- If tokens are available, it allows the request and decrements the counter
- If no tokens remain, it rejects the request with a Retry-After time
- Tokens refill automatically based on the configured rate
For example, with requests=10 period=1m burst=5, you get:
- Initial capacity: 6 requests (burst + 1) available immediately
- Sustained rate: 10 requests per minute
- Maximum burst: 5 extra tokens stored during idle periods
- Reported limit: 6 in the X-RateLimit-Limit header
A client can make 6 rapid requests, then must wait for tokens to refill at the sustained rate.
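The steps above can be sketched in a few lines of Python. This is an illustrative simulation of GCRA, not the module's C implementation; it tracks a single "theoretical arrival time" (TAT) per key instead of a token counter, which is why no background drip process is needed:

```python
def gcra_allow(state, key, requests, period, burst, now):
    """One GCRA decision for a single request. Returns (allowed, remaining)."""
    interval = period / requests            # seconds "cost" of one request
    tolerance = burst * interval            # how far TAT may run ahead of now
    limit = burst + 1                       # the value reported in X-RateLimit-Limit
    tat = max(state.get(key, now), now)     # theoretical arrival time for this key
    if tat - now > tolerance:
        return False, 0                     # would exceed the burst: reject
    state[key] = tat + interval             # consume one "token"
    remaining = limit - round((state[key] - now) / interval)
    return True, remaining

# requests=10 period=60s burst=5 -> 6 immediate requests allowed, then rejections
state = {}
results = [gcra_allow(state, "client-ip", 10, 60, 5, 0.0) for _ in range(8)]
# allowed pattern: six True, then False; remaining counts down 5..0
```

Six seconds later (one emission interval at 10 requests per minute), one more request passes: the refill falls out of the elapsed-time arithmetic, with no timer involved.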
Prerequisites
This NGINX Redis rate limit module requires a Redis (or Valkey) server with the redis-rate-limiter extension loaded. This is a lightweight C extension that implements the RATER.LIMIT command.
Installing redis-rate-limiter
Build the Redis module from source:
git clone https://github.com/onsigntv/redis-rate-limiter.git
cd redis-rate-limiter
make
sudo cp ratelimit.so /usr/lib64/
Add it to your Redis configuration (/etc/redis.conf or /etc/valkey/valkey.conf):
loadmodule /usr/lib64/ratelimit.so
Restart Redis and verify:
sudo systemctl restart redis
redis-cli MODULE LIST
You should see rater in the output. Test the command:
redis-cli RATER.LIMIT testkey 5 10 60 1
This returns five integers: status (0=allowed), limit, remaining, retry-after, and reset time.
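For quick scripting against the raw reply, the five integers can be mapped to named fields. The helper below is a hypothetical convenience for illustration (field order taken from the description above), not part of the module:

```python
def parse_rater_reply(reply):
    """Name the five integers returned by RATER.LIMIT (order per the docs above)."""
    status, limit, remaining, retry_after, reset = reply
    return {
        "allowed": status == 0,        # 0 means the request is allowed
        "limit": limit,                # maximum requests (burst + 1)
        "remaining": remaining,        # tokens left in the window
        "retry_after": retry_after,    # seconds to wait when rejected
        "reset": reset,                # seconds until the limit fully resets
    }

# Hypothetical replies, for illustration only:
allowed = parse_rater_reply([0, 6, 5, 0, 6])
rejected = parse_rater_reply([1, 6, 0, 3, 30])
```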
Installation
RHEL, CentOS, AlmaLinux, Rocky Linux
sudo dnf install https://extras.getpagespeed.com/release-latest.rpm
sudo dnf install nginx-module-redis-rate-limit
Load the module in /etc/nginx/nginx.conf at the top level (before the events block):
load_module modules/ngx_http_rate_limit_module.so;
Debian and Ubuntu
First, set up the GetPageSpeed APT repository, then install:
sudo apt-get update
sudo apt-get install nginx-module-redis-rate-limit
On Debian/Ubuntu, the package handles module loading automatically. No load_module directive is needed.
Module package pages:
– RPM: nginx-module-redis-rate-limit
– APT: nginx-module-redis-rate-limit
Basic Configuration
The simplest NGINX Redis rate limit setup requires an upstream block pointing to Redis and the rate_limit directive in a location:
upstream redis {
server 127.0.0.1:6379;
keepalive 1024;
}
server {
listen 80;
server_name example.com;
location /api/ {
rate_limit $remote_addr requests=15 period=1m burst=20;
rate_limit_pass redis;
rate_limit_headers on;
rate_limit_status 429;
proxy_pass http://backend;
}
}
This configuration limits each IP to 15 requests per minute with a burst of 20. It stores state in Redis, returns X-RateLimit-* headers, and responds with HTTP 429 when the limit is exceeded.
Understanding the Response Headers
When rate_limit_headers is enabled, every response includes:
X-RateLimit-Limit: 21
X-RateLimit-Remaining: 20
X-RateLimit-Reset: 4
- X-RateLimit-Limit: Maximum requests allowed (burst + 1)
- X-RateLimit-Remaining: Requests left in the current window
- X-RateLimit-Reset: Seconds until the limit fully resets
When a request is rate-limited (HTTP 429), an additional header appears:
Retry-After: 3
This tells the client exactly how many seconds to wait before retrying.
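A well-behaved client can combine these headers into a simple wait policy. The sketch below is illustrative: the header names match the module's output, but the back-off policy itself is the client's own design choice:

```python
def backoff_seconds(status, headers):
    """How long a client should wait before its next request.

    Header names come from the module's output; the waiting policy
    (sleeping proactively when quota hits zero) is an illustrative choice.
    """
    if status == 429:
        # The server states exactly how long to wait.
        return int(headers.get("Retry-After", 1))
    if int(headers.get("X-RateLimit-Remaining", 1)) == 0:
        # Quota exhausted but not yet throttled: wait for the window to reset.
        return int(headers.get("X-RateLimit-Reset", 0))
    return 0  # quota available: no delay needed
```

A client loop would call this after every response and sleep for the returned number of seconds before retrying.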
Directive Reference
All directives work in http, server, and location contexts.
rate_limit
Syntax: rate_limit $key requests=N period=Xm burst=N;
Default: —
Context: http, server, location
The main directive enabling rate limiting. The first argument is the rate limit key — typically $remote_addr for per-IP limiting. Parameters:
- requests — Requests allowed per period (default: 1)
- period — Time window, for example 1m, 30s, 1h (default: 60s)
- burst — Additional burst capacity (default: 0)
# 100 requests per minute with burst of 50
rate_limit $remote_addr requests=100 period=1m burst=50;
rate_limit_pass
Syntax: rate_limit_pass upstream_name;
Default: —
Context: http, server, location
Specifies the Redis upstream for rate limit state. Accepts a static name or a variable:
rate_limit_pass redis;
rate_limit_pass $redis_upstream;
rate_limit_status
Syntax: rate_limit_status code;
Default: 429
Context: http, server, location
HTTP status code for rate-limited requests. Must be 400–599.
rate_limit_status 503;
rate_limit_headers
Syntax: rate_limit_headers on | off;
Default: off
Context: http, server, location
Enables X-RateLimit-* and Retry-After headers. When off, headers only appear on 429 responses. When on, they appear on every response.
rate_limit_prefix
Syntax: rate_limit_prefix string;
Default: ""
Context: http, server, location
Prepends a prefix to the Redis key, separated by underscore. This creates separate counters for different endpoints sharing the same key variable:
location /api/ {
rate_limit $remote_addr requests=100 period=1m burst=50;
rate_limit_prefix api;
rate_limit_pass redis;
}
location /login {
rate_limit $remote_addr requests=5 period=5m burst=2;
rate_limit_prefix login;
rate_limit_pass redis;
}
Without prefixes, both locations share the same counter per IP. With prefixes, api_192.168.1.1 and login_192.168.1.1 are tracked independently.
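The key derivation is straightforward to picture. A tiny illustrative sketch of the underscore join described above (the helper is hypothetical, not the module's code):

```python
def redis_key(prefix, key):
    """Build the Redis key as the module does: prefix and key joined by '_'."""
    return f"{prefix}_{key}" if prefix else key

# Same client IP, two independent counters:
api_key = redis_key("api", "192.168.1.1")      # "api_192.168.1.1"
login_key = redis_key("login", "192.168.1.1")  # "login_192.168.1.1"
```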
rate_limit_quantity
Syntax: rate_limit_quantity number;
Default: 1
Context: http, server, location
Tokens consumed per request. Set to 0 to check status without consuming quota:
location /api/quota {
rate_limit $remote_addr requests=100 period=1m burst=50;
rate_limit_prefix api;
rate_limit_quantity 0;
rate_limit_pass redis;
rate_limit_headers on;
proxy_pass http://backend;
}
rate_limit_log_level
Syntax: rate_limit_log_level info | notice | warn | error;
Default: error
Context: http, server, location
Logging severity when a request is rate-limited. Use warn or notice for high-traffic endpoints to reduce log noise.
rate_limit_connect_timeout
Syntax: rate_limit_connect_timeout time;
Default: 60s
Context: http, server, location
Timeout for establishing a Redis connection.
rate_limit_send_timeout
Syntax: rate_limit_send_timeout time;
Default: 60s
Context: http, server, location
Timeout for sending the request to Redis.
rate_limit_read_timeout
Syntax: rate_limit_read_timeout time;
Default: 60s
Context: http, server, location
Timeout for reading the Redis response.
rate_limit_buffer_size
Syntax: rate_limit_buffer_size size;
Default: 4k
Context: http, server, location
Buffer size for the Redis response. The default is sufficient for normal use.
Real-World Configuration Examples
API Gateway with Tiered Rate Limits
Use NGINX’s map directive to assign different NGINX Redis rate limit keys based on authentication:
upstream redis {
server 127.0.0.1:6379;
keepalive 1024;
}
upstream api_backend {
server 127.0.0.1:8080;
}
map $http_x_api_key $api_rate_key {
default $remote_addr;
"~^premium-" "premium_$http_x_api_key";
"~^basic-" "basic_$http_x_api_key";
}
server {
listen 80;
server_name api.example.com;
rate_limit_status 429;
location /api/v1/public/ {
rate_limit $remote_addr requests=30 period=1m burst=10;
rate_limit_pass redis;
rate_limit_prefix pub;
rate_limit_headers on;
proxy_pass http://api_backend;
}
location /api/v1/ {
rate_limit $api_rate_key requests=60 period=1m burst=30;
rate_limit_pass redis;
rate_limit_prefix api;
rate_limit_headers on;
proxy_pass http://api_backend;
}
}
Login Brute-Force Protection
Apply aggressive limits to authentication endpoints. This is similar to how NGINX TOTP authentication protects sensitive resources:
upstream redis {
server 127.0.0.1:6379;
keepalive 1024;
}
upstream app_backend {
server 127.0.0.1:3000;
}
server {
listen 80;
server_name example.com;
rate_limit_status 429;
location /login {
rate_limit $remote_addr requests=5 period=5m burst=2;
rate_limit_pass redis;
rate_limit_prefix login;
rate_limit_headers on;
rate_limit_log_level warn;
proxy_pass http://app_backend;
}
location /api/auth/token {
rate_limit $remote_addr requests=10 period=10m burst=3;
rate_limit_pass redis;
rate_limit_prefix auth;
rate_limit_headers on;
proxy_pass http://app_backend;
}
location / {
proxy_pass http://app_backend;
}
}
Each IP can attempt only 3 logins immediately (burst + 1), then 5 more over 5 minutes. The Retry-After header tells clients when to try again.
Whitelisting Internal Networks
Use NGINX’s geo and map directives to skip rate limiting for trusted networks:
upstream redis {
server 127.0.0.1:6379;
keepalive 1024;
}
upstream app_backend {
server 127.0.0.1:8080;
}
geo $limit {
default 1;
10.0.0.0/8 0;
172.16.0.0/12 0;
192.168.0.0/16 0;
}
map $limit $limit_key {
0 "";
1 $remote_addr;
}
server {
listen 80;
server_name example.com;
location / {
rate_limit $limit_key requests=30 period=1m burst=15;
rate_limit_pass redis;
rate_limit_headers on;
rate_limit_status 429;
proxy_pass http://app_backend;
}
}
When $limit_key is empty, the rate limit check is skipped. Internal services communicate freely while external traffic remains protected.
Production Timeout Tuning
In high-traffic environments, reduce Redis timeouts to fail fast:
upstream redis {
server 127.0.0.1:6379;
keepalive 1024;
}
server {
listen 80;
server_name example.com;
rate_limit_connect_timeout 100ms;
rate_limit_send_timeout 100ms;
rate_limit_read_timeout 100ms;
location /api/ {
rate_limit $remote_addr requests=60 period=1m burst=30;
rate_limit_pass redis;
rate_limit_headers on;
rate_limit_status 429;
proxy_pass http://api_backend;
}
}
Important Caveats
The return Directive Bypasses Rate Limiting
The NGINX return directive runs during the rewrite phase, before the preaccess phase where rate limiting executes. As a result, return short-circuits the rate limit check entirely:
# WARNING: rate limiting will NOT work here!
location /api {
rate_limit $remote_addr requests=10 period=1m burst=5;
rate_limit_pass redis;
return 200 "OK"; # Bypasses rate limiting!
}
Instead, use proxy_pass or try_files:
# CORRECT: rate limiting works with proxy_pass
location /api {
rate_limit $remote_addr requests=10 period=1m burst=5;
rate_limit_pass redis;
proxy_pass http://backend;
}
SELinux on RHEL-Based Systems
If NGINX cannot connect to Redis, you may see “Permission denied” errors in the NGINX error log. SELinux blocks the connection by default. Allow it with:
sudo setsebool -P httpd_can_network_connect 1
Redis High Availability
If Redis becomes unavailable, the module returns HTTP 503 (Service Temporarily Unavailable). Handle this gracefully with a fallback location:
error_page 502 503 = @fallback;
location @fallback {
# Allow the request through without rate limiting
proxy_pass http://backend;
}
Note that both 502 and 503 should be caught — the module may return either depending on the failure mode. For production, consider Redis Sentinel or Redis Cluster for automatic failover.
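Putting this together, here is a sketch of how the fallback might be wired into a full server block. The upstream and backend names are illustrative, and the fail-open policy is a design choice, not a module requirement:

```nginx
upstream redis {
    server 127.0.0.1:6379;
    keepalive 1024;
}

server {
    listen 80;

    location /api/ {
        rate_limit $remote_addr requests=60 period=1m burst=30;
        rate_limit_pass redis;
        rate_limit_headers on;

        # If the rate limit check fails (Redis down), fail open via @fallback
        # instead of returning an error to the client.
        error_page 502 503 = @fallback;
        proxy_pass http://backend;
    }

    location @fallback {
        # Allow the request through without rate limiting.
        proxy_pass http://backend;
    }
}
```

One trade-off to be aware of: error_page here also catches 502/503 responses originating from the backend itself, so the fallback will retry those requests once without rate limiting.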
Testing Your Configuration
After configuring rate limiting, verify everything works. First, test syntax:
sudo nginx -t
Then reload and send test requests:
sudo systemctl reload nginx
Observe the NGINX Redis rate limit headers by sending a burst:
for i in $(seq 1 10); do
echo "=== Request $i ==="
curl -sI http://localhost/api/ | grep -E "HTTP/|X-RateLimit|Retry"
done
You should see X-RateLimit-Remaining decrease with each request. Eventually, HTTP 429 responses appear with Retry-After.
Inspect Redis keys directly:
redis-cli KEYS "*"
Peek at a key’s state without consuming a token:
redis-cli RATER.LIMIT mykey 10 30 60 0
Performance Considerations
Each rate-limited request adds one Redis round-trip. To minimize latency:
- Use keepalive connections: The keepalive 1024 directive avoids TCP handshake overhead
- Run Redis locally: A local round-trip adds less than 0.1ms
- Tune timeouts: Set connect/send/read timeouts to 100–500ms so Redis outages do not block requests
- Rate limit selectively: Apply limits only to sensitive endpoints (APIs, login pages, forms)
Troubleshooting
“unknown directive rate_limit”
The module is not loaded. Add to nginx.conf:
load_module modules/ngx_http_rate_limit_module.so;
Confirm the file exists:
ls /usr/lib64/nginx/modules/ngx_http_rate_limit_module.so
502 or 503 Errors on Rate-Limited Locations
NGINX cannot reach Redis. Check these items:
- Redis is running: redis-cli ping → PONG
- The redis-rate-limiter module is loaded: redis-cli MODULE LIST → rater
- SELinux allows connections: setsebool -P httpd_can_network_connect 1
- The upstream address and port are correct
Rate Limiting Not Activating
Requests always pass through without limiting? Check:
- You are not using return in the same location
- The rate limit key is not empty (empty keys skip the check)
- rate_limit_pass points to a valid upstream
Conclusion
The NGINX Redis rate limit module delivers distributed, centralized rate limiting across multiple NGINX instances. By leveraging the GCRA algorithm, it provides smoother throttling than simple counter approaches. The standard X-RateLimit-* headers help API clients self-throttle proactively.
For single-server setups, NGINX’s built-in limit_req may suffice. However, for multi-server architectures and API gateways, the NGINX Redis rate limit module is the right choice.
Source code: rate-limit-nginx-module on GitHub. Redis extension: redis-rate-limiter on GitHub.
