
NGINX Redis Rate Limit Module: Distributed Throttling



We have by far the largest RPM repository with NGINX module packages and VMODs for Varnish. If you want to install NGINX, Varnish, and lots of useful performance/security software with smooth yum upgrades for production use, this is the repository for you.
Active subscription is required.

The Problem: Rate Limits That Don’t Scale

You have deployed NGINX’s built-in limit_req module to protect your API from abuse. It works — until you scale to multiple NGINX instances behind a load balancer. Suddenly, each server maintains its own separate counters in local shared memory. An attacker sending 100 requests per second can spread them across 5 servers, and each instance sees only 20 — well under your configured limit. Without an NGINX Redis rate limit approach, your rate limiting is effectively bypassed.

This is not a hypothetical scenario. Every multi-server deployment using local-only rate limiting has this blind spot. Additionally, NGINX’s native limit_req does not provide X-RateLimit-* response headers, so API clients have no way to know how close they are to being throttled.

The Solution: NGINX Redis Rate Limit Module

The NGINX Redis rate limit module solves both problems by storing all counters in a centralized Redis instance. Every NGINX server checks the same counters, so an attacker cannot exploit per-instance gaps. The module also implements the Generic Cell Rate Algorithm (GCRA), a sophisticated token bucket variant that provides smoother, fairer request throttling than the leaky bucket approach in native limit_req. Moreover, it includes standard X-RateLimit-* response headers that API clients can use to self-throttle.

When to Choose This Module Over Native limit_req

The table below compares the two approaches. For another Redis-backed approach, see our article on the NGINX Dynamic Limit Req Module.

Feature              Native limit_req            Redis Rate Limit Module
State storage        Local shared memory         Centralized Redis
Multi-instance       Each instance independent   Shared counters across all instances
Algorithm            Leaky bucket                GCRA (smoother, fairer)
Rate limit headers   Not built-in                X-RateLimit-* headers included
Quota peeking        Not supported               Check remaining quota without consuming
Key prefixing        Via zone name               Dynamic prefix per location
Persistence          Lost on restart             Survives NGINX restarts

Use native limit_req when you run a single NGINX instance and need simple, zero-dependency rate limiting.

Use the Redis Rate Limit module when you need distributed rate limiting across multiple servers. Additionally, choose it when you require standard rate limit headers for API clients or want precise burst control.

How the GCRA Algorithm Works

The GCRA treats rate limiting as a token bucket that refills at a constant rate. Unlike simple implementations, it calculates token availability dynamically based on elapsed time. Therefore, no background “drip” process is needed.

Here is how it processes each request:

  1. The algorithm checks whether enough tokens are available
  2. If tokens are available, it allows the request and decrements the counter
  3. If no tokens remain, it rejects the request with a Retry-After time
  4. Tokens refill automatically based on the configured rate

For example, with requests=10 period=1m burst=5, you get:

  • Initial capacity: 6 requests (burst + 1) available immediately
  • Sustained rate: 10 requests per minute
  • Maximum burst: 5 extra tokens stored during idle periods
  • Reported limit: 6 in the X-RateLimit-Limit header

A client can make 6 rapid requests, then must wait for tokens to refill at the sustained rate.
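The arithmetic above is easy to verify with a short simulation. The sketch below is an illustrative Python model of GCRA using theoretical-arrival-time bookkeeping; it is not the module's actual C implementation:

```python
class GCRA:
    """Minimal Generic Cell Rate Algorithm sketch (illustration only)."""

    def __init__(self, requests: int, period: float, burst: int = 0):
        self.interval = period / requests        # emission interval T
        self.tolerance = burst * self.interval   # burst tolerance tau
        self.tat = 0.0                           # theoretical arrival time

    def allow(self, now: float) -> tuple[bool, float]:
        tat = max(self.tat, now)
        if tat - now > self.tolerance:           # budget exhausted
            retry_after = tat - now - self.tolerance
            return False, retry_after
        self.tat = tat + self.interval           # consume one token
        return True, 0.0

# requests=10 period=60s burst=5: 6 requests pass at t=0, the 7th is rejected
limiter = GCRA(requests=10, period=60, burst=5)
results = [limiter.allow(0.0)[0] for _ in range(7)]
print(results)  # [True, True, True, True, True, True, False]
```

Six calls at t=0 succeed and the seventh is rejected with a six-second retry delay, matching the burst + 1 capacity described above. Because availability is derived from elapsed time alone, no background refill process is needed.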

Prerequisites

The module requires a Redis (or Valkey) server with the redis-rate-limiter extension loaded, a lightweight C extension that implements the RATER.LIMIT command.

Installing redis-rate-limiter

Build the Redis module from source:

git clone https://github.com/onsigntv/redis-rate-limiter.git
cd redis-rate-limiter
make
sudo cp ratelimit.so /usr/lib64/

Add it to your Redis configuration (/etc/redis.conf or /etc/valkey/valkey.conf):

loadmodule /usr/lib64/ratelimit.so

Restart Redis and verify:

sudo systemctl restart redis
redis-cli MODULE LIST

You should see rater in the output. Test the command:

redis-cli RATER.LIMIT testkey 5 10 60 1

This returns five integers: status (0=allowed), limit, remaining, retry-after, and reset time.
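If you script against the extension directly (for monitoring or smoke tests), unpacking that reply into named fields keeps things readable. This is a hypothetical helper: only the field order follows the description above, and the sample values are invented for illustration:

```python
from dataclasses import dataclass


@dataclass
class RateLimitReply:
    allowed: bool     # status 0 means the request is allowed
    limit: int        # maximum requests (burst + 1)
    remaining: int    # tokens left in the current window
    retry_after: int  # seconds to wait before retrying, when rejected
    reset: int        # seconds until the bucket is full again


def parse_reply(reply: list[int]) -> RateLimitReply:
    """Unpack the five-integer RATER.LIMIT reply into named fields."""
    status, limit, remaining, retry_after, reset = reply
    return RateLimitReply(status == 0, limit, remaining, retry_after, reset)


# Invented sample reply for a key configured with requests=5 burst=10
r = parse_reply([0, 11, 10, 0, 6])
print(r.allowed, r.remaining)  # True 10
```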

Installation

RHEL, CentOS, AlmaLinux, Rocky Linux

sudo dnf install https://extras.getpagespeed.com/release-latest.rpm
sudo dnf install nginx-module-redis-rate-limit

Load the module in /etc/nginx/nginx.conf at the top level (before the events block):

load_module modules/ngx_http_rate_limit_module.so;

Debian and Ubuntu

First, set up the GetPageSpeed APT repository, then install:

sudo apt-get update
sudo apt-get install nginx-module-redis-rate-limit

On Debian/Ubuntu, the package handles module loading automatically. No load_module directive is needed.

Module package pages:
– RPM: nginx-module-redis-rate-limit
– APT: nginx-module-redis-rate-limit

Basic Configuration

The simplest NGINX Redis rate limit setup requires an upstream block pointing to Redis and the rate_limit directive in a location:

upstream redis {
    server 127.0.0.1:6379;
    keepalive 1024;
}

server {
    listen 80;
    server_name example.com;

    location /api/ {
        rate_limit $remote_addr requests=15 period=1m burst=20;
        rate_limit_pass redis;
        rate_limit_headers on;
        rate_limit_status 429;

        proxy_pass http://backend;
    }
}

This configuration limits each IP to 15 requests per minute with a burst of 20. It stores state in Redis, returns X-RateLimit-* headers, and responds with HTTP 429 when the limit is exceeded.

Understanding the Response Headers

When rate_limit_headers is enabled, every response includes:

X-RateLimit-Limit: 21
X-RateLimit-Remaining: 20
X-RateLimit-Reset: 4

  • X-RateLimit-Limit: Maximum requests allowed (burst + 1)
  • X-RateLimit-Remaining: Requests left in the current window
  • X-RateLimit-Reset: Seconds until the limit fully resets

When a request is rate-limited (HTTP 429), an additional header appears:

Retry-After: 3

This tells the client exactly how many seconds to wait before retrying.
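A well-behaved API client can use these headers to pace itself. The Python helper below is a hypothetical sketch; the pacing policy (spreading the remaining quota over the reset window) is one reasonable choice, not something the module prescribes:

```python
def seconds_to_wait(status: int, headers: dict[str, str]) -> float:
    """Decide how long a client should pause before its next request."""
    if status == 429:
        # The server says exactly when to retry
        return float(headers.get("Retry-After", 1))
    remaining = int(headers.get("X-RateLimit-Remaining", 1))
    reset = float(headers.get("X-RateLimit-Reset", 0))
    if remaining <= 0:
        return reset              # out of quota: wait for the window to reset
    return reset / remaining      # pace requests across the remaining window


print(seconds_to_wait(429, {"Retry-After": "3"}))  # 3.0
print(seconds_to_wait(200, {"X-RateLimit-Remaining": "20",
                            "X-RateLimit-Reset": "4"}))  # 0.2
```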

Directive Reference

All directives work in http, server, and location contexts.

rate_limit

Syntax: rate_limit $key requests=N period=Xm burst=N;
Default:
Context: http, server, location

The main directive enabling rate limiting. The first argument is the rate limit key — typically $remote_addr for per-IP limiting. Parameters:

  • requests — Requests allowed per period (default: 1)
  • period — Time window, for example 1m, 30s, 1h (default: 60s)
  • burst — Additional burst capacity (default: 0)

# 100 requests per minute with burst of 50
rate_limit $remote_addr requests=100 period=1m burst=50;

rate_limit_pass

Syntax: rate_limit_pass upstream_name;
Default:
Context: http, server, location

Specifies the Redis upstream for rate limit state. Accepts a static name or a variable:

rate_limit_pass redis;
rate_limit_pass $redis_upstream;

rate_limit_status

Syntax: rate_limit_status code;
Default: 429
Context: http, server, location

HTTP status code for rate-limited requests. Must be 400–599.

rate_limit_status 503;

rate_limit_headers

Syntax: rate_limit_headers on | off;
Default: off
Context: http, server, location

Enables X-RateLimit-* and Retry-After headers. When off, headers only appear on 429 responses. When on, they appear on every response.

rate_limit_prefix

Syntax: rate_limit_prefix string;
Default: ""
Context: http, server, location

Prepends a prefix to the Redis key, separated by underscore. This creates separate counters for different endpoints sharing the same key variable:

location /api/ {
    rate_limit $remote_addr requests=100 period=1m burst=50;
    rate_limit_prefix api;
    rate_limit_pass redis;
}

location /login {
    rate_limit $remote_addr requests=5 period=5m burst=2;
    rate_limit_prefix login;
    rate_limit_pass redis;
}

Without prefixes, both locations share the same counter per IP. With prefixes, api_192.168.1.1 and login_192.168.1.1 are tracked independently.

rate_limit_quantity

Syntax: rate_limit_quantity number;
Default: 1
Context: http, server, location

Tokens consumed per request. Set to 0 to check status without consuming quota:

location /api/quota {
    rate_limit $remote_addr requests=100 period=1m burst=50;
    rate_limit_prefix api;
    rate_limit_quantity 0;
    rate_limit_pass redis;
    rate_limit_headers on;
    proxy_pass http://backend;
}

rate_limit_log_level

Syntax: rate_limit_log_level info | notice | warn | error;
Default: error
Context: http, server, location

Logging severity when a request is rate-limited. Use warn or notice for high-traffic endpoints to reduce log noise.

rate_limit_connect_timeout

Syntax: rate_limit_connect_timeout time;
Default: 60s
Context: http, server, location

Timeout for establishing a Redis connection.

rate_limit_send_timeout

Syntax: rate_limit_send_timeout time;
Default: 60s
Context: http, server, location

Timeout for sending the request to Redis.

rate_limit_read_timeout

Syntax: rate_limit_read_timeout time;
Default: 60s
Context: http, server, location

Timeout for reading the Redis response.

rate_limit_buffer_size

Syntax: rate_limit_buffer_size size;
Default: 4k
Context: http, server, location

Buffer size for the Redis response. The default is sufficient for normal use.

Real-World Configuration Examples

API Gateway with Tiered Rate Limits

Use NGINX’s map directive to assign different NGINX Redis rate limit keys based on authentication:

upstream redis {
    server 127.0.0.1:6379;
    keepalive 1024;
}

upstream api_backend {
    server 127.0.0.1:8080;
}

map $http_x_api_key $api_rate_key {
    default       $remote_addr;
    "~^premium-"  "premium_$http_x_api_key";
    "~^basic-"    "basic_$http_x_api_key";
}

server {
    listen 80;
    server_name api.example.com;

    rate_limit_status 429;

    location /api/v1/public/ {
        rate_limit $remote_addr requests=30 period=1m burst=10;
        rate_limit_pass redis;
        rate_limit_prefix pub;
        rate_limit_headers on;
        proxy_pass http://api_backend;
    }

    location /api/v1/ {
        rate_limit $api_rate_key requests=60 period=1m burst=30;
        rate_limit_pass redis;
        rate_limit_prefix api;
        rate_limit_headers on;
        proxy_pass http://api_backend;
    }
}

Login Brute-Force Protection

Apply aggressive limits to authentication endpoints. This is similar to how NGINX TOTP authentication protects sensitive resources:

upstream redis {
    server 127.0.0.1:6379;
    keepalive 1024;
}

upstream app_backend {
    server 127.0.0.1:3000;
}

server {
    listen 80;
    server_name example.com;

    rate_limit_status 429;

    location /login {
        rate_limit $remote_addr requests=5 period=5m burst=2;
        rate_limit_pass redis;
        rate_limit_prefix login;
        rate_limit_headers on;
        rate_limit_log_level warn;
        proxy_pass http://app_backend;
    }

    location /api/auth/token {
        rate_limit $remote_addr requests=10 period=10m burst=3;
        rate_limit_pass redis;
        rate_limit_prefix auth;
        rate_limit_headers on;
        proxy_pass http://app_backend;
    }

    location / {
        proxy_pass http://app_backend;
    }
}

Each IP can make only 3 login attempts immediately (burst + 1), after which tokens refill at the sustained rate of 5 per 5 minutes, i.e. one per minute. The Retry-After header tells clients when to try again.

Whitelisting Internal Networks

Use NGINX’s geo and map directives to skip rate limiting for trusted networks:

upstream redis {
    server 127.0.0.1:6379;
    keepalive 1024;
}

upstream app_backend {
    server 127.0.0.1:8080;
}

geo $limit {
    default 1;
    10.0.0.0/8 0;
    172.16.0.0/12 0;
    192.168.0.0/16 0;
}

map $limit $limit_key {
    0 "";
    1 $remote_addr;
}

server {
    listen 80;
    server_name example.com;

    location / {
        rate_limit $limit_key requests=30 period=1m burst=15;
        rate_limit_pass redis;
        rate_limit_headers on;
        rate_limit_status 429;
        proxy_pass http://app_backend;
    }
}

When $limit_key is empty, the rate limit check is skipped. Internal services communicate freely while external traffic remains protected.

Production Timeout Tuning

In high-traffic environments, reduce Redis timeouts to fail fast:

upstream redis {
    server 127.0.0.1:6379;
    keepalive 1024;
}

upstream api_backend {
    server 127.0.0.1:8080;
}

server {
    listen 80;
    server_name example.com;

    rate_limit_connect_timeout 100ms;
    rate_limit_send_timeout 100ms;
    rate_limit_read_timeout 100ms;

    location /api/ {
        rate_limit $remote_addr requests=60 period=1m burst=30;
        rate_limit_pass redis;
        rate_limit_headers on;
        rate_limit_status 429;
        proxy_pass http://api_backend;
    }
}

Important Caveats

The return Directive Bypasses Rate Limiting

The NGINX return directive runs during the rewrite phase, before the preaccess phase where rate limiting executes. As a result, return short-circuits the rate limit check entirely:

# WARNING: rate limiting will NOT work here!
location /api {
    rate_limit $remote_addr requests=10 period=1m burst=5;
    rate_limit_pass redis;
    return 200 "OK";  # Bypasses rate limiting!
}

Instead, use proxy_pass or try_files:

# CORRECT: rate limiting works with proxy_pass
location /api {
    rate_limit $remote_addr requests=10 period=1m burst=5;
    rate_limit_pass redis;
    proxy_pass http://backend;
}

SELinux on RHEL-Based Systems

If NGINX cannot connect to Redis, you may see “Permission denied” errors in the NGINX error log. SELinux blocks the connection by default. Allow it with:

sudo setsebool -P httpd_can_network_connect 1

Redis High Availability

If Redis becomes unavailable, the module returns HTTP 503 (Service Temporarily Unavailable). Handle this gracefully with a fallback location:

error_page 502 503 = @fallback;

location @fallback {
    # Allow the request through without rate limiting
    proxy_pass http://backend;
}

Note that both 502 and 503 should be caught — the module may return either depending on the failure mode. For production, consider Redis Sentinel or Redis Cluster for automatic failover.

Testing Your Configuration

After configuring rate limiting, verify everything works. First, test syntax:

sudo nginx -t

Then reload and send test requests:

sudo systemctl reload nginx

Observe the NGINX Redis rate limit headers by sending a burst:

for i in $(seq 1 10); do
    echo "=== Request $i ==="
    curl -sI http://localhost/api/ | grep -E "HTTP/|X-RateLimit|Retry"
done

You should see X-RateLimit-Remaining decrease with each request. Eventually, HTTP 429 responses appear with Retry-After.

Inspect Redis keys directly (prefer SCAN over KEYS on busy production instances):

redis-cli KEYS "*"

Peek at a key’s state without consuming a token:

redis-cli RATER.LIMIT mykey 10 30 60 0

Performance Considerations

Each rate-limited request adds one Redis round-trip. To minimize latency:

  • Use keepalive connections: The keepalive 1024 directive avoids TCP handshake overhead
  • Run Redis locally: A local round-trip adds less than 0.1ms
  • Tune timeouts: Set connect/send/read timeouts to 100–500ms so Redis outages do not block requests
  • Rate limit selectively: Apply limits only to sensitive endpoints (APIs, login pages, forms)

Troubleshooting

“unknown directive rate_limit”

The module is not loaded. Add to nginx.conf:

load_module modules/ngx_http_rate_limit_module.so;

Confirm the file exists:

ls /usr/lib64/nginx/modules/ngx_http_rate_limit_module.so

502 or 503 Errors on Rate-Limited Locations

NGINX cannot reach Redis. Check these items:

  1. Redis is running: redis-cli ping should return PONG
  2. The redis-rate-limiter module is loaded: redis-cli MODULE LIST should list rater
  3. SELinux allows connections: setsebool -P httpd_can_network_connect 1
  4. The upstream address and port are correct

Rate Limiting Not Activating

Requests always pass through without limiting? Check:

  • You are not using return in the same location
  • The rate limit key is not empty (empty keys skip the check)
  • rate_limit_pass points to a valid upstream

Conclusion

The NGINX Redis rate limit module delivers distributed, centralized rate limiting across multiple NGINX instances. By leveraging the GCRA algorithm, it provides smoother throttling than simple counter approaches. The standard X-RateLimit-* headers help API clients self-throttle proactively.

For single-server setups, NGINX’s built-in limit_req may suffice. However, for multi-server architectures and API gateways, the NGINX Redis rate limit module is the right choice.

Source code: rate-limit-nginx-module on GitHub. Redis extension: redis-rate-limiter on GitHub.


Danila Vershinin

Founder & Lead Engineer

NGINX configuration and optimization • Linux system administration • Web performance engineering

10+ years NGINX experience • Maintainer of GetPageSpeed RPM repository • Contributor to open-source NGINX modules
