When NGINX proxies requests to upstream servers, it typically opens a new TCP connection for each request, serves the request, and closes the connection. This behavior creates significant overhead, especially in high-traffic environments with microservices or API gateways. NGINX upstream keepalive solves this by maintaining a pool of persistent connections to your backend servers, dramatically reducing latency and preventing socket exhaustion.
If you’ve ever seen thousands of TIME_WAIT sockets piling up between NGINX and your application servers, or noticed that your API response times are slower than expected, upstream keepalive is likely the solution you need.
What is NGINX Upstream Keepalive?
NGINX upstream keepalive is a connection pooling mechanism that maintains persistent HTTP connections between NGINX and upstream backend servers. Instead of opening a new TCP connection for every proxied request, NGINX reuses existing connections from a pool, eliminating the overhead of TCP handshakes and connection setup.
This is fundamentally different from client-side keepalive (controlled by keepalive_timeout in the server context), which manages connections between web browsers and NGINX. Upstream keepalive operates on the backend side, managing connections from NGINX to your application servers, API endpoints, or microservices.
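To make the distinction concrete, here is a minimal sketch (server names and values are illustrative) showing where each directive lives: keepalive_timeout in the server context governs browser-to-NGINX connections, while keepalive in the upstream block governs NGINX-to-backend connections.

```nginx
http {
    upstream app_servers {
        server 10.0.0.1:8080;
        keepalive 16;             # NGINX <-> backend: idle connection pool per worker
    }

    server {
        listen 80;
        keepalive_timeout 65s;    # browser <-> NGINX: client-side keepalive

        location / {
            proxy_pass http://app_servers;
            proxy_http_version 1.1;
            proxy_set_header Connection "";
        }
    }
}
```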
For PHP applications using FastCGI, see our dedicated guide on NGINX FastCGI Keepalive which covers the fastcgi_keep_conn directive.
The Cost of Connection Churn
Without upstream keepalive, every proxied request incurs:
- TCP three-way handshake: one extra round trip before the request can be sent (sub-millisecond on a local network, tens of milliseconds between regions)
- TLS handshake (if using HTTPS): Additional 1-2 round trips
- Connection teardown: 4-way FIN/ACK sequence
- TIME_WAIT accumulation: Closed connections linger for 60 seconds on Linux (a duration hard-coded in the kernel)
For a service handling 10,000 requests per second to a backend, this means 10,000 new TCP connections per second, leading to massive socket churn and potential port exhaustion.
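As a rough back-of-the-envelope check (assuming default Linux settings and a single NGINX host talking to a single backend address), compare your connection rate against the ephemeral port range:

```bash
# Ephemeral ports available for outgoing connections (typically ~32768-60999, about 28,000 ports)
sysctl net.ipv4.ip_local_port_range

# With each closed connection holding its source port in TIME_WAIT for 60 seconds,
# ~28,000 ports / 60s ~= 460 new connections per second is the ceiling to a single
# backend IP:port -- far below 10,000 req/s without connection reuse.
```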
Enabling Upstream Keepalive in NGINX
The upstream keepalive feature is built into NGINX core and documented in the official NGINX upstream module documentation. You enable it by adding the keepalive directive inside an upstream block:
```nginx
upstream backend {
    server 10.0.0.1:8080;
    server 10.0.0.2:8080;
    keepalive 32;
}
```
However, this alone is not enough. NGINX defaults to HTTP/1.0 for upstream connections and does not keep those connections alive. You must explicitly configure HTTP/1.1 and clear the Connection header:
```nginx
server {
    listen 80;

    location / {
        proxy_pass http://backend;
        proxy_http_version 1.1;
        proxy_set_header Connection "";
    }
}
```
The proxy_set_header Connection "" directive is critical. Without it, NGINX sends its default Connection: close header to the upstream, instructing the backend to close the connection after every response and defeating connection reuse.
For a complete introduction to NGINX proxying, see our NGINX Reverse Proxy guide.
Understanding the Keepalive Pool
The keepalive directive specifies the maximum number of idle connections to keep in the pool per worker process. Understanding this architecture is essential for proper tuning.
Per-Worker Connection Pools
NGINX runs multiple worker processes (typically one per CPU core). Each worker maintains its own independent keepalive connection pool. If you configure:
```nginx
keepalive 32;
```
And NGINX has 8 worker processes, the total number of keepalive connections across all workers could be up to 256 (32 x 8 workers).
The pool operates as an LRU (Least Recently Used) cache:
- When a connection is released after a request completes, it’s added to the pool
- When a new request needs a backend connection, NGINX first checks the pool for an existing connection to that specific server
- If the pool is full, the oldest idle connection is closed to make room for the new one
Connection Matching
Connections in the pool are matched by socket address (IP and port). If your upstream has multiple servers:
```nginx
upstream api_cluster {
    server 10.0.0.1:8080;
    server 10.0.0.2:8080;
    server 10.0.0.3:8080;
    keepalive 32;
}
```
NGINX will only reuse a connection if the load balancer selects the same backend server that the cached connection was established with.
Tuning Keepalive Parameters
NGINX provides four directives for fine-tuning upstream keepalive behavior. These must be placed inside the upstream block.
keepalive
```nginx
keepalive 64;
```
Sets the maximum number of idle keepalive connections to upstream servers preserved in the cache of each worker process. The value should be set high enough to handle your request rate without constant connection churn, but not so high that you exhaust backend server connection limits.
Calculation guideline: Consider your peak requests per second per worker, divided by requests per connection, multiplied by a safety factor:
```
pool_size = (requests_per_second / workers) / avg_requests_per_connection * 2
```
For a system handling 10,000 req/s across 8 workers with an average of 100 requests per connection:
```
pool_size = (10000 / 8) / 100 * 2 = 25
```
A value of 32 would provide adequate headroom.
keepalive_timeout
```nginx
keepalive_timeout 60s;
```
Sets the timeout during which an idle keepalive connection to an upstream server will stay open. Connections idle longer than this value are closed. The default is 60 seconds.
Increase this value if your traffic is bursty with quiet periods:
```nginx
keepalive_timeout 120s;
```
Decrease it if your backends have strict connection limits and you need faster connection recycling:
```nginx
keepalive_timeout 30s;
```
keepalive_requests
```nginx
keepalive_requests 1000;
```
Sets the maximum number of requests that can be served through one keepalive connection. After this number is reached, the connection is closed. The default value changed from 100 to 1000 in NGINX 1.19.10.
This limit exists because some backends may leak memory or resources over many requests. For well-behaved backends, you can increase this significantly:
```nginx
keepalive_requests 10000;
```
keepalive_time
```nginx
keepalive_time 1h;
```
Limits the maximum time during which requests can be processed through one keepalive connection. After this time, the connection is closed following the next request processing. The default is 1 hour.
This directive was introduced in NGINX 1.19.10 to handle scenarios where backends need periodic connection refreshing (for example, to pick up DNS changes or redistribute load after backend restarts).
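If your backends are replaced or re-resolved frequently (for example, autoscaled instances behind a DNS name), a shorter value forces connections to be recycled sooner so new requests land on fresh backends. The figure below is illustrative rather than a recommendation:

```nginx
keepalive_time 15m;
```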
Complete Configuration Example
Here’s a production-ready configuration for a high-performance API gateway:
```nginx
upstream api_backend {
    server 10.0.0.1:8080 weight=5;
    server 10.0.0.2:8080 weight=3;
    server 10.0.0.3:8080 backup;

    keepalive 64;
    keepalive_timeout 60s;
    keepalive_requests 10000;
    keepalive_time 1h;
}

server {
    listen 80;
    server_name api.example.com;

    location / {
        proxy_pass http://api_backend;
        proxy_http_version 1.1;
        proxy_set_header Connection "";
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;

        # Timeouts
        proxy_connect_timeout 5s;
        proxy_send_timeout 60s;
        proxy_read_timeout 60s;
    }
}
```
Preventing TIME_WAIT Socket Exhaustion
One of the most compelling reasons to enable upstream keepalive is preventing TIME_WAIT socket accumulation. When NGINX closes a connection to an upstream server, that socket enters the TIME_WAIT state for 60 seconds (a duration hard-coded in the Linux kernel; the tcp_fin_timeout sysctl affects FIN_WAIT_2, not TIME_WAIT).
Monitoring TIME_WAIT Sockets
Check current TIME_WAIT connections using ss:
```bash
ss -tan state time-wait | wc -l
```
Or view a summary of all TCP socket states:
```bash
ss -s
```
Example output:
```
Total: 1452
TCP:   1203 (estab 89, closed 0, orphaned 0, timewait 1100)

Transport Total     IP        IPv6
TCP       1203      1105      98
```
High TIME_WAIT counts (thousands) between NGINX and backends indicate connection churn that keepalive can eliminate.
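To confirm the churn is on the backend side rather than the client side, filter the count by destination (the address below is a placeholder for your own backend):

```bash
ss -tan state time-wait dst 10.0.0.1:8080 | wc -l
```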
Verifying Keepalive is Working
To confirm connections are being reused, check ESTABLISHED connections to your backend port:
```bash
ss -tan state established dst 10.0.0.1:8080 | wc -l
```
With keepalive properly configured, you should see a stable, small number of established connections even under high load, rather than connections constantly being created and destroyed.
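A simple way to watch this over time while generating load is to sample the count every second (again, the backend address is a placeholder):

```bash
# The count should stay small and flat rather than climbing or churning
watch -n 1 'ss -tan state established dst 10.0.0.1:8080 | wc -l'
```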
Load Balancing Considerations
Upstream keepalive works with all NGINX load balancing methods, but the interaction deserves consideration. For a detailed explanation of load balancing algorithms, see our NGINX Load Balancing guide.
Round Robin (Default)
```nginx
upstream backend {
    server 10.0.0.1:8080;
    server 10.0.0.2:8080;
    keepalive 32;
}
```
With round robin, requests are distributed sequentially across backends. Keepalive connections will be distributed roughly equally if traffic is steady.
Least Connections
```nginx
upstream backend {
    least_conn;
    server 10.0.0.1:8080;
    server 10.0.0.2:8080;
    keepalive 32;
}
```
The least_conn method works well with keepalive. It directs requests to the server with the fewest active connections, promoting even distribution.
IP Hash
```nginx
upstream backend {
    ip_hash;
    server 10.0.0.1:8080;
    server 10.0.0.2:8080;
    keepalive 32;
}
```
With ip_hash, clients consistently connect to the same backend. Keepalive connections will match this affinity, potentially creating an uneven distribution of cached connections across backends.
HTTPS Upstream Connections
When proxying to HTTPS backends, keepalive provides even greater benefits by avoiding repeated TLS handshakes:
```nginx
upstream secure_backend {
    server 10.0.0.1:443;
    server 10.0.0.2:443;
    keepalive 32;
}

server {
    listen 80;

    location / {
        proxy_pass https://secure_backend;
        proxy_http_version 1.1;
        proxy_set_header Connection "";

        # SSL settings for upstream
        proxy_ssl_verify on;
        proxy_ssl_trusted_certificate /etc/pki/tls/certs/ca-bundle.crt;
        proxy_ssl_session_reuse on;
    }
}
```
The proxy_ssl_session_reuse on directive (enabled by default) allows NGINX to reuse SSL sessions, further reducing TLS overhead on kept-alive connections.
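One caveat with the example above: when proxy_pass points at an upstream group and proxy_ssl_verify is on, NGINX verifies the backend certificate against the upstream name (here, secure_backend) and does not send SNI by default. You will usually also want something like the following, where the hostname is an assumption about what your backend certificate actually contains:

```nginx
proxy_ssl_server_name on;                      # send SNI to the backend
proxy_ssl_name backend.internal.example.com;   # placeholder: the name on the backend certificate
```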
gRPC and HTTP/2 Backends
For gRPC or HTTP/2 backends, NGINX 1.17.5+ supports keepalive with grpc_pass:
```nginx
upstream grpc_backend {
    server 10.0.0.1:50051;
    keepalive 32;
}

server {
    listen 80 http2;

    location / {
        grpc_pass grpc://grpc_backend;
        grpc_set_header Connection "";
    }
}
```
Note that HTTP/2 multiplexes multiple requests over a single connection, so the keepalive pool may contain fewer active connections than with HTTP/1.1.
Troubleshooting Upstream Keepalive
Connections Not Being Reused
If connections aren’t being kept alive, verify:
- HTTP version: Ensure proxy_http_version 1.1; is set
- Connection header: Ensure proxy_set_header Connection ""; is set
- Backend support: Confirm your backend supports HTTP/1.1 keepalive and isn't sending Connection: close (see the quick check after this list)
- Request body: Connections with unread request bodies cannot be cached
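A quick way to test the backend directly, bypassing NGINX, is to ask curl for two requests over one connection; if the backend supports keepalive, the verbose output reports that the connection is being re-used, and you can also spot an unwanted Connection: close response header. The address is a placeholder:

```bash
curl -sv -o /dev/null -o /dev/null http://10.0.0.1:8080/ http://10.0.0.1:8080/ 2>&1 \
  | grep -i connection
```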
Pool Exhaustion
If you see increased latency under load, the keepalive pool may be too small. Monitor with NGINX Plus status or by observing connection creation rate:
```bash
# Watch new connections to backend
ss -tn state syn-sent dst 10.0.0.1:8080
```
Frequent SYN_SENT sockets indicate new connections being established, suggesting pool exhaustion.
Backend Connection Limits
If backends reject connections, you may have configured too large a keepalive pool:
```nginx
# Too aggressive for a backend with 100 connection limit
upstream backend {
    server 10.0.0.1:8080;
    keepalive 128;  # 128 x 8 workers = 1024 potential connections
}
```
Calculate your idle connection ceiling as keepalive x worker_processes and make sure that figure, plus the active connections you expect under load, stays below the backend's connection limit.
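One way to audit this on a running instance is to pull both values out of the effective configuration dump (nginx -T prints the fully merged configuration):

```bash
nginx -T 2>/dev/null | grep -E 'worker_processes|keepalive '
```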
Related Timeout Errors
If you’re experiencing 504 Gateway Timeout errors, upstream keepalive can help by reducing connection establishment time, but you may also need to adjust proxy_read_timeout for slow backends.
Source Code Insights
Examining the NGINX source code reveals implementation details that inform optimal configuration. The keepalive module (ngx_http_upstream_keepalive_module.c) uses a doubly-linked queue for O(1) connection caching operations.
Key implementation behaviors:
- LRU eviction: When the pool is full, the oldest connection (queue tail) is closed
- Address matching: Connections match by socket address only, not session state
- Validation checks: Connections are only cached if no errors occurred, the keepalive_requests limit was not reached, keepalive_time was not exceeded, and the upstream indicated keepalive support
- Idle detection: Uses MSG_PEEK to detect if backends unexpectedly sent data or closed connections
The default values in the source code are:
- keepalive_timeout: 60000ms (60 seconds)
- keepalive_requests: 1000
- keepalive_time: 3600000ms (1 hour)
Combining with Proxy Caching
For even better performance, combine upstream keepalive with NGINX proxy caching. Keepalive reduces connection overhead while proxy caching eliminates redundant backend requests entirely.
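A minimal sketch of the combination might look like this; the cache path, zone name, and validity times are illustrative, not recommendations:

```nginx
proxy_cache_path /var/cache/nginx/api keys_zone=api_cache:10m max_size=1g inactive=10m;

upstream api_backend {
    server 10.0.0.1:8080;
    keepalive 32;
}

server {
    listen 80;

    location / {
        proxy_cache       api_cache;           # serve repeat responses from cache
        proxy_cache_valid 200 301 1m;          # illustrative validity window
        proxy_pass        http://api_backend;  # cache misses reuse pooled connections
        proxy_http_version 1.1;
        proxy_set_header  Connection "";
    }
}
```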
Performance Impact
Enabling upstream keepalive typically provides:
- Latency reduction: one TCP handshake round trip saved per request (sub-millisecond on a local network, tens of milliseconds between regions)
- TLS savings: one to two additional round trips avoided for HTTPS backends
- Socket reduction: Eliminates TIME_WAIT accumulation
- Backend efficiency: Fewer connection setup/teardown operations on application servers
- Memory savings: Fewer connection objects on both NGINX and backends
For microservices architectures where NGINX serves as an API gateway handling thousands of requests per second to multiple backend services, the cumulative impact is substantial.
Summary
NGINX upstream keepalive is essential for production deployments proxying to backend servers. Proper configuration requires:
- Enable keepalive in the upstream block with an appropriate pool size
- Set proxy_http_version 1.1; for HTTP/1.1 connections
- Clear the Connection header with proxy_set_header Connection "";
- Tune keepalive_timeout, keepalive_requests, and keepalive_time for your workload
- Monitor TIME_WAIT sockets and connection reuse to verify effectiveness
With these settings in place, you’ll see reduced latency, eliminated socket exhaustion, and improved overall system stability for your NGINX-backed services.
