NGINX Combined Upstreams: Failover & Sticky Sessions



We have by far the largest RPM repository with NGINX module packages and VMODs for Varnish. If you want to install NGINX, Varnish, and lots of useful performance/security software with smooth yum upgrades for production use, this is the repository for you.
Active subscription is required.

When your backend infrastructure spans multiple clusters, data centers, or geographic regions, the standard NGINX upstream block quickly becomes insufficient. You cannot combine separate upstream groups, you cannot failover between entire clusters, and single-server upstreams silently ignore failure tracking. The NGINX Combined Upstreams module solves all of these problems with a set of powerful directives.

This module introduces a two-dimensional upstream architecture. Instead of a flat pool of servers, you can build hierarchical layers where individual upstreams retain their identity while participating in a larger failover chain. This approach is especially valuable for distributed backend systems, sticky HTTP sessions, and upstream broadcasting scenarios.

How the NGINX Combined Upstreams Module Works

The NGINX combined upstreams module operates at the configuration level, extending the upstream {} block with three new directives and introducing a new upstrand {} block.

At its core, the module provides three distinct capabilities:

  1. Upstream merging via add_upstream — populate one upstream with servers from other already-defined upstreams
  2. Singlet generation via combine_server_singlets — create individual upstreams from a multi-server pool, each with one active server and the rest as backup
  3. Super-layer failover via upstrand — chain multiple upstreams into an ordered failover sequence while preserving each upstream’s identity

Additionally, extend_single_peers fixes a subtle NGINX behavior where single-server upstreams never mark their peer as failed, and dynamic_upstrand allows runtime selection of upstrands based on variables.

Installing NGINX Combined Upstreams

RHEL, CentOS, AlmaLinux, Rocky Linux

Install the NGINX combined upstreams module from the GetPageSpeed repository:

sudo dnf install https://extras.getpagespeed.com/release-latest.rpm
sudo dnf install nginx-module-combined-upstreams

Then load the module by adding this line at the top of /etc/nginx/nginx.conf:

load_module modules/ngx_http_combined_upstreams_module.so;

Debian and Ubuntu

First, set up the GetPageSpeed APT repository, then install:

sudo apt-get update
sudo apt-get install nginx-module-combined-upstreams

On Debian/Ubuntu, the package handles module loading automatically. No load_module directive is needed.


Directive: add_upstream

The add_upstream directive populates the current upstream block with servers from another already-defined upstream. Server attributes such as weights, max_fails, and fail_timeout are preserved from the source upstream.

Syntax: add_upstream <upstream_name> [backup] [weight=N]
Context: upstream

Parameters

  • upstream_name — Name of a previously defined upstream whose servers will be copied into the current block (required)
  • backup — Mark all imported servers as backup servers
  • weight=N — Multiply each imported server’s weight by factor N

Example: Merging Multiple Upstreams

upstream backend_dc1 {
    server 10.0.1.1:8080;
    server 10.0.1.2:8080;
}

upstream backend_dc2 {
    server 10.0.2.1:8080 weight=2;
    server 10.0.2.2:8080;
}

upstream backend_all {
    add_upstream backend_dc1;
    add_upstream backend_dc2 weight=3;
    server 10.0.3.1:8080;
    add_upstream backend_dc1 backup;
}

In this configuration, backend_all receives all servers from backend_dc1 (at their original weights), all servers from backend_dc2 (with weights multiplied by 3), a directly defined server, and another copy of backend_dc1 servers marked as backups. This approach avoids manually duplicating server lists across multiple upstream blocks.

Important: The source upstream must be defined before the upstream that references it. Additionally, recursive references (an upstream adding itself) are detected and rejected.

Directive: combine_server_singlets

The combine_server_singlets directive creates multiple “singlet upstreams” from the servers defined so far in the current upstream. Each singlet upstream contains one active server, while all other servers are marked as backup (or optionally as down).

Syntax: combine_server_singlets [suffix] [field_width|byname] [nobackup]
Context: upstream

Parameters

  • suffix — A string appended to the host upstream name before the ordering number. Default: none
  • field_width — An integer for zero-padding the ordering number. For example, a width of 2 produces 01, 02, …, 10
  • byname — Use server names instead of ordering numbers for singlet names (NGINX 1.7.2+)
  • nobackup — Mark secondary servers as down instead of backup

Basic Example

upstream app {
    server 10.0.1.1:8080;
    server 10.0.1.2:8080;
    server 10.0.1.3:8080;
    combine_server_singlets _node_ 2;
}

This configuration generates three singlet upstreams automatically:

  • app_node_01 — server 10.0.1.1 active, others as backup
  • app_node_02 — server 10.0.1.2 active, others as backup
  • app_node_03 — server 10.0.1.3 active, others as backup

Using Server Names with byname

Since NGINX 1.7.2, you can use the byname keyword to name singlets after their active server instead of using numeric indexes:

upstream app {
    server web1.example.com:8080;
    server web2.example.com:8080;
    combine_server_singlets byname;
}

This generates appweb1.example.com_8080 and appweb2.example.com_8080 (colons in server names are replaced with underscores). You can also combine a prefix with byname:

combine_server_singlets _node_ byname;

This produces app_node_web1.example.com_8080, etc.

Singlet upstreams enable sticky HTTP sessions without NGINX Plus. Each backend server identifies itself via a cookie, and subsequent requests are routed to the same server through its singlet upstream. This is one of the most popular use cases for NGINX combined upstreams in production environments:

upstream app {
    server 10.0.1.1:8080;
    server 10.0.1.2:8080;
    combine_server_singlets;
}

server {
    listen 80;
    server_name example.com;

    location / {
        proxy_pass http://app$cookie_route;
    }
}

The proxy_pass target becomes app1 or app2 depending on the route cookie value. If a backend server sets Set-Cookie: route=1, all subsequent requests from that client go to the singlet upstream app1, where 10.0.1.1 is the active server. However, if that server fails, NGINX fails over to 10.0.1.2 (the backup in app1) and the backup server rewrites the cookie to redirect future traffic.
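The example above assumes the backend application sets the route cookie itself. If the backends cannot be modified, NGINX can assign the cookie based on which server answered. The following is a sketch, not confirmed module behavior: it uses a map from $upstream_addr to a route value, and the addresses and variable name are illustrative and must match your own pool:

```nginx
# Sketch: derive a route value from the address of the server
# that actually handled the request (addresses are illustrative).
map $upstream_addr $route_value {
    "10.0.1.1:8080"  1;
    "10.0.1.2:8080"  2;
    default          "";
}

server {
    listen 80;
    server_name example.com;

    location / {
        proxy_pass http://app$cookie_route;
        # Send the route cookie back so the client's next request
        # resolves to the matching singlet upstream (app1 or app2).
        add_header Set-Cookie "route=$route_value; Path=/";
    }
}
```

Note that add_header only applies to 2xx and 3xx responses by default, so failed requests will not pin a client to a dead server.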

For a comprehensive overview of session persistence strategies, see our NGINX Sticky Sessions guide.

Directive: extend_single_peers

The extend_single_peers directive addresses a subtle NGINX behavior: when an upstream has only one server in its primary or backup group, NGINX never marks that server as failed — even when max_fails and fail_timeout are configured.

Syntax: extend_single_peers
Context: upstream
Takes no arguments.

Why This Matters

In standard NGINX, the round-robin peer selection code checks whether an upstream has only one peer. If it does, NGINX resets the failure counter after every request and skips the max_fails / fail_timeout tracking entirely. The reasoning is that with nowhere else to route traffic, marking the sole server down serves no purpose.

However, this behavior becomes problematic in upstrand configurations, where marking a single-peer upstream as failed should trigger failover to the next upstream in the chain.

Example

upstream single_backend {
    server 10.0.1.1:8080;
    extend_single_peers;
}

The directive adds a fake peer marked as down to the primary or backup group when it contains only one server. This fake peer is never used for actual traffic — it simply tricks NGINX into treating the upstream as a multi-peer pool, enabling proper failure tracking.

If a group already has multiple peers (like the primary group in the example below), the directive has no effect on that group:

upstream mixed {
    server 10.0.1.1:8080;
    server 10.0.1.2:8080;
    server 10.0.1.3:8080 backup;
    extend_single_peers;
}

Here, only the backup group (which has a single server 10.0.1.3) is affected.

Block: upstrand

The upstrand block is the most powerful feature of the NGINX combined upstreams module. It creates a “super-layer” that chains multiple upstreams into an ordered failover sequence. Unlike a flat combined upstream, each upstream inside an upstrand retains its full identity — its own server pool, load balancing, and backup configuration.

Syntax: upstrand <name> { ... }
Context: http

Upstrand Sub-Directives

Inside the upstrand block, the following directives are available:

  • upstream <name or ~regex> [backup] [blacklist_interval=time] — Add an upstream (or matching upstreams by regex) to the upstrand
  • order [start_random] [per_request] — Control the iteration order
  • next_upstream_statuses <status_list> — Define which responses trigger failover to the next upstream
  • next_upstream_timeout <time> — Maximum time to spend cycling through upstreams
  • intercept_statuses <status_list> <uri> — Redirect to a failover URI on specific final statuses
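The intercept_statuses directive deserves a quick illustration, since it is not demonstrated elsewhere in this article. In this hedged sketch (upstream name and fallback URI are illustrative), a final 5xx response is replaced with a static maintenance page instead of reaching the client:

```nginx
upstrand guarded {
    upstream app_pool;
    next_upstream_statuses error timeout 5xx;
    # If the response is still 5xx after all upstreams were tried,
    # redirect internally to /maintenance instead.
    intercept_statuses 5xx /maintenance;
}

server {
    listen 80;

    location / {
        proxy_pass http://$upstrand_guarded;
    }

    location /maintenance {
        root /var/www/fallback;  # serves /var/www/fallback/maintenance
    }
}
```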

How Upstrand Failover Works

When you proxy through an upstrand, NGINX tries each upstream in sequence. If the response from an upstream matches any status in next_upstream_statuses, the upstrand moves to the next upstream. This continues until either:

  • An upstream returns a successful response (not matching next_upstream_statuses)
  • All upstreams (normal and backup) have been tried
  • The next_upstream_timeout expires

The upstrand’s response is whatever the last server of the current upstream returns, which is influenced by proxy_next_upstream. In other words, within each upstream, NGINX first tries all servers (and backups) per normal upstream behavior. Only when the entire upstream “fails” does the upstrand advance.
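To make the two layers explicit, the following sketch (names are illustrative) shows how proxy_next_upstream governs retries between servers inside each upstream, while next_upstream_statuses governs advancement between upstreams:

```nginx
upstream cluster_a {
    server 10.0.1.1:8080;
    server 10.0.1.2:8080;   # retried by standard NGINX logic first
}

upstream cluster_b {
    server 10.0.2.1:8080;
}

upstrand layered {
    upstream cluster_a;
    upstream cluster_b;      # tried only after cluster_a fails as a whole
    next_upstream_statuses error timeout 502 504;
}

server {
    listen 80;

    location / {
        # Inner layer: retry the next server within the same upstream.
        proxy_next_upstream error timeout http_502 http_504;
        # Outer layer: the upstrand advances between upstreams.
        proxy_pass http://$upstrand_layered;
    }
}
```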

Upstream Directive Details

Upstream names starting with ~ are treated as regular expressions. All upstreams whose names match the pattern are included:

upstrand us1 {
    upstream ~^cluster_;
    upstream fallback_pool backup;
    order start_random;
    next_upstream_statuses error timeout 502 503 504;
}

This matches all upstreams named cluster_* as primary upstreams and adds fallback_pool as a backup. Only upstreams defined before the upstrand block are considered as regex candidates.

The blacklist_interval parameter temporarily removes an upstream from rotation when it returns a failing status:

upstrand us1 {
    upstream cluster_east blacklist_interval=60s;
    upstream cluster_west blacklist_interval=60s;
    next_upstream_statuses error timeout 5xx;
}

If cluster_east fails, it is blacklisted for 60 seconds. Consequently, subsequent requests skip it and go directly to cluster_west. Blacklisting state is per-worker and is not shared between NGINX worker processes. When all upstreams become blacklisted, the module resets all blacklists and retries from the beginning.

Order Directive Details

  • (default) — Sequential order, round-robin across requests; the starting upstream advances globally per worker
  • start_random — The first upstream is chosen randomly at worker startup, then advances round-robin
  • per_request — Each request starts from the same position (no global round-robin advancement)
  • start_random per_request — Each request starts from a randomly chosen upstream
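To illustrate the difference, here are two hedged sketches (pool names are illustrative) that behave differently under load:

```nginx
# Round-robin start: consecutive requests begin at different upstreams,
# spreading the "first try" load across both pools.
upstrand spread_load {
    upstream api_a;
    upstream api_b;
    order start_random;
}

# Deterministic start: every request begins at api_a and only reaches
# api_b when api_a fails.
upstrand strict_order {
    upstream api_a;
    upstream api_b;
    order per_request;
}
```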

Status Notation

The next_upstream_statuses and intercept_statuses directives accept:

  • error — Connection errors with upstream peers
  • timeout — Timeouts connecting to upstream peers
  • non_idempotent — Allow retrying POST, LOCK, and PATCH requests (normally skipped)
  • 4xx — Any 4xx status code
  • 5xx — Any 5xx status code
  • NNN — A specific HTTP status code (e.g., 204, 502, 503)
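These values can be combined freely. For example, a hedged sketch (pool names are illustrative) that fails over on connection errors, any 5xx, and the specific 429 status, while explicitly opting in to retrying non-idempotent methods such as POST:

```nginx
upstrand retry_posts {
    upstream primary_pool;
    upstream standby_pool backup;
    # Advance on connect errors, timeouts, any 5xx, and 429,
    # and allow POST/LOCK/PATCH requests to be retried as well.
    next_upstream_statuses non_idempotent error timeout 5xx 429;
}
```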

Using an Upstrand with proxy_pass

To route traffic through an upstrand, use a special variable in proxy_pass:

location /api {
    proxy_pass http://$upstrand_us1;
}

The variable name follows the pattern $upstrand_<name>, where <name> is the upstrand name. Be cautious when accessing this variable from other directives — it triggers the subrequest machinery internally, which may not be desirable in all contexts.

Complete NGINX Combined Upstreams Example

The following configuration demonstrates a multi-cluster NGINX combined upstreams setup with geographic failover:

upstream cluster_east {
    server 10.0.1.1:8080;
    server 10.0.1.2:8080;
}

upstream cluster_west {
    server 10.0.2.1:8080;
    server 10.0.2.2:8080;
}

upstream cluster_backup {
    server 10.0.3.1:8080;
}

upstrand geo_failover {
    upstream ~^cluster_(east|west) blacklist_interval=30s;
    upstream cluster_backup backup;
    order start_random;
    next_upstream_statuses error timeout 5xx;
    next_upstream_timeout 30s;
}

server {
    listen 80;
    server_name example.com;

    location / {
        proxy_pass http://$upstrand_geo_failover;
    }
}

In this setup, requests are distributed across cluster_east and cluster_west with a random starting point. If one cluster fails (returning 5xx or connection errors), the upstrand advances to the other cluster. If both fail, the backup cluster_backup is tried. Failed clusters are blacklisted for 30 seconds, and the entire cycle must complete within 30 seconds.

Directive: dynamic_upstrand

The dynamic_upstrand directive allows choosing an upstrand at runtime based on variable values. This is useful when the target upstrand depends on request parameters, headers, or other dynamic data.

Syntax: dynamic_upstrand $variable $source_var [default_upstrand]
Context: server, location, if

Parameters

  • $variable — The variable name that will store the resolved upstrand reference
  • $source_var — A variable whose value should match an existing upstrand name
  • default_upstrand — Optional literal name of an upstrand used as a fallback when $source_var is empty or does not match any upstrand

Example: Region-Based Routing

upstrand us_east {
    upstream ~^east_;
    order start_random;
    next_upstream_statuses error timeout 5xx;
}

upstrand us_west {
    upstream ~^west_;
    order start_random;
    next_upstream_statuses error timeout 5xx;
}

server {
    listen 80;
    server_name example.com;

    dynamic_upstrand $region_upstrand $arg_region us_east;

    location / {
        proxy_pass http://$region_upstrand;
    }
}

When a request arrives with ?region=us_west, the $region_upstrand variable resolves to the us_west upstrand. If ?region is missing or does not match any defined upstrand, it falls back to us_east.

Without the default fallback, an unresolvable $source_var results in an empty variable and proxy_pass returns HTTP 500.

Upstrand Status Variables

The NGINX combined upstreams module exports eight variables for monitoring upstrand behavior. These are counterparts of the standard $upstream_* variables, but they accumulate values across all upstreams visited during the request lifecycle, including subrequests.

  • $upstrand_path — Path of all upstreams visited during the request
  • $upstrand_addr — Addresses of upstream servers contacted
  • $upstrand_status — HTTP status codes from each upstream
  • $upstrand_response_time — Response times for each upstream
  • $upstrand_connect_time — Connection times for each upstream
  • $upstrand_header_time — Time to receive response headers from each upstream
  • $upstrand_response_length — Response lengths from each upstream
  • $upstrand_cache_status — Cache status for each upstream response

Logging Upstrand Activity

These variables are especially useful in access logs for debugging failover behavior:

log_format upstrand_log '$remote_addr - [$time_local] '
                        '"$request" $status '
                        'path=$upstrand_path '
                        'addr=$upstrand_addr '
                        'status=$upstrand_status '
                        'time=$upstrand_response_time';

server {
    listen 80;
    access_log /var/log/nginx/upstrand.log upstrand_log;

    location / {
        proxy_pass http://$upstrand_geo_failover;
    }
}

Advanced Use Case: Ordered Failover Without Round-Robin

By combining combine_server_singlets with upstrand, you can build an upstream that always starts from the first server and fails over in a deterministic order — unlike the default round-robin behavior:

upstream backends {
    server 10.0.1.1:8080;
    server 10.0.1.2:8080;
    combine_server_singlets _s_ nobackup;
}

upstrand ordered {
    upstream ~^backends_s_;
    order per_request;
    next_upstream_statuses error timeout 5xx;
}

server {
    listen 80;

    location / {
        proxy_pass http://$upstrand_ordered;
    }
}

Every request first attempts 10.0.1.1, and only if it fails (with a 5xx error or connection failure) does the request proceed to 10.0.1.2. The nobackup parameter in combine_server_singlets marks secondary servers as down instead of backup, which prevents NGINX from automatically trying them within each singlet upstream. As a result, failover is handled entirely by the upstrand layer.

Use Case: Upstream Broadcasting

Upstrands also enable broadcasting messages to all backend clusters. If you need to notify every cluster, configure the upstrand to always advance through all upstreams regardless of individual responses:

upstream cluster_a {
    server 10.0.1.1:8080;
}

upstream cluster_b {
    server 10.0.2.1:8080;
}

upstrand broadcast {
    upstream ~^cluster_;
    order per_request;
    next_upstream_statuses 200 204;
}

server {
    listen 80;

    location /broadcast {
        proxy_pass http://$upstrand_broadcast;
    }
}

By listing successful status codes (200, 204) in next_upstream_statuses, the upstrand treats successful responses as triggers for advancement to the next upstream, effectively broadcasting the request to all clusters in sequence. Note that the client still receives only the response from the last upstream in the chain; earlier responses are discarded.
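One caveat: if only success codes are listed, a failing cluster stops the broadcast early, because its error response does not match next_upstream_statuses. A hedged variant (names are illustrative) that advances on both success and failure, so every reachable cluster is always attempted:

```nginx
upstrand broadcast_all {
    upstream ~^cluster_;
    order per_request;
    # Advance on success AND on failure, so one dead cluster
    # does not prevent the remaining clusters from being contacted.
    next_upstream_statuses error timeout 200 204 4xx 5xx;
}
```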

Performance Considerations

The upstrand mechanism uses NGINX subrequests internally to cycle through upstreams. Each subrequest incurs a small overhead, so the total latency of a request through an upstrand includes the response time of every upstream that was tried before a successful one was found.

For latency-sensitive applications, consider these optimization strategies:

  • Keep the next_upstream_timeout value tight (e.g., 5-10 seconds) to bound total request time
  • Use blacklist_interval to skip known-bad upstreams for a configurable period
  • Consider order start_random to distribute the “first try” load evenly across workers
  • The combine_server_singlets and add_upstream directives have zero runtime overhead — they only operate at configuration parsing time
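Putting those recommendations together, a latency-conscious upstrand might look like this hedged sketch (pool names and timings are illustrative):

```nginx
upstrand low_latency {
    upstream pool_a blacklist_interval=30s;  # skip known-bad pools
    upstream pool_b blacklist_interval=30s;
    order start_random;                      # spread the "first try" load
    next_upstream_statuses error timeout 5xx;
    next_upstream_timeout 5s;                # bound total failover time
}
```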

Security Best Practices

When exposing dynamic_upstrand with user-controlled variables (like query parameters), you should validate the input to prevent routing requests to unintended upstreams. Use NGINX map blocks to whitelist allowed values:

map $arg_region $safe_region {
    default   "";
    us_east   us_east;
    us_west   us_west;
}

server {
    listen 80;
    dynamic_upstrand $target $safe_region us_east;

    location / {
        proxy_pass http://$target;
    }
}

This prevents users from injecting arbitrary upstrand names via query parameters.

Troubleshooting NGINX Combined Upstreams

Module Not Loading

If nginx -t reports unknown directive "add_upstream", ensure the load_module line is present and correct:

load_module modules/ngx_http_combined_upstreams_module.so;

“upstream not found” Error

The add_upstream directive requires the source upstream to be defined before the target upstream in the configuration file. NGINX processes configuration top to bottom:

# This will FAIL:
upstream combined {
    add_upstream backend;  # ERROR: backend not defined yet
}
upstream backend {
    server 10.0.1.1:8080;
}

# This works:
upstream backend {
    server 10.0.1.1:8080;
}
upstream combined {
    add_upstream backend;  # OK: backend is already defined
}

Upstrand Returns 500

If `proxy_pass http://$upstrand_name` returns 500:

  1. Verify the upstrand name in the variable matches the upstrand block name exactly
  2. For dynamic_upstrand, ensure the source variable resolves to a valid upstrand name or that a default is provided
  3. Check the NGINX error log for details: tail -f /var/log/nginx/error.log

Blacklisting Not Working as Expected

Remember that blacklisting state is per-worker. If you have multiple worker processes, one worker may blacklist an upstream while another continues routing to it. This is by design for performance (avoiding shared memory locks). For consistent behavior during testing, set worker_processes 1.

Health Checks Complement Upstrands

For proactive failure detection rather than reactive failover, combine upstrands with NGINX Active Health Checks. Health checks can mark upstream servers as down before real traffic hits them, which reduces the number of failed requests during outages.

Conclusion

The NGINX Combined Upstreams module fills critical gaps in NGINX’s upstream management. It enables merging server pools across upstreams, implementing cookie-based sticky sessions, building multi-layer failover chains, and even broadcasting requests to all backend clusters. For distributed architectures where NGINX’s flat upstream model falls short, this module provides the hierarchical control you need.

For more about NGINX upstream management, see our guides on Load Balancing, Dynamic Upstream Management, and Upstream Keepalive.

The module source code is available on GitHub.


Danila Vershinin

Founder & Lead Engineer

NGINX configuration and optimization • Linux system administration • Web performance engineering

10+ years NGINX experience • Maintainer of GetPageSpeed RPM repository • Contributor to open-source NGINX modules
