

NGINX Lua Upstream: Dynamic Management Without Reloads

by Danila Vershinin


We have by far the largest RPM repository with NGINX module packages and VMODs for Varnish. If you want to install NGINX, Varnish, and lots of useful performance/security software with smooth yum upgrades for production use, this is the repository for you.
Active subscription is required.

Managing upstream servers in NGINX traditionally requires editing configuration files and reloading the service. In high-traffic production environments, every reload carries risk: brief connection drops, configuration errors that take down the entire service, and operational overhead that slows down incident response. The NGINX lua upstream module eliminates these problems by providing a Lua API for inspecting and modifying upstream server states at runtime.

This module is part of the OpenResty ecosystem and works with the NGINX Lua module. It exposes six Lua functions through the ngx.upstream package, allowing you to list upstreams, query server configurations, check peer health status, and toggle servers on or off — all without touching nginx.conf or running nginx -s reload.

Why Use the NGINX Lua Upstream Module?

System administrators managing production NGINX servers face several challenges with upstream management:

  • Zero-downtime maintenance: You need to remove a backend server for maintenance without dropping active connections or reloading NGINX.
  • Real-time health visibility: You want to see which upstream peers are healthy, how many connections each handles, and when they were last checked — from a simple HTTP endpoint.
  • Automated failover: Your monitoring system detects a degraded backend and needs to programmatically mark it as down before NGINX’s passive health checks catch up.
  • Operational dashboards: Your team needs a JSON API that reports upstream status for integration with Grafana, Prometheus, or custom tooling.

The NGINX lua upstream module addresses all of these use cases. Additionally, it serves as the foundation for the lua-resty-upstream-healthcheck library, which adds active health checking capabilities to NGINX’s open-source edition.

How the NGINX Lua Upstream Module Works

Unlike most NGINX modules that add configuration directives, this module introduces no new directives at all. Instead, it registers a Lua package called ngx.upstream that you can require from any content_by_lua_block, access_by_lua_block, or similar Lua execution context.

The module reads NGINX’s internal upstream data structures directly. When you call get_primary_peers(), for example, the module iterates over the round-robin peer list maintained by NGINX’s upstream subsystem and returns the current state of each peer, including its weight, fail count, active connections, and health status.

Importantly, changes made through set_peer_down() modify NGINX’s in-memory state for the current worker process only. Since NGINX runs multiple worker processes, you must synchronize state changes across workers using a shared dictionary (ngx.shared.DICT). This design is intentional — it avoids locking overhead and keeps the module simple and fast.

Installation

RHEL, CentOS, AlmaLinux, Rocky Linux

Install the module from the GetPageSpeed RPM repository:

sudo dnf install https://extras.getpagespeed.com/release-latest.rpm
sudo dnf install nginx-module-lua-upstream

The module depends on nginx-module-lua, which will be installed automatically along with its dependencies (LuaJIT and the NDK module).

After installation, load the modules in /etc/nginx/nginx.conf. The order matters — NDK must come first, then the Lua module, then the lua upstream module:

load_module modules/ndk_http_module.so;
load_module modules/ngx_http_lua_module.so;
load_module modules/ngx_http_lua_upstream_module.so;

Verify the configuration is valid:

sudo nginx -t

Debian and Ubuntu

First, set up the GetPageSpeed APT repository, then install:

sudo apt-get update
sudo apt-get install nginx-module-lua-upstream

On Debian/Ubuntu, the package handles module loading automatically. No load_module directive is needed.

API Reference

All functions are accessed through the ngx.upstream Lua module:

local upstream = require "ngx.upstream"

get_upstreams

local names = upstream.get_upstreams()

Returns a Lua table containing the names of all explicitly defined upstream {} blocks. Implicit upstreams created by `proxy_pass http://hostname` (without a matching upstream block) are not included.

Example:

local us = upstream.get_upstreams()
for _, name in ipairs(us) do
    ngx.say("upstream: ", name)
end

get_servers

local servers, err = upstream.get_servers(upstream_name)

Returns the server configurations for all servers in the specified upstream group. Each server entry is a Lua table with the following fields:

| Field        | Type            | Description                                                              |
|--------------|-----------------|--------------------------------------------------------------------------|
| addr         | string or table | Socket address(es) — a table if the server name resolves to multiple addresses |
| name         | string          | The server name as written in the configuration                          |
| weight       | number          | Server weight for load balancing                                         |
| max_fails    | number          | Maximum number of failed attempts                                        |
| fail_timeout | number          | Period (in seconds) during which max_fails must occur                    |
| backup       | boolean         | Present and true if this is a backup server                              |
| down         | boolean         | Present and true if the server is marked as permanently down             |

Example:

local servers, err = upstream.get_servers("my_backend")
if not servers then
    ngx.say("error: ", err)
    return
end
for _, srv in ipairs(servers) do
    ngx.say("server: ", srv.name or srv.addr, " weight=", srv.weight)
end

get_primary_peers

local peers, err = upstream.get_primary_peers(upstream_name)

Returns runtime state information for all primary (non-backup) peers. This function reflects the live state of each peer, including metrics that change during operation. Each peer entry contains:

| Field            | Type    | Description                                                        |
|------------------|---------|--------------------------------------------------------------------|
| id               | number  | Peer identifier (0-based), used with set_peer_down                 |
| name             | string  | Socket address of the peer                                         |
| weight           | number  | Configured weight                                                  |
| current_weight   | number  | Current weight in the round-robin algorithm                        |
| effective_weight | number  | Effective weight (decreases on failures, recovers over time)       |
| conns            | number  | Number of active connections (requires NGINX 1.9.0+)               |
| fails            | number  | Current failure count                                              |
| max_fails        | number  | Maximum allowed failures                                           |
| fail_timeout     | number  | Fail timeout period in seconds                                     |
| accessed         | number  | Timestamp of last access (Unix epoch), if any                      |
| checked          | number  | Timestamp of last check (Unix epoch), if any                       |
| down             | boolean | Present and true if the peer is marked down                        |
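As a sketch, the live peer list can be walked like this (the upstream name my_backend is assumed for illustration):

```lua
local upstream = require "ngx.upstream"

local peers, err = upstream.get_primary_peers("my_backend")
if not peers then
    ngx.say("error: ", err)
    return
end
for _, p in ipairs(peers) do
    -- conns is only present on NGINX 1.9.0+, so default it to 0
    ngx.say("peer #", p.id, " ", p.name,
            " conns=", p.conns or 0,
            " fails=", p.fails,
            p.down and " DOWN" or " up")
end
```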

get_backup_peers

local peers, err = upstream.get_backup_peers(upstream_name)

Returns the same structure as get_primary_peers, but for backup peers only. Returns an empty table if there are no backup peers defined.

set_peer_down

local ok, err = upstream.set_peer_down(upstream_name, is_backup, peer_id, down_value)

Toggles the down status of a specific peer. Parameters:

  • upstream_name (string): Name of the upstream group
  • is_backup (boolean): true for backup peers, false for primary peers
  • peer_id (number): The peer’s id field (0-based index)
  • down_value (boolean): true to mark the peer as down, false to bring it back up

Important: This function only affects the current NGINX worker process. See the Cross-Worker Synchronization section for how to propagate changes to all workers.

When a peer is brought back up (by passing false), its fail counter is automatically reset to zero.
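A minimal call sequence might look like this (the upstream name and peer id are illustrative):

```lua
local upstream = require "ngx.upstream"

-- Mark primary peer 0 of "my_backend" as down in this worker
local ok, err = upstream.set_peer_down("my_backend", false, 0, true)
if not ok then
    ngx.log(ngx.ERR, "set_peer_down failed: ", err)
end

-- Later: bring it back up (this also resets its fail counter)
upstream.set_peer_down("my_backend", false, 0, false)
```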

current_upstream_name

local name = upstream.current_upstream_name()

Returns the name of the upstream group being used for the current request. Returns nil if the request is not being proxied. Unlike get_upstreams(), this function does return implicit upstreams created by proxy_pass.

This function must be called during or after the content phase. Calling it earlier returns nil.
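For example, you can record which upstream served each request from a log_by_lua_block, which runs after the content phase (a sketch, not a required setup):

```nginx
log_by_lua_block {
    local upstream = require "ngx.upstream"
    local name = upstream.current_upstream_name()
    if name then
        -- Logged to the error log at INFO level; adjust to taste
        ngx.log(ngx.INFO, "request served by upstream: ", name)
    end
}
```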

Practical Examples

Upstream Status Dashboard

The most common use case is building an HTTP endpoint that reports the health and state of all upstream servers in JSON format. This endpoint can be consumed by monitoring systems like Grafana or Prometheus.

This example requires the lua5.1-cjson package for JSON encoding:

sudo dnf install lua5.1-cjson

upstream my_backend {
    server 10.0.0.1:8080 weight=5;
    server 10.0.0.2:8080 weight=3;
    server 10.0.0.3:8080 backup;
}

server {
    listen 8080;

    location = /upstream-status {
        default_type application/json;
        content_by_lua_block {
            local upstream = require "ngx.upstream"
            local cjson = require "cjson"
            local result = {}

            local us_list = upstream.get_upstreams()
            for _, us_name in ipairs(us_list) do
                local peers, err = upstream.get_primary_peers(us_name)
                if peers then
                    local backup_peers = upstream.get_backup_peers(us_name)
                    result[us_name] = {
                        primary = peers,
                        backup = backup_peers or {}
                    }
                end
            end

            ngx.say(cjson.encode(result))
        }
    }
}

Access the dashboard:

curl http://localhost:8080/upstream-status | python3 -m json.tool

The response includes live metrics for every peer: active connections, fail counts, weights, and timestamps of the last access and health check.
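The exact numbers depend on your configuration and traffic; an abridged, purely illustrative response might look like:

```json
{
    "my_backend": {
        "primary": [
            { "id": 0, "name": "10.0.0.1:8080", "weight": 5,
              "conns": 2, "fails": 0, "max_fails": 1, "fail_timeout": 10 },
            { "id": 1, "name": "10.0.0.2:8080", "weight": 3,
              "conns": 1, "fails": 0, "max_fails": 1, "fail_timeout": 10 }
        ],
        "backup": [
            { "id": 0, "name": "10.0.0.3:8080", "weight": 1,
              "conns": 0, "fails": 0, "max_fails": 1, "fail_timeout": 10 }
        ]
    }
}
```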

Graceful Server Removal

When you need to take a backend server offline for maintenance, you can mark it as down through an HTTP request instead of editing nginx.conf:

server {
    listen 8080;

    location = /upstream-down {
        allow 127.0.0.1;
        deny all;
        content_by_lua_block {
            local upstream = require "ngx.upstream"
            local us_name = ngx.var.arg_upstream
            local peer_id = tonumber(ngx.var.arg_peer)
            local down = ngx.var.arg_down == "1"

            if not us_name or not peer_id then
                ngx.status = 400
                ngx.say("usage: ?upstream=NAME&peer=ID&down=1|0")
                return
            end

            local ok, err = upstream.set_peer_down(us_name, false,
                                                    peer_id, down)
            if not ok then
                ngx.status = 500
                ngx.say("failed: ", err)
                return
            end

            ngx.say("peer ", peer_id, " in ", us_name,
                    down and " marked DOWN" or " marked UP")
        }
    }
}

Mark peer 0 of upstream my_backend as down:

curl "http://localhost:8080/upstream-down?upstream=my_backend&peer=0&down=1"

Bring it back up after maintenance:

curl "http://localhost:8080/upstream-down?upstream=my_backend&peer=0&down=0"

Listing All Upstream Servers

This example creates a plain-text endpoint that displays all upstream groups and their server configurations in a human-readable format:

location = /upstreams {
    default_type text/plain;
    content_by_lua_block {
        local upstream = require "ngx.upstream"
        local get_servers = upstream.get_servers
        local get_upstreams = upstream.get_upstreams

        local us = get_upstreams()
        for _, u in ipairs(us) do
            ngx.say("upstream ", u, ":")
            local srvs, err = get_servers(u)
            if not srvs then
                ngx.say("  failed to get servers: ", err)
            else
                for _, srv in ipairs(srvs) do
                    local parts = {}
                    parts[#parts + 1] = "  server " .. (srv.name or srv.addr)
                    parts[#parts + 1] = "weight=" .. srv.weight
                    parts[#parts + 1] = "max_fails=" .. srv.max_fails
                    parts[#parts + 1] = "fail_timeout=" .. srv.fail_timeout .. "s"
                    if srv.backup then
                        parts[#parts + 1] = "backup"
                    end
                    if srv.down then
                        parts[#parts + 1] = "DOWN"
                    end
                    ngx.say(table.concat(parts, " "))
                end
            end
        end
    }
}

Cross-Worker Synchronization

The set_peer_down() function modifies peer state in the current worker process only. NGINX typically runs multiple worker processes, so changes made in one worker are invisible to others. To propagate changes across all workers, use ngx.shared.DICT:

http {
    lua_shared_dict upstream_state 1m;

    upstream my_backend {
        server 10.0.0.1:8080;
        server 10.0.0.2:8080;
    }

    init_worker_by_lua_block {
        local upstream = require "ngx.upstream"
        local dict = ngx.shared.upstream_state

        -- Check for state changes every 2 seconds
        local function sync_upstream_state()
            local us_list = upstream.get_upstreams()
            for _, us_name in ipairs(us_list) do
                local peers = upstream.get_primary_peers(us_name)
                if peers then
                    for _, peer in ipairs(peers) do
                        local key = us_name .. ":" .. peer.id
                        local expected_down = dict:get(key)
                        if expected_down ~= nil then
                            local is_down = peer.down or false
                            if (expected_down == 1) ~= is_down then
                                upstream.set_peer_down(us_name, false,
                                                       peer.id,
                                                       expected_down == 1)
                            end
                        end
                    end
                end
            end
        end

        -- Run synchronization on a recurring timer
        local ok, err = ngx.timer.every(2, function(premature)
            if premature then return end
            sync_upstream_state()
        end)
        if not ok then
            ngx.log(ngx.ERR, "failed to create sync timer: ", err)
        end
    }

    server {
        listen 8080;

        location = /set-peer {
            allow 127.0.0.1;
            deny all;
            content_by_lua_block {
                local upstream = require "ngx.upstream"
                local dict = ngx.shared.upstream_state
                local us_name = ngx.var.arg_upstream
                local peer_id = tonumber(ngx.var.arg_peer)
                local down = ngx.var.arg_down == "1"

                -- Store in shared dict for cross-worker sync
                local key = us_name .. ":" .. peer_id
                dict:set(key, down and 1 or 0)

                -- Apply to current worker immediately
                upstream.set_peer_down(us_name, false, peer_id, down)

                ngx.say("peer state updated (syncs to all workers within 2s)")
            }
        }
    }
}

This pattern stores the desired peer state in a shared dictionary. Each worker checks the dictionary every 2 seconds and applies any pending changes. The shared dictionary is visible to all worker processes because it resides in shared memory.

Comparison with NGINX Plus Upstream API

NGINX Plus includes a commercial Upstream API that provides similar functionality through a REST interface. Here is how the lua upstream module compares:

| Feature               | Lua Upstream Module  | NGINX Plus API                                    |
|-----------------------|----------------------|---------------------------------------------------|
| List upstreams        | get_upstreams()      | GET /api/{ver}/http/upstreams                     |
| View peer status      | get_primary_peers()  | GET /api/{ver}/http/upstreams/{name}/servers      |
| Mark peer down        | set_peer_down()      | PATCH /api/{ver}/http/upstreams/{name}/servers/{id} |
| Add/remove servers    | Not supported        | Supported                                         |
| Shared across workers | Manual (shared dict) | Automatic (zone directive)                        |
| Requires reload       | No                   | No                                                |
| License               | BSD (free)           | Commercial                                        |

The NGINX lua upstream module covers the most critical operational needs — inspecting and toggling peers. For adding or removing upstream servers dynamically at runtime, consider the NGINX Dynamic Upstream module as an alternative.

Integration with Active Health Checks

The lua-resty-upstream-healthcheck library builds on this module to provide active health checking for NGINX upstreams. Instead of waiting for real traffic to trigger passive health checks, it sends periodic probe requests to each upstream server and automatically marks unhealthy peers as down.

Install the library:

sudo dnf install lua5.1-resty-upstream-healthcheck

Note: Use the lua5.1- prefixed package, not lua-resty-upstream-healthcheck. The lua5.1- variant installs to the LuaJIT-compatible path that NGINX’s Lua module uses. The unprefixed package installs to the system Lua path, which NGINX cannot find.
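If your library ends up in a non-default location, you can extend the search path with the Lua module's lua_package_path directive instead (the path below is illustrative — adjust it to wherever the .lua files actually live):

```nginx
http {
    # Illustrative path; the trailing ";;" appends the default search path
    lua_package_path "/usr/lib64/lua/5.1/?.lua;/usr/lib64/lua/5.1/?/init.lua;;";
}
```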

Configure active health checks:

http {
    lua_shared_dict healthcheck 1m;

    upstream my_backend {
        server 10.0.0.1:8080;
        server 10.0.0.2:8080;
    }

    init_worker_by_lua_block {
        local hc = require "resty.upstream.healthcheck"
        local ok, err = hc.spawn_checker{
            shm = "healthcheck",
            upstream = "my_backend",
            type = "http",
            http_req = "GET /health HTTP/1.0\r\nHost: localhost\r\n\r\n",
            interval = 2000,
            timeout = 1000,
            fall = 3,
            rise = 2,
            valid_statuses = {200, 302},
        }
        if not ok then
            ngx.log(ngx.ERR, "healthcheck error: ", err)
        end
    }

    server {
        listen 8080;

        location = /health-status {
            default_type text/plain;
            content_by_lua_block {
                local hc = require "resty.upstream.healthcheck"
                ngx.say("Upstream healthcheck status:")
                ngx.print(hc.status_page())
            }
        }

        location / {
            proxy_pass http://my_backend;
        }
    }
}

The health checker runs in every worker process and uses set_peer_down() from the lua upstream module internally. Because it operates within each worker, the health status is consistent across all workers without additional synchronization.

Performance Considerations

  • Read operations are fast: Functions like get_upstreams(), get_primary_peers(), and get_servers() read directly from NGINX’s in-memory data structures. They involve no I/O, no system calls, and no locks. They are safe to call on every request if needed.
  • Write operations are per-worker: set_peer_down() modifies a boolean flag in the worker’s memory. It is O(1) and essentially free in terms of performance.
  • Shared dict synchronization adds minimal overhead: The ngx.timer.every approach for cross-worker sync adds a few microseconds of overhead per timer invocation — negligible compared to request processing time.
  • No impact on proxying: The module does not intercept or modify the request processing pipeline. It only provides a Lua API. Unless you call the API functions, the module has zero performance impact.

Troubleshooting

“upstream not found” Error

This error occurs when the upstream name passed to a function does not match any upstream {} block in the configuration:

local peers, err = upstream.get_primary_peers("nonexistent")
-- peers is nil, err is "upstream not found"

Solution: Verify the upstream name matches exactly. Use get_upstreams() to list all available upstream names. Remember that implicit upstreams from proxy_pass are not included.

Changes Not Persisting After Reload

Changes made via set_peer_down() exist only in NGINX’s runtime memory. They are lost when NGINX is reloaded or restarted.

Solution: If you need persistent peer states, store them in an external data store (Redis, a file, or a database) and apply them on startup using init_worker_by_lua_block.
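As a sketch, a startup hook along these lines could re-apply saved states from a file. The file path and line format here are assumptions for illustration, not something the module provides:

```nginx
init_worker_by_lua_block {
    local upstream = require "ngx.upstream"

    -- Hypothetical state file: one "upstream_name peer_id" pair per line
    -- for every primary peer that should start in the down state
    local f = io.open("/etc/nginx/peer-state.txt", "r")
    if f then
        for line in f:lines() do
            local us_name, peer_id = line:match("^(%S+)%s+(%d+)$")
            if us_name then
                upstream.set_peer_down(us_name, false,
                                       tonumber(peer_id), true)
            end
        end
        f:close()
    end
}
```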

Changes Only Affect One Worker

The set_peer_down() function modifies state in the current worker process only.

Solution: Use the shared dictionary synchronization pattern described in the Cross-Worker Synchronization section above.

Module Not Found Error

If you see module 'ngx.upstream' not found, the lua upstream module is not loaded.

Solution: Ensure load_module modules/ngx_http_lua_upstream_module.so; appears in nginx.conf after the Lua module line. The load order must be: NDK, then Lua, then lua upstream.

Empty Peer List

get_primary_peers() returns nil with "no peer data" if the upstream block has no servers or uses a non-standard balancer.

Solution: Verify the upstream block contains at least one server directive and uses the default round-robin balancer.

Security Best Practices

  • Restrict access to management endpoints: Always use allow and deny directives to limit who can call set_peer_down(). Exposing this endpoint publicly would let attackers take your backends offline.
  • Use a separate management server block: Run management endpoints on a different port (for example, listen 127.0.0.1:9090) that is not exposed to the internet.
  • Log state changes: Add ngx.log(ngx.WARN, ...) calls when peers are toggled, so you have an audit trail in your error log.
  • Combine with authentication: For remote management, add HTTP Basic authentication or token-based auth before the Lua block.
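For instance, HTTP Basic authentication can be layered in front of a management location; the htpasswd path below is illustrative:

```nginx
location = /set-peer {
    auth_basic           "Upstream management";
    auth_basic_user_file /etc/nginx/.htpasswd;  # illustrative path

    content_by_lua_block {
        -- management logic as shown in the earlier examples
    }
}
```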

Conclusion

The NGINX lua upstream module brings runtime upstream visibility and control to NGINX’s open-source edition. By providing a simple six-function Lua API, it enables system administrators to build health dashboards, implement graceful server maintenance workflows, and integrate with external monitoring systems — all without reloading NGINX.

For installations that need active health checking, combine this module with the lua-resty-upstream-healthcheck library. For full dynamic server addition and removal, see the NGINX Dynamic Upstream module.

The module source code is available on GitHub, and pre-built packages are available from the GetPageSpeed repository.


Danila Vershinin

Founder & Lead Engineer

NGINX configuration and optimization • Linux system administration • Web performance engineering

10+ years NGINX experience • Maintainer of GetPageSpeed RPM repository • Contributor to open-source NGINX modules
