You reload NGINX. HTTP/2 traffic keeps flowing. But HTTP/3? Roughly half your QUIC connections silently die. No error logs. No warnings. Just timeouts that look like network issues.
This NGINX HTTP/3 reload bug has existed since QUIC support was added to NGINX. F5 knows about it — there’s an open issue and multiple community PRs. They just won’t merge a fix.
We did. nginx-mod release 37 ships the patch.
The Symptoms Nobody Warned You About
Here’s what happens when you run nginx -s reload on a server with HTTP/3 enabled and quic_bpf on:
- QUIC handshakes silently fail. The client sends an Initial packet, the server never responds. Zero bytes received on the client side.
- `curl --http3-only` times out approximately 50% of the time after reload. It works perfectly after a full restart.
- No error in NGINX logs. Nothing in `error.log`. Nothing at any log level. The packets simply vanish.
- The failure rate scales with `worker_processes`. With 2 workers, ~50% of new connections fail. With 4 workers, ~75%. With 8, ~87.5%. The math is simple: only 1 out of N workers is the new one that can actually handle new QUIC connections.
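A quick sanity check of that arithmetic — this just restates the article's 1-in-N model in Python, without touching NGINX:

```python
# The article's model: after a reload with N workers, only 1 of the N
# sockets in the reuseport group belongs to a worker that can complete
# a new QUIC handshake, so (N - 1) / N of new connections fail.
for n in (2, 4, 8):
    print(f"{n} workers: ~{(n - 1) / n:.1%} of new connections fail")
```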
This is catastrophic for any production deployment that uses nginx -s reload for configuration changes — which is every production deployment. You push a config update, and suddenly a large fraction of your HTTP/3 users experience timeouts.
How QUIC Reuseport Works in NGINX
To understand the bug, you need to understand how NGINX handles QUIC sockets with SO_REUSEPORT.
When reuseport is specified in the listen directive, each worker process gets its own socket bound to the same address:port. The kernel distributes incoming packets across these sockets. For TCP, this is straightforward — the kernel hashes by source IP/port.
QUIC is different. A single UDP socket handles multiple logical connections, identified by Connection IDs (CIDs). NGINX uses an eBPF program (enabled by quic_bpf on) to route packets to the correct worker based on the CID embedded in the QUIC packet.
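The kernel side of this is easy to observe without NGINX at all. The sketch below (an illustration only, assuming Linux SO_REUSEPORT semantics and Python 3; the "worker" names are hypothetical) binds two UDP sockets to one port and shows incoming flows being hash-distributed between them:

```python
import socket

def group_member(port):
    """Join a SO_REUSEPORT group on 127.0.0.1:port (Linux-only)."""
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
    s.bind(("127.0.0.1", port))
    s.settimeout(0.2)
    return s

a = group_member(0)              # first "worker"; kernel picks a free port
port = a.getsockname()[1]
b = group_member(port)           # second "worker" joins the same group

# 64 distinct client sockets = 64 distinct source ports = 64 hash inputs.
for _ in range(64):
    c = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    c.sendto(b"initial", ("127.0.0.1", port))
    c.close()

def drain(s):
    n = 0
    try:
        while True:
            s.recvfrom(64)
            n += 1
    except socket.timeout:
        return n

ca, cb = drain(a), drain(b)
print(f"worker a: {ca} flows, worker b: {cb} flows")  # split is hash-dependent
```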
The critical piece is in src/event/quic/ngx_event_quic_bpf.c. When the BPF program is attached to a reuseport group, NGINX sets a flag on the listening socket:
```c
/* do not inherit this socket */
ls->ignore = 1;
```
This `ignore = 1` flag tells the NGINX reload mechanism: “Don’t pass this socket to the new worker processes. Create fresh sockets instead.” The idea is sound — new workers need new sockets with a fresh BPF map that routes CIDs to the correct new worker PIDs.
The problem is what happens to the old sockets.
The Root Cause: An 8-Line Oversight
During graceful shutdown, NGINX calls ngx_close_listening_sockets() in src/core/ngx_connection.c. This function closes listening sockets so the old worker stops accepting new connections while it finishes processing existing ones.
Here’s what the upstream code looks like:
```c
void
ngx_close_listening_sockets(ngx_cycle_t *cycle)
{
    ngx_uint_t         i;
    ngx_listening_t   *ls;
    ngx_connection_t  *c;

    /* ... */

    ls = cycle->listening.elts;
    for (i = 0; i < cycle->listening.nelts; i++) {

#if (NGX_QUIC)
        if (ls[i].quic) {
            continue;  /* <-- THE BUG */
        }
#endif

        c = ls[i].connection;

        if (c) {
            if (c->read->active) {
                /* ... delete event ... */
            }

            ngx_free_connection(c);

            c->fd = (ngx_socket_t) -1;
        }

        if (ngx_close_socket(ls[i].fd) == -1) {
            /* ... error ... */
        }
    }
}
```
See it? `if (ls[i].quic) { continue; }` — all QUIC listening sockets are unconditionally skipped. They are never closed during graceful shutdown.
For non-reuseport QUIC sockets, this makes sense. The old worker needs to keep its QUIC socket open to finish servicing existing QUIC connections (sending retransmissions, processing ACKs, completing graceful connection shutdown).
But for reuseport QUIC sockets with BPF, it’s a disaster. Here’s why:
- Reload happens. New worker processes start with fresh sockets and a fresh BPF map.
- Old workers enter graceful shutdown, but their reuseport sockets stay open.
- The kernel’s reuseport group now contains both old and new sockets. For new QUIC Initial packets (which have no CID in the BPF map yet), the kernel falls back to hash-based distribution across all sockets in the reuseport group.
- New Initial packets land on old worker sockets ~(N-1)/N of the time (where N is the total number of workers, old + new). The old workers are in the `ngx_exiting` state and silently drop these packets.
- The client sees a timeout. No QUIC handshake completes. No error. Just silence.
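The sequence above can be reproduced in miniature without NGINX or BPF: keep a stale reuseport member open but never read from it, and part of the new traffic vanishes into its buffer. A local sketch assuming Linux SO_REUSEPORT semantics — the `stale`/`live` names are illustrative, not NGINX internals:

```python
import socket

def group_member(port):
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
    s.bind(("127.0.0.1", port))
    s.settimeout(0.2)
    return s

stale = group_member(0)          # exiting worker: socket open, never read
port = stale.getsockname()[1]
live = group_member(port)        # new worker

# 32 fresh "Initial packets", each from a distinct source port.
for _ in range(32):
    c = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    c.sendto(b"initial", ("127.0.0.1", port))
    c.close()

answered = 0
try:
    while True:
        live.recvfrom(64)
        answered += 1
except socket.timeout:
    pass

# The remainder sits unread in the stale socket's buffer; those clients time out.
print(f"{answered}/32 handshakes reached the live worker")
```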
This is not a complex race condition. It is a simple oversight — QUIC sockets are skipped without checking whether they use reuseport. Any code review should have caught this.
Reproducing the Bug
You can reproduce this in minutes on any Linux system with NGINX built with QUIC support.
Configuration:
```nginx
# quic_bpf is a main-context directive
quic_bpf on;

worker_processes 2;

events {
    worker_connections 1024;
}

http {
    server {
        listen 443 quic reuseport;
        listen 443 ssl;

        ssl_certificate     /etc/ssl/certs/example.crt;
        ssl_certificate_key /etc/ssl/private/example.key;

        location / {
            return 200 "OK\n";
        }
    }
}
```
Test script:
```shell
#!/bin/bash

# Fresh restart — baseline
sudo nginx -s stop && sudo nginx
sleep 1
echo "=== After restart ==="
for i in $(seq 1 20); do
    # -k: the example config uses a self-signed certificate
    curl -sk --http3-only -m 2 https://localhost/ > /dev/null 2>&1 && echo "OK" || echo "FAIL"
done | sort | uniq -c

# Reload — triggers the bug
sudo nginx -s reload
sleep 1
echo "=== After reload ==="
for i in $(seq 1 20); do
    curl -sk --http3-only -m 2 https://localhost/ > /dev/null 2>&1 && echo "OK" || echo "FAIL"
done | sort | uniq -c
```
Typical output with unpatched NGINX:
```
=== After restart ===
     20 OK
=== After reload ===
     10 FAIL
     10 OK
```
20/20 after restart. ~10/20 after reload. The failure rate is consistent and predictable.
You can verify the stale sockets using system tools:
```shell
# Show reuseport sockets — old worker PIDs still present
ss -ulnp 'sport = :443'

# Show BPF map entries — stale entries pointing to old worker sockets
bpftool map dump name ngx_quic_sockmap
```
After reload, you’ll see UDP sockets owned by both old (exiting) and new worker PIDs in the reuseport group. The BPF map only knows about new workers, so Initial packets (with no CID mapping) hash-distribute across all sockets — including the dead ones.
F5’s Response — Or Lack Thereof
This bug is tracked in the nginx issue tracker. It’s been open since 2025. The community hasn’t just reported it — they’ve submitted actual fixes:
- PR #503 — A comprehensive fix implementing Magic CID + Retry mechanism. Unreviewed.
- PR #298 — An earlier approach. Closed without merge.
The NGINX team’s response? “We are planning to finish and commit our fix one day.” No timeline. No priority. No urgency for a bug that silently drops production traffic.
Meanwhile, Angie — the Russian fork of NGINX — fixed this in version 1.11.0 (December 2025) with a complete BPF redesign. A community fork, with fewer resources than F5, shipped a production fix months ago.
This is a pattern. Since F5 acquired NGINX in 2019, open-source NGINX has become a neglected vehicle for selling NGINX Plus. Critical bugs in the open-source version languish for months or years. Community contributions go unreviewed. The message is clear: if you want fixes, buy the commercial product.
We think there’s a better path.
Our Fix: Close Stale Reuseport Sockets
The fix is surgical. In ngx_close_listening_sockets(), instead of skipping all QUIC sockets, we only skip non-reuseport ones:
```diff
 #if (NGX_QUIC)
         if (ls[i].quic) {
-            continue;
+            if (!ls[i].reuseport) {
+                continue;
+            }
+
+            /*
+             * Close QUIC reuseport sockets to remove the exiting worker
+             * from the reuseport group, preventing new QUIC connections
+             * from being routed to this worker during graceful shutdown.
+             */
         }
 #endif
```
That’s it. When a QUIC listening socket has reuseport enabled, it gets closed during graceful shutdown — just like every other listening socket. This removes the old worker’s socket from the kernel’s reuseport group immediately. All new QUIC Initial packets now route exclusively to new worker sockets.
Non-reuseport QUIC sockets continue to be kept open, preserving the ability for old workers to finish servicing existing QUIC connections.
Why this works:
- With `quic_bpf on`, NGINX already sets `ls->ignore = 1`, which means each worker creates its own fresh socket. Old and new workers don’t share sockets.
- Existing QUIC connections on the old worker use CID-based routing through the BPF map. But the old worker’s BPF map entries were already invalidated when the new workers started with a fresh map, so the old worker’s reuseport socket isn’t serving existing connections either — it’s purely dead weight in the reuseport group.
- Closing it is not just safe; it’s the only correct behavior.
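The same toy setup as in the root-cause section shows why closing the stale socket is sufficient: once it leaves the reuseport group, the kernel routes every new flow to the survivor. Again a local sketch under Linux SO_REUSEPORT assumptions, not NGINX code:

```python
import socket

def group_member(port):
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
    s.bind(("127.0.0.1", port))
    s.settimeout(0.2)
    return s

stale = group_member(0)          # stands in for the exiting worker's socket
port = stale.getsockname()[1]
live = group_member(port)        # stands in for the new worker's socket

stale.close()                    # the patched behavior: leave the group

for _ in range(32):
    c = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    c.sendto(b"initial", ("127.0.0.1", port))
    c.close()

answered = 0
try:
    while True:
        live.recvfrom(64)
        answered += 1
except socket.timeout:
    pass

print(f"{answered}/32 handshakes reached the live worker")  # 32/32, every time
```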
Before and After
We tested with 5 consecutive reloads, 20 HTTP/3 requests each:
| Scenario | Restart | Reload 1 | Reload 2 | Reload 3 | Reload 4 | Reload 5 |
|---|---|---|---|---|---|---|
| Unpatched | 20/20 | 10/20 | 9/20 | 11/20 | 10/20 | 10/20 |
| Patched (nginx-mod) | 20/20 | 20/20 | 20/20 | 20/20 | 20/20 | 20/20 |
100% success rate across all reloads with the patch. The fix is deterministic — not a timing improvement, but a complete elimination of the failure mode.
Get the Fix Now with nginx-mod
nginx-mod is a better NGINX: community-driven, actively patched, with fixes that upstream ignores. This QUIC reload fix ships in release 37.
Install on RHEL, CentOS, AlmaLinux, Rocky Linux, Fedora, or SUSE:
```shell
sudo dnf -y install https://extras.getpagespeed.com/release-latest.rpm
sudo dnf -y install nginx-mod
```

Or if you’re upgrading from stock NGINX:

```shell
sudo dnf -y swap nginx nginx-mod
```
Then verify HTTP/3 works after reload:
```shell
sudo nginx -s reload
curl --http3-only -I https://your-domain.com
```
This isn’t the first upstream bug we’ve fixed. nginx-mod also patches the empty $http_host bug in HTTP/3 that upstream shipped a partial fix for months later.
Subscribe to the GetPageSpeed repository for ongoing access to nginx-mod and 1,000+ other RPM packages.
Workaround Without nginx-mod
If you can’t switch to nginx-mod, your options are limited:
Option 1: Disable quic_bpf
```nginx
# Remove or comment out:
# quic_bpf on;
```
Without `quic_bpf`, NGINX doesn’t set `ls->ignore = 1`, so sockets are inherited normally during reload rather than being recreated. The kernel distributes QUIC packets across inherited sockets, and both old and new workers can handle them.
The downside: you lose CID-based BPF routing. Without it, QUIC packets for existing connections may land on the wrong worker after reload, breaking those connections. You’re trading “new connections fail” for “existing connections break.” Not a real solution — just a different flavor of broken.
Option 2: Use nginx -s stop && nginx instead of reload
A full restart avoids the bug entirely since there are no old workers with stale sockets. But you lose all active connections (HTTP/1.1, HTTP/2, and QUIC) during the restart window. Unacceptable for production.
Option 3: Apply the patch yourself
Download the patch and rebuild NGINX from source. If you’re already building from source, this is straightforward. If you’re using distro packages, this is a maintenance burden you probably don’t want.
Who Is Affected
You are affected if all of these are true:
- NGINX with QUIC/HTTP3 support (1.25.0+)
- `quic_bpf on` in your config
- `listen ... quic reuseport` in your config
- `worker_processes` > 1
- You use `nginx -s reload` (or `systemctl reload nginx`)
If you’re running HTTP/3 in production on Linux, you almost certainly have all of these. The `quic_bpf on` directive is recommended in every QUIC deployment guide, and `reuseport` is required for multi-worker QUIC.
Wrapping Up
This bug silently drops production HTTP/3 traffic every time you reload NGINX. It’s been known and reported for over a year. Community fixes exist but go unreviewed. F5 has the resources to fix this in an afternoon — they choose not to.
nginx-mod exists because the open-source NGINX community deserves a build that actually ships fixes. Get it today.

