NGINX Microcaching: Varnish-Style Full-Page Caching Without a Third-Party Module

Your WordPress homepage takes 600 to 900 ms to build. Your API endpoint runs the same database query a thousand times a second. The pages barely change from one second to the next, yet every visitor pays the full backend cost. The usual answer is to put Varnish in front of NGINX: a second daemon, a second port, its own configuration language, its own process to babysit.

You do not need it. NGINX core already does Varnish-style full-page caching, and it does it well. A one-second cache turns a page that is rebuilt on every hit into one that is rebuilt roughly once per second, with everyone else served a copy straight out of RAM. That technique is called NGINX microcaching, and this article shows you two ways to run it: the core proxy_cache recipe that needs zero extra modules, and a full-RAM variant built on open modules that GetPageSpeed already packages.

What microcaching actually is

Microcaching is full-page caching with a deliberately tiny time-to-live, usually one second. That sounds pointless until you do the arithmetic.

If a page is requested 5,000 times a second and you cache it for one second, the backend is hit about once per second instead of 5,000 times. That is a 5,000-to-1 reduction in backend load, and the cached copy is roughly one second stale at worst (one second plus however long a refresh takes). For a news homepage, a product listing, or a JSON API that aggregates slow upstream calls, one second of staleness is invisible to users and transformative for your servers.
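
The arithmetic can be checked with a few lines of shell. This is a minimal sketch: offload_ratio is a hypothetical helper name, and it assumes a single cached URL whose misses are coalesced into exactly one upstream fetch per TTL window, which is an idealization.

```shell
#!/bin/sh
# offload_ratio RPS TTL_SECONDS
# Prints how many client requests share a single backend fetch, assuming one
# cached URL and exactly one upstream fetch per TTL window (an idealization).
offload_ratio() {
    rps=$1
    ttl=$2
    echo $(( rps * ttl ))
}

offload_ratio 5000 1    # prints 5000: 5,000 client requests per backend fetch
```

Raising the TTL to two seconds doubles the ratio, at the cost of doubling the worst-case staleness.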

The two things people fear about caching dynamic content both have clean answers in NGINX core:

  • “The cache will go stale during a traffic spike.” Only marginally: the copy is never older than your TTL plus the time one refresh takes, and that refresh happens in the background while visitors keep getting instant responses.
  • “A cache miss will stampede my backend.” It will not, because NGINX can collapse simultaneous misses for the same page into a single upstream request.

Each answer takes a directive or two. Here is the whole thing.

The core recipe: proxy_cache on tmpfs

This needs no modules at all. It is built entirely from directives that ship in stock NGINX and are documented at nginx.org.

First, declare a cache zone. The important detail is the path: /dev/shm is tmpfs, a RAM-backed filesystem, so the cache lives in memory with no physical disk I/O.

# in the http { } context
proxy_cache_path /dev/shm/nginx-micro levels=1:2 keys_zone=micro:10m
                 max_size=256m inactive=10s use_temp_path=off;
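
Before relying on /dev/shm, it may be worth confirming that it really is tmpfs and has room for max_size. The sketch below assumes GNU coreutils for the stat call and a POSIX df; has_room is a hypothetical helper, not an NGINX tool.

```shell
#!/bin/sh
# has_room PATH SIZE_MB
# Succeeds when the filesystem holding PATH has at least SIZE_MB megabytes free.
has_room() {
    # df -P keeps each filesystem on one line; column 4 is available space in KB
    avail_kb=$(df -Pk "$1" | awk 'NR == 2 { print $4 }')
    [ "$avail_kb" -ge $(( $2 * 1024 )) ]
}

# Typical use on the NGINX host:
#   stat -f -c %T /dev/shm     # GNU stat; should print "tmpfs"
#   has_room /dev/shm 256 && echo "enough room for max_size=256m"
```

Keep in mind that whatever the cache stores on tmpfs counts against the host's memory, so max_size should leave headroom for NGINX and the backend.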

Then turn it on for a location:

server {
    listen 80;
    server_name _;

    location / {
        proxy_cache micro;
        proxy_cache_valid 200 301 302 1s;      # the micro-TTL: one second
        proxy_cache_use_stale updating error timeout
                              http_500 http_502 http_503 http_504;
        proxy_cache_background_update on;       # refresh in the background
        proxy_cache_lock on;                    # collapse the stampede on a miss

        add_header X-Cache-Status $upstream_cache_status;

        proxy_pass http://127.0.0.1:8080;
    }
}

That is the entire microcache. Every line is core NGINX. What each part buys you:

  • proxy_cache_path ... /dev/shm keeps the cache in RAM. There is no disk seek on a hit. Even if you point it at a normal directory, the Linux page cache keeps hot entries in memory anyway, but tmpfs makes it explicit and bounded.
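  • proxy_cache_valid ... 1s is the micro-TTL. A cached entry is considered fresh for one second. Upstream X-Accel-Expires, Expires, and Cache-Control headers take precedence over this value, and responses carrying Set-Cookie are not cached at all, unless you tell NGINX to disregard those headers with proxy_ignore_headers.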
  • proxy_cache_use_stale updating is the heart of stale-while-revalidate. When an entry is being refreshed, every other request keeps getting the stale copy instantly instead of waiting. The error timeout http_50x parameters add stale-if-error: if the backend falls over, visitors keep seeing the last good page instead of a 502.
  • proxy_cache_background_update on (NGINX 1.11.10 and later, so every current release) does the refresh in a background subrequest, so no single visitor ever eats the rebuild latency.
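  • proxy_cache_lock on (NGINX 1.1.12 and later) is request coalescing. When ten requests for an uncached page arrive at once, one goes to the backend and the other nine wait for it to fill the cache, instead of all ten stampeding your origin. The wait is bounded by proxy_cache_lock_timeout (5 seconds by default), after which a waiting request is passed to the backend itself.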
  • $upstream_cache_status is a built-in variable that reports MISS, HIT, EXPIRED, STALE, UPDATING, REVALIDATED, or BYPASS. Exposing it as a header makes the cache observable from the outside.

How stale-while-revalidate works here

Walk through the life of a hot page under this config:

  1. First request: nothing is cached. X-Cache-Status: MISS. NGINX fetches from the backend, stores the response in the tmpfs zone, and serves it.
  2. Next requests, within one second: X-Cache-Status: HIT. Served straight from RAM. The backend is not touched at all.
  3. One second later: the entry is now stale. The first request to arrive triggers a background refresh and is itself served the stale copy immediately. You will see X-Cache-Status: UPDATING or STALE on that request, not a pause.
  4. While that refresh is in flight: everyone else also gets the stale copy instantly, because of proxy_cache_use_stale updating. Nobody waits on the backend.
  5. Once the refresh completes: the cycle resets to step 2 with fresh content.
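
The lifecycle above can be watched from a shell. This is a rough sketch, assuming the X-Cache-Status header from the recipe, a curl binary, and a sleep that accepts fractional seconds (GNU and BusyBox do); cache_status and watch_cache are hypothetical helper names.

```shell
#!/bin/sh
# cache_status: read HTTP response headers on stdin, print the X-Cache-Status value
cache_status() {
    tr -d '\r' | awk -F': ' 'tolower($1) == "x-cache-status" { print $2 }'
}

# watch_cache URL [COUNT]: poll URL every half second and print one status per poll
watch_cache() {
    url=$1
    count=${2:-10}
    i=0
    while [ "$i" -lt "$count" ]; do
        curl -sI "$url" | cache_status
        sleep 0.5
        i=$(( i + 1 ))
    done
}

# Example: watch_cache http://localhost/ 10
```

On a quiet test server you would expect mostly HIT, with an occasional STALE or UPDATING each time the one-second TTL lapses.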

The net effect is that your backend serves roughly one request per second per cached page, no matter how much traffic hits the edge, and no user ever waits for a rebuild. That is exactly the behavior people install a separate caching daemon to get.

Verifying it works

Hit the same URL twice and watch the status header:

curl -sI http://localhost/ | grep -i x-cache-status
# X-Cache-Status: MISS
curl -sI http://localhost/ | grep -i x-cache-status
# X-Cache-Status: HIT

To see the backend offload under load, run a quick benchmark and compare the request count at the edge with the count in your origin's logs. With a one-second microcache in front, the origin should see roughly one request per second per cached URL, however hard you drive the edge. The edge serves everything else from RAM, at a rate that depends on your hardware and response sizes.
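
One way to quantify the offload is to tally cache statuses. The sketch below assumes the statuses arrive one per line on standard input, for example extracted from response headers or from an access log in which you chose to record $upstream_cache_status; tally_statuses is a hypothetical helper.

```shell
#!/bin/sh
# tally_statuses: read one cache status per line, print "STATUS COUNT" per status
tally_statuses() {
    sort | uniq -c | awk '{ print $2, $1 }'
}

# Example with canned input; a real run would pipe in observed statuses
printf 'MISS\nHIT\nHIT\nSTALE\nHIT\n' | tally_statuses
# HIT 3
# MISS 1
# STALE 1
```

Only MISS and EXPIRED responses cost a client a backend round trip, so their share of the total is a reasonable proxy for how much work still reaches the origin.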

You can also confirm the cache really lives in memory:

ls -R /dev/shm/nginx-micro

The cache files live on tmpfs rather than on disk. The one exception is swap: tmpfs pages can be swapped out if the host runs short of memory.

Skipping the cache for logged-in users

Full-page caching and personalized pages have to coexist. The standard pattern is to cache anonymous traffic and bypass the cache entirely for anyone carrying a session cookie. Core NGINX does this with a map and two directives.

# in the http { } context
map $http_cookie $micro_bypass {
    default                 0;
    ~*wordpress_logged_in   1;
    ~*comment_author        1;
    ~*PHPSESSID             1;
}

# inside the cached location { }
proxy_cache_bypass $micro_bypass;   # don't serve a cached copy to logged-in users
proxy_no_cache     $micro_bypass;   # and don't store their personalized responses

proxy_cache_bypass makes the request skip the cache lookup (you will see X-Cache-Status: BYPASS), and proxy_no_cache keeps that user’s personalized response from ever entering the cache. Anonymous visitors get the fast cached path; logged-in users get fresh, private pages.
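
To sanity-check which cookies would trigger the bypass, the map's three patterns can be approximated with a case-insensitive grep. This is only an approximation for local testing, not NGINX's own regex evaluation; would_bypass is a hypothetical helper and the cookie values are made up.

```shell
#!/bin/sh
# would_bypass COOKIE_HEADER
# Succeeds when the Cookie header contains any marker the map treats as logged-in.
would_bypass() {
    printf '%s\n' "$1" | grep -Eiq 'wordpress_logged_in|comment_author|PHPSESSID'
}

would_bypass 'wordpress_logged_in_abc=1; theme=dark' && echo "bypass"
would_bypass 'theme=dark' || echo "cacheable"

# Against a live server (made-up cookie value):
#   curl -sI -H 'Cookie: PHPSESSID=test' http://localhost/ | grep -i x-cache-status
```

If the live request does not report BYPASS, check that the map sits in the http context and that both directives are inside the cached location.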

Full-RAM page caching with srcache and memcached

The proxy_cache recipe above is already RAM-resident on tmpfs and is all most sites need. Its cache is shared by every worker process on a host, but it is local to that host. If you want whole rendered pages held in one in-memory store that a fleet of edge nodes can share, the open-source srcache module stores and fetches full responses from Memcached or Redis.

GetPageSpeed packages every piece. The wiring is an internal location that talks to Memcached, plus two directives on your real location:

# in the http { } context
upstream memcached_pool {
    server 127.0.0.1:11211;
    keepalive 32;
}

server {
    listen 80;
    server_name _;

    # internal endpoint srcache uses to read and write the cache
    # (exact match, so no other URI can land in this location)
    location = /memc {
        internal;
        set $memc_key $query_string;
        set $memc_exptime 300;          # seconds a rendered page lives in RAM
        memc_pass memcached_pool;
    }

    location / {
        set $cache_key $request_uri;
        srcache_fetch GET /memc $cache_key;
        srcache_store PUT /memc $cache_key;

        add_header X-SRCache-Fetch $srcache_fetch_status;
        add_header X-SRCache-Store $srcache_store_status;

        proxy_pass http://127.0.0.1:8080;
    }
}

On the first request you will see X-SRCache-Store: STORE as the rendered page is written into Memcached; on the next, X-SRCache-Fetch: HIT as it is served straight from RAM. Because the store is Memcached, the cache lives outside any single NGINX instance, so pointing several edge nodes at the same Memcached (or Redis, via the redis2 module) gives you one cache shared across the whole fleet. Note that pages here live for $memc_exptime, 300 seconds in this example, rather than the one-second micro-TTL of the core recipe. The dedicated srcache guide walks through the Redis variant and key-naming in depth.
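
Because $request_uri becomes the Memcached key in this wiring, Memcached's key rules apply: at most 250 bytes, with no whitespace or control characters. The sketch below checks a candidate key against those rules. valid_memc_key is a hypothetical helper, and it counts characters, which equals bytes only for ASCII keys.

```shell
#!/bin/sh
# valid_memc_key KEY
# Succeeds when KEY is non-empty, at most 250 characters long, and free of
# whitespace and control characters.
valid_memc_key() {
    key=$1
    [ -n "$key" ] || return 1
    [ "${#key}" -le 250 ] || return 1
    stripped=$(printf '%s' "$key" | tr -d '[:space:][:cntrl:]')
    [ "$stripped" = "$key" ]
}

valid_memc_key '/products?page=2' && echo "ok"

long_key=$(printf '/%0300d' 0)          # a slash followed by 300 zeros
valid_memc_key "$long_key" || echo "too long"
```

If your URLs can exceed that limit, a common workaround is to hash the key before using it, for instance with set_md5 from the set-misc module; treat that as a suggestion to verify against your own setup.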

Installing the modules

The core microcache from the first half of this article needs nothing installed: proxy_cache, proxy_cache_use_stale, proxy_cache_lock, and friends are part of stock NGINX. Only the srcache plus Memcached recipe needs modules.

On RHEL, CentOS, AlmaLinux, Rocky Linux, and Amazon Linux:

sudo dnf install https://extras.getpagespeed.com/release-latest.rpm
sudo dnf install nginx-module-srcache nginx-module-memc nginx-module-redis2

Load them near the top of /etc/nginx/nginx.conf:

load_module modules/ngx_http_srcache_filter_module.so;
load_module modules/ngx_http_memc_module.so;
load_module modules/ngx_http_redis2_module.so;

On Debian and Ubuntu, set up the GetPageSpeed APT repository, then:

sudo apt-get update
sudo apt-get install nginx-module-srcache nginx-module-memc nginx-module-redis2

On Debian and Ubuntu the packages handle module loading automatically; no load_module line is needed.

Each of these is a small, single-purpose, open-source module. Keep that in mind when a vendor offers a large binary page-cache filter for the same job. Everything above is either core NGINX or an auditable module that the NGINX and OpenResty communities have maintained for years, so weigh what a big third-party filter would add to the process that terminates your TLS against directives that NGINX already documents and supports.

For package details, see the module pages for srcache, memc, and redis2.

When you would still reach for Varnish

Microcaching on NGINX does not replace a dedicated cache in every scenario. Reach for Varnish or a similar dedicated cache when you need:

  • Edge Side Includes (ESI) to assemble a page from independently cached fragments.
  • Complex request-time cache logic that is easier to express in VCL than in NGINX directives and maps.
  • A rich invalidation API, such as banning or purging large groups of objects by tag or pattern across a cluster. For straightforward key-based purging, the cache-purge module covers most needs.
  • Very large on-disk caches with sophisticated eviction, beyond what you want to keep in RAM.

For the common case, serving mostly-anonymous pages fast and shielding a slow backend from repeated work, the core recipe wins on simplicity: one daemon, one config, no extra moving parts.

Conclusion

Varnish-style full-page caching is not something you have to bolt onto NGINX from the outside. The core proxy_cache engine, pointed at tmpfs and combined with proxy_cache_use_stale updating, proxy_cache_background_update, and proxy_cache_lock, gives you NGINX microcaching with stale-while-revalidate and request coalescing built in, with zero extra modules and every directive documented and maintained upstream. When you want a full-RAM page cache shared across hosts, the open srcache plus Memcached or Redis stack extends the same idea without adding an unauditable component to your edge.

You already have everything you need. Point a one-second cache at /dev/shm, watch X-Cache-Status flip to HIT, and let your backend breathe.

Danila Vershinin

Founder & Lead Engineer

NGINX configuration and optimization • Linux system administration • Web performance engineering

10+ years NGINX experience • Maintainer of GetPageSpeed RPM repository • Contributor to open-source NGINX modules
