
Varnish – cache on cookies

While learning Varnish, I have stumbled upon this topic time and again. By default, Varnish does not deliver cached pages for requests that carry cookies, and it does not cache pages whose backend response contains a Set-Cookie header.

The standard approach to leverage Varnish with a PHP app is to strip all cookies but the ones that are absolutely necessary. When a client sends a request for a page with an essential app cookie (e.g. logged in user) – the page is delivered uncached. Other times (e.g. guest user) the page is delivered from the cache.
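
As a rough sketch of that standard approach, the VCL below keeps a single cookie and drops the rest. The cookie name sessionid is a hypothetical stand-in for whatever cookie your app really needs; the chain of regsuball calls is a commonly used idiom, not something specific to this article.

```vcl
sub vcl_recv {
    if (req.http.Cookie) {
        # Normalize separators so every cookie starts with ";"
        set req.http.Cookie = ";" + req.http.Cookie;
        set req.http.Cookie = regsuball(req.http.Cookie, "; +", ";");
        # Mark the cookie to keep ("sessionid" is an assumed name) with a space
        set req.http.Cookie = regsuball(req.http.Cookie, ";(sessionid)=", "; \1=");
        # Drop every cookie that was not marked
        set req.http.Cookie = regsuball(req.http.Cookie, ";[^ ][^;]*", "");
        # Trim leftover separators
        set req.http.Cookie = regsuball(req.http.Cookie, "^[; ]+|[; ]+$", "");

        if (req.http.Cookie == "") {
            # No essential cookie left: the request can be served from cache
            unset req.http.Cookie;
        }
    }
}
```

Requests that still carry the essential cookie fall through to the builtin.vcl logic and are passed to the backend; everything else becomes cacheable.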

With this approach, we miss out on caching for logged-in users, and for any other case where users should see different content (for example, by language or timezone).

If we want Varnish to cache those pages as well, we need a few bits of VCL to make things right 🙂

This is a typical case where we have a cookie that represents a session ID of some kind, or a language/currency preference.
Caching on such cookies allows for caching user-specific content.

Suppose that some pages receive requests with a ‘Cookie: mycookie=’ header, and we want to cache those pages for each cookie value individually.

The first thing to account for is that the default builtin.vcl does not allow a request with a Cookie header to be served from the cache:

if (req.http.Authorization || req.http.Cookie) {
    /* Not cacheable by default */
    return (pass);
}

Such a request goes straight to the backend. We want to change that. Your own vcl_recv should end with a return statement: its presence ensures that the builtin.vcl logic for this subroutine is not run.

In your own vcl_recv, put:

# ....
if (req.http.Authorization) {
    /* Not cacheable by default */
    return (pass);
}

return(hash);
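
Keep in mind that returning from vcl_recv skips all of the builtin.vcl logic for that subroutine, not only the cookie check. The sketch below shows one way a complete vcl_recv might look if you want to keep passing non-GET/HEAD requests to the backend. It is a simplified assumption of what a site needs, not a full copy of the builtin logic.

```vcl
sub vcl_recv {
    # ... your own cookie cleanup goes here ...

    if (req.method != "GET" && req.method != "HEAD") {
        # Only GET and HEAD responses are cached
        return (pass);
    }

    if (req.http.Authorization) {
        # Not cacheable by default
        return (pass);
    }

    # Look up the cache even when a Cookie header is present
    return (hash);
}
```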

The second thing to do is adjust (or add) the vcl_hash subroutine, to tell Varnish that the cache for a page should differ based on the value of the cookie we want to cache on.

sub vcl_hash {
  if (req.http.cookie ~ "mycookie=") {
    # extract the value of mycookie into a temporary header
    set req.http.X-TMP = regsub(req.http.cookie, "^.*?mycookie=([^;]+);*.*$", "\1");
    # make the cookie value part of the cache key
    hash_data(req.http.X-TMP);
    unset req.http.X-TMP;
  }
  # no return here: the builtin.vcl will take care of also varying cache on Host/IP and URL
}

What does it do?

The regsub call extracts the value of mycookie from the Cookie header, and hash_data adds that value to the cache key. The result is that different values of mycookie are cached separately, and if the backend emits different content based on the cookie value, we cache those variations efficiently.

But what if we have a lot of such cookies? Stuffing VCL with lengthy regular expressions is neither readable nor clean.
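
A variation on the vcl_hash above, offered only as a sketch: the pattern is anchored so that a cookie whose name merely ends in mycookie (say, a hypothetical notmycookie) does not match, and the extracted value is hashed directly without a temporary header.

```vcl
sub vcl_hash {
    # Match mycookie only at the start of the header or right after a ";"
    if (req.http.Cookie ~ "(^|;\s*)mycookie=") {
        hash_data(regsub(req.http.Cookie,
            "^(?:.*;\s*)?mycookie=([^;]*).*$", "\1"));
    }
    # No return here, so builtin.vcl still hashes the URL and Host/IP
}
```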

Varnish Cache is easily extendable with modules (VMODs). One module that lets you deal with cookies efficiently is, as you’ve guessed, the cookie VMOD.

For recent Varnish versions (6.4 and later), no installation is required: the module is part of the Varnish core.

For Varnish 4.x and 6.0.x LTS, it is available via varnish-modules package.

Our commercial repository has got you covered.

For CentOS/RHEL 6 or 7 (Varnish 4.x is default); CentOS/RHEL 8 or Amazon Linux 2 (Varnish 6.0.x is default):

sudo yum -y install https://extras.getpagespeed.com/release-latest.rpm
sudo yum -y install varnish-modules

If you want to use Varnish 6.0.x LTS with its module packages on CentOS/RHEL 6 or 7, you should run the following instead:

sudo yum -y install https://extras.getpagespeed.com/release-latest.rpm
sudo yum install yum-utils
sudo yum-config-manager --enable getpagespeed-extras-varnish60
sudo yum install varnish-modules

Caching on multiple cookies

Now let’s extend our example from earlier and introduce another cookie named mycookie2.
Our backend generates different pages for different values of mycookie and mycookie2
(let’s say mycookie2 is a language preference, while mycookie is a session ID).

The approach suggested on the mailing list (a useful Varnish resource) is to use the cookie VMOD:

I highly recommend using vmod cookie to avoid the regex madness. I’d also extract the cookies into their headers and hash them unconditionally

Using the cookie VMOD, it is easy to cache on both cookies:

import cookie;

sub vcl_recv {
    cookie.parse(req.http.cookie);
    set req.http.mycookie = cookie.get("mycookie");
    set req.http.mycookie2 = cookie.get("mycookie2");
    unset req.http.cookie;
}

sub vcl_hash {
    hash_data(req.http.mycookie);
    hash_data(req.http.mycookie2);
}
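
One thing to keep in mind with the VCL above: once req.http.cookie is unset, the backend no longer receives a Cookie header, only the extracted mycookie and mycookie2 headers. If your application reads the cookies themselves, one possible workaround is to rebuild a minimal Cookie header before the fetch. This is a sketch under that assumption, not part of the mailing list advice.

```vcl
sub vcl_backend_fetch {
    # Rebuild a Cookie header from the values extracted in vcl_recv
    if (bereq.http.mycookie) {
        set bereq.http.Cookie = "mycookie=" + bereq.http.mycookie;
    }
    if (bereq.http.mycookie2) {
        if (bereq.http.Cookie) {
            set bereq.http.Cookie = bereq.http.Cookie + "; mycookie2=" + bereq.http.mycookie2;
        } else {
            set bereq.http.Cookie = "mycookie2=" + bereq.http.mycookie2;
        }
    }
}
```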

Now different cookie values are cached separately. Note, however, that the more cookies you cache on, the more heavily your cache is partitioned, and the worse your cache hit ratio becomes.
So if you have to cache on many cookies, but not every page actually differs based on their values, you might want to add conditional logic based on the URL:

import cookie;

sub vcl_recv {
    if (req.url ~ "/area/where/these/cookies/matter/") {
        cookie.parse(req.http.cookie);
        set req.http.mycookie = cookie.get("mycookie");
        set req.http.mycookie2 = cookie.get("mycookie2");
        unset req.http.cookie;
    }
}

sub vcl_hash {
    if (req.http.mycookie) {
        hash_data(req.http.mycookie);
    }
    if (req.http.mycookie2) {
        hash_data(req.http.mycookie2);
    }
}
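
To check that the variations are cached as intended, it can help to expose the cache status in a response header. The snippet below is a small debugging sketch; the X-Cache header name is an arbitrary choice, not something Varnish sets on its own.

```vcl
sub vcl_deliver {
    # obj.hits counts how many times this cached object has been delivered
    if (obj.hits > 0) {
        set resp.http.X-Cache = "HIT";
    } else {
        set resp.http.X-Cache = "MISS";
    }
}
```

Requesting the same URL twice with the same mycookie value should then show a MISS followed by a HIT, while a new cookie value should start again with a MISS.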
