Varnish – cache on cookies



In an ongoing process of learning Varnish, I’ve stumbled upon this topic now and then. By default, Varnish does not serve cached pages for requests that carry cookies, and it does not cache responses that include a Set-Cookie header from the backend.

The standard approach to leveraging Varnish with a PHP app is to strip all cookies except the ones that are absolutely necessary. When a client sends a request with an essential app cookie (e.g. a logged-in user), the page is delivered uncached. Otherwise (e.g. a guest user), the page is delivered from the cache.

With this approach, we miss out on caching for logged-in users (and in other cases where users should be presented with different content, for example, based on language or timezone).

If we want Varnish to cache those pages as well, we need a few bits of VCL to make things right 🙂

This is a typical case where we have a cookie that represents a session ID of some kind, or a language/currency preference.
Caching on such cookies allows for caching user-specific content.

Suppose that some pages receive requests with a ‘Cookie: mycookie=<value>’ header, and we want to cache those pages separately for each cookie value.

The first thing to account for is that the default builtin.vcl does not allow a request with a Cookie header to be served from cache:

if (req.http.Authorization || req.http.Cookie) {
    /* Not cacheable by default */
    return (pass);
}

Such a request goes straight to the backend. We want to change that. The builtin.vcl logic for a subroutine runs only when your own version of that subroutine finishes without a return statement, so ending your vcl_recv with an explicit return ensures that the builtin check above is skipped.

In your own vcl_recv, put:

# ....
if (req.http.Authorization) {
    /* Not cacheable by default */
    return (pass);
}

return(hash);
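
Because an explicit return skips the whole builtin vcl_recv, it also skips the builtin rule that only GET and HEAD requests are looked up in cache. A minimal sketch of a complete file that keeps that rule could look like the following. The backend address and port are placeholders, not values from this article:

```vcl
vcl 4.0;

# Placeholder backend; the host and port are assumptions for illustration.
backend default {
    .host = "127.0.0.1";
    .port = "8080";
}

sub vcl_recv {
    # Same rule as builtin.vcl: only GET and HEAD are candidates for caching.
    if (req.method != "GET" && req.method != "HEAD") {
        return (pass);
    }

    # Requests with an Authorization header are still not cacheable.
    if (req.http.Authorization) {
        return (pass);
    }

    # Look up in cache even when a Cookie header is present.
    return (hash);
}
```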

The second thing to do is add or adjust the vcl_hash subroutine to tell Varnish that the cached copy of a page should differ based on the value of the cookie we want to cache on.

sub vcl_hash {
  if (req.http.cookie ~ "mycookie=") {
    # Copy the value of mycookie into a temporary request header
    set req.http.X-TMP = regsub(req.http.cookie, "^.*?mycookie=([^;]+);*.*$", "\1");
    hash_data(req.http.X-TMP);
    unset req.http.X-TMP;
  }
  # No return here: the builtin.vcl will take care of also varying cache on Host/IP and URL
}

What does it do?

  • If mycookie is present in the request headers, we create an X-TMP header for internal use by Varnish.
  • The value of the X-TMP header will be the value of the mycookie cookie.
  • Then we tell Varnish that the cache should vary based on the value found in X-TMP: hash_data(req.http.X-TMP);
  • Since we no longer need this internal header, we remove it: unset req.http.X-TMP;

The result is that different values of mycookie are cached separately, and if the backend emits different content based on the cookie value, we cache those variations efficiently.
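
The steps above can also be collapsed into a single call, since hash_data() accepts the string returned by regsub() directly. The following is only a sketch. The slightly stricter pattern, which requires mycookie to start the header or follow a semicolon, is my own variation and not part of the original example:

```vcl
sub vcl_hash {
    # Match mycookie only as a whole cookie name, not as the tail of another name.
    if (req.http.Cookie ~ "(^|; *)mycookie=") {
        # Feed the cookie value into the hash without a temporary header.
        hash_data(regsub(req.http.Cookie, "^(.*; *)?mycookie=([^;]*).*$", "\2"));
    }
    # No return: the builtin vcl_hash still adds the URL and Host (or server IP).
}
```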

But what if we have a lot of such cookies? Stuffing VCL with lengthy regular expressions is neither readable nor clean.

Varnish Cache is easily extendable with modules (VMODs). One module that lets you deal with cookies efficiently is, as you’ve guessed, the cookie VMOD.

For bleeding-edge Varnish versions, no installation is required: the cookie VMOD is part of the Varnish core.

For Varnish 4.x and 6.0.x LTS, it is available via the varnish-modules package.

Our commercial repository has got you covered.

For CentOS/RHEL 6 or 7 (Varnish 4.x is default); CentOS/RHEL 8 or Amazon Linux 2 (Varnish 6.0.x is default):

sudo yum -y install https://extras.getpagespeed.com/release-latest.rpm
sudo yum -y install varnish-modules

If you want to use Varnish 6.0.x LTS with its module packages on CentOS/RHEL 6 or 7, you should run the following instead:

sudo yum -y install https://extras.getpagespeed.com/release-latest.rpm
sudo yum install yum-utils
sudo yum-config-manager --enable getpagespeed-extras-varnish60
sudo yum install varnish-modules

Caching on multiple cookies

Now let’s extend our earlier example and introduce another cookie named mycookie2.
Our backend generates different pages depending on the values of mycookie and mycookie2
(let’s say mycookie2 is a language preference, while mycookie is a session ID).

The approach suggested on the mailing list (a useful Varnish resource) is to use the cookie VMOD:

I highly recommend using vmod cookie to avoid the regex madness. I’d also extract the cookies into their headers and hash them unconditionally

Using the cookie VMOD, it is easy to cache on both cookies:

import cookie;

sub vcl_recv {
    cookie.parse(req.http.cookie);
    set req.http.mycookie = cookie.get("mycookie");
    set req.http.mycookie2 = cookie.get("mycookie2");
    unset req.http.cookie;
}

sub vcl_hash {
    hash_data(req.http.mycookie);
    hash_data(req.http.mycookie2);
}
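
One consequence of unsetting req.http.cookie in vcl_recv is that the backend no longer receives a Cookie header, only the extracted mycookie and mycookie2 request headers. If the backend reads the values from cookies (an assumption; how the backend picks them up is not stated here), a sketch like the following could rebuild the header just before the fetch. The `~ "."` test is there so that a missing or empty header is skipped:

```vcl
sub vcl_backend_fetch {
    # Assumption: the backend expects the values in a Cookie header.
    if (bereq.http.mycookie ~ "." && bereq.http.mycookie2 ~ ".") {
        set bereq.http.Cookie = "mycookie=" + bereq.http.mycookie
            + "; mycookie2=" + bereq.http.mycookie2;
    } elsif (bereq.http.mycookie ~ ".") {
        set bereq.http.Cookie = "mycookie=" + bereq.http.mycookie;
    } elsif (bereq.http.mycookie2 ~ ".") {
        set bereq.http.Cookie = "mycookie2=" + bereq.http.mycookie2;
    }
}
```

This only affects what is sent to the backend on a cache miss; the cache key is still built from the extracted headers in vcl_hash.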

Now different cookie values are cached separately. Note, of course, that the more cookies you cache on, the more finely your cache is partitioned and, consequently, the worse your cache hit ratio will be.
So if you have to cache on many cookies, but not every page actually differs based on their values, you might want to add conditional logic for URL checks:

import cookie;

sub vcl_recv {
    if (req.url ~ "/area/where/these/cookies/matter/") {
        cookie.parse(req.http.cookie);
        set req.http.mycookie = cookie.get("mycookie");
        set req.http.mycookie2 = cookie.get("mycookie2");
        unset req.http.cookie;
    }
}

sub vcl_hash {
    if (req.http.mycookie) {
        hash_data(req.http.mycookie);
    }
    if (req.http.mycookie2) {
        hash_data(req.http.mycookie2);
    }
}

Suggested read:

  • Varnish Cache vs Cookies, part 1 describes the recommended way to cache regardless of any cookies being present (Warning: this is applicable only to sane backends. WordPress is not one of them).
    Such a configuration requires a backend that sends proper caching headers for user-specific content (e.g. Vary:), that uses header- or token-based user authentication, or that has no sensitive content to begin with. This is entirely different from the “caching on cookies” approach above, which partitions the cache.
  1. sina

    Hi, I have the same problem. I want to cache all pages of my website, but not the cookies. I want to get a fresh PHPSESSID and other user-defined cookies while the request is served from cache. For example, the first value is PHPSESSID=ev4vfmf0iukl9j0sn509bvuv7, and if I clear the cookies in my browser, I get a fresh value for PHPSESSID. I did as you said in this article:

       if (req.http.cookie ~ "PHPSESSID=") {
        set req.http.X-SESSION = regsuball(req.http.Cookie, ";(PHPSESSID)=", "; \1=");
        hash_data(req.http.X-SESSION);
        unset req.http.X-SESSION;
      }
    

    but this has not resolved my problem. I still cannot see the PHPSESSID in the response headers in Chrome.

    • Danila Vershinin

      If you cannot see the PHPSESSID in HTTP response headers, this only means that you have extra VCL code which unsets the cookie when your server sends it.
      Obviously, that code has to be removed.

  2. sina

    I changed my VCL config. Now I can see the PHPSESSID in the request headers in Chrome, and it seems to work as I expect: when I remove the cookies, I get a new value for PHPSESSID. But I still have two problems. The first is that when I remove the cookies from the browser, I have to refresh the page at least 3 times to get the page from cache! The other is that I want to have the values of four more cookies, but with my VCL code I can only see the PHPSESSID! Here’s my VCL code:

    sub vcl_recv {
     set req.backend_hint = apache.backend();
    
    if (req.http.Cookie)
       {
        set req.http.Cookie = ";" + req.http.Cookie;
        set req.http.Cookie = regsuball(req.http.Cookie, "; +", ";");
        set req.http.Cookie = regsuball(req.http.Cookie, ";(used_ads|used_doctors|used_natives|PHPSESSID|csrf)=", "; \1=");
        set req.http.Cookie = regsuball(req.http.Cookie, ";[^ ][^;]*", "");
        set req.http.Cookie = regsuball(req.http.Cookie, "^[; ]+|[; ]+$", "");
    
        if (req.http.Cookie == "") {
            unset req.http.Cookie;
        }
    
    }
    
    if ((req.url ~ "admin-ajax.php") && !req.http.cookie ~ "wordpress_logged_in" ) {
          return (hash);
      }
    
    if (req.http.X-Requested-With == "XMLHttpRequest") {
            return(pass);
        }
    
    if (req.method != "GET" && req.method != "HEAD") {
          return (pass);
      }
    
    if (req.url ~ "(wp-admin|post.php|edit.php|wp-login|forms)") {
            return(pass);
      }
      if (req.url ~ "/wp-cron.php" || req.url ~ "preview=true") {
            return (pass);
      }
    
    if (req.http.Authorization) {
            return(pass);
      }
    
    if (req.http.Accept-Encoding)
      {
        if (req.url ~ ".(png|jpg|jpeg|gz|tgz|bz2|tbz|mp3|ogg|swf|flv)$")
            {
                    unset req.http.Accept-Encoding;
            }
            elsif (req.http.Accept-Encoding ~ "gzip")
            {
                    set req.http.Accept-Encoding = "gzip";
            }
            elsif (req.http.Accept-Encoding ~ "deflate" && req.http.user-agent !~ "MSIE")
            {
                    set req.http.Accept-Encoding = "deflate";
            }
            else
            {
                    unset req.http.Accept-Encoding;
            }
      }
    
        set req.http.X-Forwarded-For = req.http.X-Forwarded-For;
    
    
        unset req.http.Accept-Language;
        unset req.http.User-Agent;
    
        set req.http.cookie = regsuball(req.http.cookie, "wp-settings-\d+=[^;]+(; )?", "");
        set req.http.cookie = regsuball(req.http.cookie, "wp-settings-time-\d+=[^;]+(; )?", "");
        set req.http.Host = regsub(req.http.Host, ":[0-9]+", "");
    
        #cache everything left behind
        return(hash);
    
    }
    
    sub vcl_hash
    {
    
    if (req.http.cookie ~ "(used_ads|used_doctors|used_natives|PHPSESSID|csrf)") {
        set req.http.X-TMP = regsuball(req.http.Cookie, ";(used_ads|used_doctors|used_natives|PHPSESSID|csrf)=", "; \1=");
        hash_data(req.http.X-TMP);
        unset req.http.X-TMP;
      }
    
    }
    
    sub vcl_backend_response {
            if ((bereq.url ~ "admin-ajax.php") && !bereq.http.cookie ~ "wordpress_logged_in" ) {
                unset beresp.http.set-cookie;
                set beresp.ttl = 12h;
             }
    
        if (!(bereq.url ~ "wp-(login|admin)|login|admin-ajax.php|forms"))
    
        {
                set beresp.ttl = 6h;
        }
    
    
        set beresp.grace = 2h;
    
    
    if (beresp.http.Location == "https://" + bereq.http.host + bereq.url)
    {
       if (bereq.retries > 1)
       {
          unset beresp.http.Location;
       }
       else
       {
          return (retry);
       }
    }
    
    }
    
    sub vcl_deliver {
    
    if (obj.hits > 0)
    {
           set resp.http.X-Status = "1";
    }
    else
    {
           set resp.http.X-Status = "0";
    }
    
    unset resp.http.X-Varnish;
     unset resp.http.Via;
     unset resp.http.X-Powered-By;
     unset resp.http.Server;
    
    }
    
  3. Danila Vershinin

    If your app is WordPress, you should rather not cache at all in the presence of WordPress-specific cookies.
    You can cache user sessions, but that means you should also develop code that talks to Varnish and invalidates the user’s cache when something changes for that particular user, or just use a very short TTL.

    Also, PHPSESSID is a regular PHP cookie name, which means one of the plugins is not following WordPress conventions. It is best to get rid of such plugins.
