Varnish Virtual Hosts. The Right Way

Ever seen the snippet below for Varnish virtual hosts and wondered how you’re going to manage a dozen websites with a dozen if statements in your VCL file?

if (! req.http.Host) {
   error 404 "Need a host header";
}
set req.http.Host = regsub(req.http.Host, "^www\.", "");
set req.http.Host = regsub(req.http.Host, ":80$", "");

if (req.http.Host == "something.com") {
    include "/etc/varnish/site-something.com.vcl";
} elsif (req.http.Host == "somethingelse.com") {
   include "/etc/varnish/site-somethingelse.com.vcl";
}

Varnish is great, but it really lacks documentation and tutorials on setting up virtual hosts the right way.

Varnish Virtual Hosts

Why do we need virtual hosts in Varnish at all? It’s a caching server. By itself, it doesn’t care about the domain name present in a request. It simply passes the request along to the backend server or, if the response is already in the Varnish cache, serves it directly without talking to Nginx or Apache.

But we do need virtual hosts in Varnish, because different sites use different technologies, different login pages and, most importantly, different cookie names. Cookies are the primary reason Varnish virtual hosts are needed: they let us filter different cookies for each site.

In general, we need Varnish to distinguish between sites so that it can adjust its caching policy for each specific website.

There is no built-in way to do this, and there likely never will be. However, once you understand how VCL works, you can define your virtual hosts much like you would in Nginx: through sites-available and sites-enabled directories. So let’s go.

How Varnish VCL works

Before we proceed to implementing Varnish virtual hosts, let’s review the most important thing about VCL – how include files work.

When you start with a fresh Varnish installation, you write your rules in default.vcl. However, you have to realize one thing. There is another file with the base default VCL rules, which Varnish keeps internally; let’s call it builtin.vcl. Varnish appends the routines from builtin.vcl to the ones in our default.vcl, so they run after the ones in our VCL file.

The two files may define the same routine, e.g. vcl_recv, and both definitions then run on every request, in this order:

  1. vcl_recv from default.vcl
  2. vcl_recv from builtin.vcl

So when the same routine is defined more than once, the definitions stack up, and the one from the file included last is called last.
If we include another file, say my.vcl and define vcl_recv in there, Varnish will run it in this order:

  1. vcl_recv from default.vcl
  2. vcl_recv from my.vcl
  3. vcl_recv from builtin.vcl
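
The stacking order can be observed with a small sketch. It assumes a placeholder backend on 127.0.0.1:8080 and a made-up X-Trace request header, neither of which comes from the original setup. Since include is a textual insertion, the second vcl_recv below stands in for the one that would live in my.vcl:

```vcl
vcl 4.0;

# Assumption: placeholder backend, adjust to your own web server.
backend default {
    .host = "127.0.0.1";
    .port = "8080";
}

# First definition, as it would appear in default.vcl.
sub vcl_recv {
    set req.http.X-Trace = "default";
}

# Second definition, standing in for the contents of my.vcl.
# Varnish concatenates it after the first one.
sub vcl_recv {
    set req.http.X-Trace = req.http.X-Trace + ",my";
}

# Neither definition returns, so the built-in vcl_recv runs last.
```

With this loaded, a request that misses the cache should reach the backend carrying both markers in X-Trace, in the order the definitions appear.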

How is including multiple files useful?

To keep things flexible, Varnish does not call the same routine from files included later (or from builtin.vcl) once the routine in the current file executes a return(...) statement.

It means that we can prevent Varnish’s default behavior (found in the builtin VCL) by running specific logic in the same routine, and we can extend things further using include files.

So if vcl_recv had return(...) in default.vcl, then Varnish would only run:

  1. vcl_recv from default.vcl
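
A minimal sketch of this short-circuit follows. The host static.example.com, the X-Seen-By-Include header and the backend address are all hypothetical and serve only to illustrate the behavior:

```vcl
vcl 4.0;

# Assumption: placeholder backend, adjust to your own web server.
backend default {
    .host = "127.0.0.1";
    .port = "8080";
}

# As in default.vcl: for one host we decide everything right here.
sub vcl_recv {
    if (req.http.host == "static.example.com") {
        unset req.http.Cookie;
        # Go straight to cache lookup; later vcl_recv definitions
        # and the built-in vcl_recv are skipped for this request.
        return (hash);
    }
}

# Stands in for a vcl_recv from an included file. It is only
# reached by requests that did not hit the return above.
sub vcl_recv {
    set req.http.X-Seen-By-Include = "1";
}
```

Requests for any other host fall through both definitions and then into the built-in vcl_recv as usual.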

Varnish Virtual Hosts strategy

So here’s the strategy we should start with when we write our VCL for multiple hosts. Let’s look at the same routine, vcl_recv. It is the most important one, since it commonly holds the rules for filtering cookies or setting backend hints.

We assume you’re using CentOS/RHEL-based paths; adjust accordingly for Debian-derived systems.

First, create a directory holding your virtual hosts:

mkdir /etc/varnish/sites-enabled

Suppose we have a site, a.example.com: a WordPress blog with comments disabled. We want Varnish to ignore all cookies on it, except in the login and admin areas (/wp-login and /wp-admin). Let’s create the virtual host file:

nano /etc/varnish/sites-enabled/a.example.com.vcl

And paste in:

sub vcl_recv {
       if (req.http.host == "a.example.com") {
           # ignore all cookies on a WP site without comments (except for admin areas)
           if (req.url !~ "^/wp-(login|admin)") {
               unset req.http.cookie;
           }
       }
}

Another website of ours, b.example.com, is very different. It’s a Trac ticketing site served by a standalone Python app on a different port:

nano /etc/varnish/sites-enabled/b.example.com.vcl

And paste in:

backend trac {
    .host = "127.0.0.1";
    .port = "3050";
}

sub vcl_recv {
       if (req.http.host == "b.example.com") {
           set req.backend_hint = trac;
       }
}

Yet another website of ours runs WordPress with the WooCommerce plugin. We don’t want to cache WooCommerce pages there, so we run:

nano /etc/varnish/sites-enabled/c.example.com.vcl

And paste in:


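sub vcl_recv {
    if (req.http.host == "c.example.com") {
        # Never cache cart, account and checkout pages, or add-to-cart requests.
        # The "?" is escaped so that it matches a literal question mark.
        if (req.url ~ "/(cart|my-account|checkout|addons)|\?add-to-cart=") {
            return (pass);
        }
    }
}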
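
The same host check is not limited to vcl_recv. As a hedged sketch, a virtual host file could also tune the caching policy on the backend side. The one-hour TTL below is an arbitrary assumption, and the fragment is meant to sit in a file such as sites-enabled/a.example.com.vcl rather than be loaded on its own:

```vcl
sub vcl_backend_response {
    # bereq carries the (already normalized) Host header of the
    # request that triggered this backend fetch.
    if (bereq.http.host == "a.example.com"
        && bereq.url !~ "^/wp-(login|admin)") {
        # Assumption: one hour is only an example value.
        set beresp.ttl = 1h;
    }
}
```

Because the block does not return, the built-in vcl_backend_response still runs afterwards and can mark a response as uncacheable, for example when it sets a cookie.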

We use Google Analytics tracking on every website. So let’s put the handling that applies to all hosts into the file /etc/varnish/catch-all.vcl, with the following contents:

sub vcl_recv {
        set req.http.Cookie = regsuball(req.http.Cookie, "_ga=[^;]+(; )?", "");
        set req.http.Cookie = regsuball(req.http.Cookie, "_gat=[^;]+(; )?", "");
}
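
One thing to watch for: if the Google Analytics cookies were the only ones in the request, the rules above leave behind an empty Cookie header, and the built-in vcl_recv passes any request that still has a Cookie header. A possible addition to catch-all.vcl, shown here as a fragment and not as part of the original configuration, removes the header when nothing meaningful is left:

```vcl
sub vcl_recv {
    if (req.http.Cookie) {
        # Drop a separator left dangling at the start of the header.
        set req.http.Cookie = regsub(req.http.Cookie, "^;\s*", "");

        # Nothing but whitespace left: remove the header entirely so
        # the built-in vcl_recv does not bypass the cache.
        if (req.http.Cookie ~ "^\s*$") {
            unset req.http.Cookie;
        }
    }
}
```

Since this is yet another vcl_recv definition, it needs to come after the cookie-stripping one to take effect, for example at the end of catch-all.vcl.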

Next, we want to put everything together.

Update default.vcl in the following way:


vcl 4.0;
...
sub vcl_recv {
        ....
        # Normalize the header, remove the www and port 
        set req.http.host = regsub(req.http.host, "^www\.", "");
        set req.http.host = regsub(req.http.host, ":[0-9]+", "");

}
...
# at the very bottom:
include "all-vhosts.vcl";
include "catch-all.vcl";

Create the all-vhosts.vcl file. It should contain:

include "sites-enabled/a.example.com.vcl";
include "sites-enabled/b.example.com.vcl";
include "sites-enabled/c.example.com.vcl";

Now we can reload Varnish by running service varnish reload. Varnish will handle each website in its own specific way. Our main VCL file is no longer cluttered with dozens of if statements, and we can always disable the special handling for a site by commenting out its include in all-vhosts.vcl and reloading again.
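
For instance, switching off the special handling for b.example.com could look like the following sketch of all-vhosts.vcl (a fragment that is included from default.vcl, not a standalone VCL file):

```vcl
include "sites-enabled/a.example.com.vcl";
# Disabled: b.example.com now gets only the common rules.
# Its "trac" backend definition lives in the same file, so it
# disappears together with the vcl_recv that referenced it.
# include "sites-enabled/b.example.com.vcl";
include "sites-enabled/c.example.com.vcl";
```

Keeping the backend definition in the same file as the rules that use it is what makes this safe: nothing else in the configuration is left referring to a backend that no longer exists.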

The basic rules of placing VCL logic this way are the following:

You can start with the following sample configuration. Feel free to fork or send pull requests.
