Varnish GeoIP

Building a website with GeoIP features is useful for many reasons. You can pre-select a user’s currency or language, enforce access restrictions, and so on. Most importantly, you can improve the conversion rate of your e-commerce website by a large margin.

In this post, I’m going to cover the Varnish GeoIP 2 module, which allows you to extend your Varnish with GeoIP functions. And quite a bit more…

Varnish and GeoIP

How and where you implement GeoIP in your Varnish-powered web stack mostly Vary-es, like all things Varnish 🙂

To serve different content for different geo-locations while keeping URLs the same, you vary your cache by some geo-parameter, such as the country code. Let’s call these geo-variations. You almost always want to normalize the geo-parameter’s value to keep the cache efficient.

To present users from different locations with different URLs, you do just that: redirect them to different URLs depending on their location. Let’s refer to these as geo-redirects from here on.

If you have an nginx sandwich with Varnish

This kind of setup means using nginx for TLS termination in front of Varnish, with a (possibly separate) nginx instance as your backend. Then, you can simply load nginx-module-geoip2 in your “TLS nginx” and configure it in this way:

This assumes that the TLS termination is configured within nginx’s http {...} context (HTTP proxy).

You can also do TLS termination in nginx using stream {...}, but that costs you HTTP/2 support: with the stream module, nginx is unable to negotiate ALPN protocols (as of yet).

If you use Hitch with Varnish

As just mentioned, nginx TLS termination with the stream module results in HTTP/1.1 only, because it cannot negotiate the ALPN protocol. That is a pity, because a plain TCP stream would be more efficient: it does not have to look inside the HTTP data stream and needlessly inspect HTTP headers and the like.

Meet Hitch. It does not have this downside:

Since Hitch is good at TLS termination and nothing more, this is when you’ll extend Varnish with GeoIP features!

In this setup, you would load GeoIP 2 VMOD and:

To be fair, you can use the Varnish GeoIP 2 VMOD with the nginx sandwich setup as well, if only because you can code quite sophisticated logic in VCL rather than in nginx configuration.

So when you use TLS termination software like Hitch (which cannot, and should not, handle any HTTP semantics) and you want to leverage geolocation data in your app, you absolutely want to empower your Varnish with GeoIP capabilities.
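One caveat if you use the VMOD in the nginx sandwich: unless the PROXY protocol is in use, Varnish sees the TLS nginx as its client, so client.ip is the address of nginx rather than of the visitor. Below is a minimal sketch of a lookup that works in that case. It assumes (this is not part of the setup described above) that the TLS nginx passes the visitor's address in an X-Real-IP header, and it uses the VMOD initialization and .mmdb path explained later in this post:

```vcl
vcl 4.1;

import std;
import geoip2;

backend default {
    .host = "127.0.0.1";
    .port = "8080";
}

sub vcl_init {
    new country = geoip2.geoip2("/usr/share/GeoIP/GeoLite2-Country.mmdb");
}

sub vcl_recv {
    # Assumption: the TLS nginx sets X-Real-IP to the visitor's address.
    # std.ip() converts the header value to an IP and falls back to
    # 0.0.0.0 when the header is missing or malformed.
    set req.http.X-Country-Code = country.lookup("country/iso_code", std.ip(req.http.X-Real-IP, "0.0.0.0"));
}
```

Only trust such a header when it is set by your own proxy; a header that visitors can send themselves would let them pick their own country.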

Install Varnish GeoIP 2 VMOD in CentOS/RHEL 6, 7, 8 or Amazon Linux 2

There are 2 GeoIP VMODs available at present: one by Varnish Software, which uses the now-legacy .dat file format, and one by Federico G. Schwindt, which supports the newer .mmdb files.

The .dat format is no longer receiving free data updates, so naturally, we want the VMOD with support for the newer format 🙂

So let’s get things rolling by installing Varnish 6.0 LTS with everything we need.

The first step is to set up our repository:

sudo yum -y install https://extras.getpagespeed.com/release-latest.rpm

This repository will give access to a more recent GeoIP data update program, geoipupdate. It is capable of updating .mmdb files from MaxMind servers.

CentOS/RHEL 6 and 7 default to Varnish 4, which is now End of Life. For these systems, enable the repository with Varnish 6.0 LTS and its modules:

sudo yum-config-manager --enable getpagespeed-extras-varnish60

A note about Varnish 6.0 LTS repository by GetPageSpeed

So I’ve built a little YUM repository … 🙂

With Varnish 6.0.x becoming the de facto LTS version of Varnish, and with the vast array of VMODs I wanted to try and use in production, building this repo was something I was itching to do.

Big thanks go towards Ingvar Hagelund as his COPR repository and packaging efforts are at the base of building our own Varnish 6 LTS repository.

If you want a repository that includes many VMODs and is powered by CDN – by all means, use my repo! 😀

Done promoting my repository. Let’s proceed to install GeoIP stuff:

sudo yum install varnish vmod-geoip2 geoipupdate-cron

This will install Varnish, the GeoIP 2 VMOD, and a cron job that keeps the GeoIP databases up to date.

GeoIP.conf

Ensure that the configuration file for updating GeoIP databases, /etc/GeoIP.conf, is present:

AccountID YOUR_ACCOUNT_ID_HERE
LicenseKey YOUR_LICENSE_KEY_HERE
EditionIDs GeoLite2-Country GeoLite2-City

If you’re using commercial databases, you will adjust the EditionIDs appropriately.

Now you can run geoipupdate once to do the initial download of the GeoIP databases. And voilà, a few seconds later you will have the files GeoLite2-City.mmdb and GeoLite2-Country.mmdb in your /usr/share/GeoIP/ directory.

The cron job that we installed earlier will make sure that the database files are updated weekly.

Getting started with GeoIP 2 VMOD

What you do with geolocation depends on your application requirements. But let’s check the basics of how to initialize the GeoIP 2 VMOD. In your VCL file you need to initialize it with:

import geoip2;
sub vcl_init {
  new country = geoip2.geoip2("/usr/share/GeoIP/GeoLite2-Country.mmdb");
}

Then you will be able to make use of GeoIP data further in your VCL logic.
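For instance, the geo-redirects mentioned earlier could be sketched roughly as follows. This is only an illustration: the /fr/ target URL, the choice to redirect just the home page, and the internal status code 752 are all assumptions made for the example, not something this setup prescribes.

```vcl
vcl 4.1;

import geoip2;

backend default {
    .host = "127.0.0.1";
    .port = "8080";
}

sub vcl_init {
    new country = geoip2.geoip2("/usr/share/GeoIP/GeoLite2-Country.mmdb");
}

sub vcl_recv {
    # Hypothetical rule: send French visitors of the home page to /fr/.
    # 752 is an arbitrary internal status that carries the target URL
    # to vcl_synth in the reason phrase.
    if (req.url == "/" && country.lookup("country/iso_code", client.ip) == "FR") {
        return (synth(752, "/fr/"));
    }
}

sub vcl_synth {
    # Turn the internal status into a real temporary redirect
    if (resp.status == 752) {
        set resp.http.Location = resp.reason;
        set resp.status = 302;
        set resp.reason = "Found";
        return (deliver);
    }
}
```

Because the redirect is answered by Varnish itself, such requests never reach the backend, and the redirected URL can then be cached like any other page.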

Geo-variations

sub vcl_recv {
    set req.http.X-Country-Code = country.lookup("country/iso_code", client.ip);
    ...
}

This would make Varnish send the X-Country-Code HTTP header to your backend.

For instance, in PHP you can read the visitor’s country code from $_SERVER["HTTP_X_COUNTRY_CODE"].
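To check what Varnish computed without touching the backend, one option is to echo the header back in the response. This is a small debugging sketch, assuming the vcl_recv above has already set X-Country-Code; it is not something the setup itself requires:

```vcl
sub vcl_deliver {
    # Debugging aid only: expose the country code that vcl_recv
    # computed, so it shows up in the response headers.
    set resp.http.X-Country-Code = req.http.X-Country-Code;
}
```

You would likely want to remove this again once the lookup is confirmed to work.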

Efficient geo-variations. Normalization

Simply creating different, highly customized GeoIP page content, for all countries in the world, is probably not a feasible task. If you blindly vary cache for every value of X-Country-Code, this will unnecessarily create duplicate cached data and reduce your cache hit-rate.

Suppose that we actually handcraft our pages to display differently for only 3 “target” countries: the United States, Russia and France (country codes US, RU, FR). For any other country, we want to present the US version. Knowing which countries we really vary cache for will allow us to partition cache efficiently, thus increasing cache hit-rate:

...
# additionally import std for `tolower` function
import std;
...
sub vcl_recv {
    set req.http.X-Country-Code = country.lookup("country/iso_code", client.ip);
    # Normalize country code to lower case
    set req.http.X-Country-Code = std.tolower(req.http.X-Country-Code);
    # Anything other than a target country gets the US version
    if (req.http.X-Country-Code !~ "^(us|ru|fr)$") {
      set req.http.X-Country-Code = "us";
    }
}

Vary wisely!

With geo-variations, we want different page content for different countries on the same URL. So of course, we have to teach our Varnish to partition cache by the country code.

There are 2 approaches here, and they depend on your needs.

Option 1. Hashing

First, there is hashing, which creates a separate cache object for each country code you want to vary page contents for. This makes it easy to target purging at the pages of specific countries. E.g. you have updated the French variant of your page and you want to clear only that variant. Then you can send the X-Country-Code: fr header with the PURGE request, and only that variant will be cleared.

So, to be able to clear individual geo-variations easily, you may want to use hashing to partition your Varnish cache:

sub vcl_hash {
  ...
  hash_data(req.http.X-Country-Code);
  ...
}
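Put together, the hashing option with targeted purging might look like the sketch below. It reuses the purgers ACL, the backend and the three target countries that appear elsewhere in this post; treat it as an illustration under those assumptions rather than a drop-in configuration.

```vcl
vcl 4.1;

import std;
import geoip2;

backend default {
    .host = "127.0.0.1";
    .port = "8080";
}

acl purgers { "127.0.0.1"; }

sub vcl_init {
    new country = geoip2.geoip2("/usr/share/GeoIP/GeoLite2-Country.mmdb");
}

sub vcl_recv {
    if (req.method == "PURGE") {
        if (client.ip !~ purgers) {
            return (synth(405, "Purging not allowed for " + client.ip));
        }
        # Keep the X-Country-Code sent by the app: vcl_hash below uses
        # it to select the single variant to purge.
        return (purge);
    }
    set req.http.X-Country-Code = std.tolower(country.lookup("country/iso_code", client.ip));
    if (req.http.X-Country-Code !~ "^(us|ru|fr)$") {
        set req.http.X-Country-Code = "us";
    }
}

sub vcl_hash {
    # Add the country code to the hash. There is no return here, so the
    # built-in vcl_hash still runs and adds the URL and Host as usual.
    hash_data(req.http.X-Country-Code);
}
```

A PURGE request carrying X-Country-Code: fr then removes only the French variant of that URL. A PURGE without the header hashes to an object that does not exist, so clearing all variants takes one request per country.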

Option 2. Vary header

A different approach to having multiple page variants on the same URL is to use the Vary header. The big upside here is one cached object per page, with multiple variants of it in Varnish. It is easy to purge such an object in its entirety, that is, with all its variants. Just PURGE it 🙂

So, to recap: you don’t get to use both approaches at the same time. Pick only one, so “choose your destiny”, as each approach has its specifics:

OK, if I did not make myself clear yet, you really should always use Vary! 🙂 It provides for a flawless victory (“MK”): you can easily purge all variants (which may be cumbersome with hashing) or a specific variant.

The Vary approach boils down to this VCL:

# The backend creates content based on the normalized X-Country-Code:
sub vcl_backend_response {
    if (bereq.http.X-Country-Code) {
        if (!beresp.http.Vary) { # no Vary at all
            set beresp.http.Vary = "X-Country-Code";
        } elsif (beresp.http.Vary !~ "X-Country-Code") { # add to existing Vary
            set beresp.http.Vary = beresp.http.Vary + ", X-Country-Code";
        }
    }
}

And our complete VCL file may look like:

vcl 4.1;

import std;
import geoip2;

backend default {
    .host = "127.0.0.1";
    .port = "8080";
}

acl purgers { "127.0.0.1"; }

sub vcl_init {
  new country = geoip2.geoip2("/usr/share/GeoIP/GeoLite2-Country.mmdb");
}

sub vcl_recv {
    if (req.method == "PURGE") {
        if (client.ip !~ purgers) {
            return (synth(405, "Purging not allowed for " + client.ip));
        }
        # Our app supplied X-Country-Code to indicate clearing of a specific geo-variation
        if (req.http.X-Country-Code) {
            # Re-fetch just this variant from the backend
            set req.method = "GET";
            set req.hash_always_miss = true;
        } else {
            # clear all geo-variants of this page
            return (purge);
        }
    } else {
        set req.http.X-Country-Code = country.lookup("country/iso_code", client.ip);
        # Normalize country code to lower case
        set req.http.X-Country-Code = std.tolower(req.http.X-Country-Code);
        # Anything other than a target country gets the US version
        if (req.http.X-Country-Code !~ "^(us|ru|fr)$") {
            set req.http.X-Country-Code = "us";
        }
    }
}
# The backend creates content based on the normalized X-Country-Code:
sub vcl_backend_response {
    if (bereq.http.X-Country-Code) {
        if (!beresp.http.Vary) { # no Vary at all
            set beresp.http.Vary = "X-Country-Code";
        } elsif (beresp.http.Vary !~ "X-Country-Code") { # add to existing Vary
            set beresp.http.Vary = beresp.http.Vary + ", X-Country-Code";
        }
    }
}

So when we receive a PURGE request, we first check whether our app supplied the country code.
If it did, we refresh the cache for just that country using req.hash_always_miss; otherwise, return (purge) purges all country variations.

And if it’s not a PURGE request, we use the GeoIP 2 VMOD to set the country code for use in our backend.

P.S. Another geo-related use case for geo-variants is a CDN of Varnish servers. Provided you know the geo-location of each of your Varnish CDN edge servers, you can configure each of them to send the proper X-Country-Code. Naturally, you would not need any GeoIP VMOD for that, because by the time traffic reaches a particular Varnish server instance, it has already been GeoIP-directed (by something like Route53 DNS).
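On such an edge server, the lookup shrinks to a constant. A minimal sketch for a hypothetical edge located in France (the country value is an assumption for the example and would differ per server):

```vcl
sub vcl_recv {
    # Hypothetical French edge: no GeoIP lookup is needed, because DNS
    # has already routed nearby visitors to this server.
    set req.http.X-Country-Code = "fr";
}
```

The Vary or hashing logic stays the same as before; only the source of the country code changes.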
