The Jetpack Cache Problem: How It Destroys Your CDN

Every WordPress site running Jetpack is silently hemorrhaging cache efficiency. Your CDN, your Varnish, your edge cache — all of them are storing duplicate entries for the same content, serving cache misses where there should be hits.

The culprit? A single line of PHP that violates HTTP standards and was never caught by Automattic’s code review.

The Discovery

While investigating HTTP response headers on a WordPress site, I noticed something peculiar:

Vary: Accept-Encoding, accept, content-type, Accept-Encoding

Two problems immediately stand out:

  1. Duplicate Accept-Encoding — indicating multiple layers adding the same header
  2. content-type in a Vary header — which is semantically wrong

The Vary header tells caches which request headers affect the response content. Each distinct value of a listed request header gets its own cache entry.

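But Content-Type describes a message body. The GET requests that caches store carry no body, so browsers do not send it; in practice it is a response header, not a request header. Including it in Vary contradicts the definition in RFC 7231 Section 7.1.4: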

The “Vary” header field in a response describes what parts of a request message, aside from the method, Host header field, and request target, might influence the origin server’s process for selecting and representing this response.
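To make the two problems concrete, here is a minimal Python sketch that audits a raw Vary value for duplicates and for fields that cannot select anything. The `audit_vary` helper and the `SUSPECT_FIELDS` list are hypothetical names made up for this illustration; they are not part of Jetpack or of any HTTP library, and the list is deliberately incomplete.

```python
# Illustrative, non-exhaustive list (an assumption of this sketch): fields that
# ordinary GET requests do not carry, so varying on them selects nothing.
SUSPECT_FIELDS = {"content-type", "content-length", "etag", "last-modified"}


def audit_vary(value):
    """Return (unique_fields, duplicates, suspects) for a raw Vary header value."""
    seen, duplicates, suspects = [], [], []
    for raw in value.split(","):
        name = raw.strip().lower()  # header field names are case-insensitive
        if not name:
            continue
        if name in seen:
            if name not in duplicates:
                duplicates.append(name)
            continue
        seen.append(name)
        if name in SUSPECT_FIELDS:
            suspects.append(name)
    return seen, duplicates, suspects


fields, duplicates, suspects = audit_vary(
    "Accept-Encoding, accept, content-type, Accept-Encoding"
)
print(fields)      # ['accept-encoding', 'accept', 'content-type']
print(duplicates)  # ['accept-encoding']
print(suspects)    # ['content-type']
```

Run against the header from this site, it reports the duplicated Accept-Encoding and flags content-type, the same two problems listed above.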

This is HTTP 101. Any junior developer who’s read the spec would catch this. Any AI assistant would flag it immediately:

“Including Content-Type in a Vary header is incorrect. Content-Type is a response header that describes the format of the response body. The Vary header should only reference request headers like Accept, Accept-Encoding, or Accept-Language.”

Yet this code shipped in Jetpack, was reviewed by Automattic engineers, and has been running on millions of WordPress sites since May 2023.

The Offending Code

In jetpack/jetpack_vendor/automattic/jetpack-status/src/class-request.php, line 83:

$vary_header_parts = array( 'accept', 'content-type' );

This line lives in the Request::is_frontend() method, which is called from over 20 places throughout Jetpack: Form blocks, Related Posts, Likes, Sharing buttons, Podcast Player.

Every single frontend page load on a Jetpack-enabled site gets Vary: accept, content-type injected into its response headers. This Jetpack behavior destroys cache efficiency on millions of sites.

Why This Exists: The ActivityPub Saga

The Vary headers were added to address a legitimate problem: cache pollution between HTML and JSON responses.

When WordPress sites enable ActivityPub (the protocol powering Mastodon and the Fediverse), the same URL must serve different content based on the Accept header:

  • Browsers send Accept: text/html → get HTML page
  • Mastodon servers send Accept: application/activity+json → get JSON-LD

Without proper cache handling, a cached HTML response could be served to Mastodon (broken federation), or cached JSON could be served to browsers (broken website).

The fix? Tell caches to store separate entries based on the Accept header:

Vary: Accept

Simple, correct, and well-established HTTP semantics.
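The pollution problem and the effect of Vary: Accept can be sketched with a toy cache in Python. Everything here is hypothetical: `origin` stands in for WordPress, `TinyCache` is a deliberately simplified model rather than how Varnish or any CDN is implemented, and the response bodies are placeholders.

```python
def origin(accept):
    """Stand-in origin: JSON for ActivityPub clients, HTML for everyone else."""
    if "application/activity+json" in accept:
        return "application/activity+json", '{"type": "Person"}'
    return "text/html", "<html>profile</html>"


class TinyCache:
    """Toy cache keyed by URL, optionally extended by the request's Accept value."""

    def __init__(self, vary_on_accept):
        self.vary_on_accept = vary_on_accept
        self.store = {}

    def get(self, url, accept):
        # With Vary: Accept, the request's Accept value becomes part of the key.
        key = (url, accept) if self.vary_on_accept else (url,)
        if key not in self.store:
            self.store[key] = origin(accept)
        return self.store[key]


for vary in (False, True):
    cache = TinyCache(vary_on_accept=vary)
    cache.get("/author/username", "text/html")  # a browser warms the cache
    content_type, _ = cache.get("/author/username", "application/activity+json")
    print(vary, content_type)
# False text/html                  <- Mastodon receives the cached HTML page
# True application/activity+json   <- separate entry, correct representation
```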

Jetpack’s Implementation: Maximum Incompetence

Instead of implementing this correctly, Jetpack’s developers did the following:

Mistake #1: Added a Response Header to Vary

$vary_header_parts = array( 'accept', 'content-type' );

The content-type here provides zero caching benefit and demonstrates a fundamental misunderstanding of HTTP. It’s cargo cult programming — “I’ve seen these headers together somewhere, so they must belong together.”

Mistake #2: Vary on Full Accept Header

The accept header seems correct, but it’s implemented poorly. Different browsers send different Accept headers:

Chrome:  text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8
Firefox: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,*/*;q=0.8
Safari:  text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
curl:    */*

With Vary: Accept, your Varnish/CDN cache stores separate entries for each browser — even though all of them want HTML!
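A rough Python sketch of that fragmentation, using the four Accept values listed above. The request stream is made up for illustration and is not a measurement from a real site; `hits` is a hypothetical helper that models a cache comparing the raw header value.

```python
ACCEPT_BY_CLIENT = {
    "Chrome": "text/html,application/xhtml+xml,application/xml;q=0.9,"
              "image/avif,image/webp,image/apng,*/*;q=0.8",
    "Firefox": "text/html,application/xhtml+xml,application/xml;q=0.9,"
               "image/avif,image/webp,*/*;q=0.8",
    "Safari": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
    "curl": "*/*",
}


def hits(requests, vary_on_accept):
    """Count cache hits for one URL requested by the given sequence of clients."""
    seen, hit_count = set(), 0
    for client in requests:
        # Under Vary: Accept the raw header value selects the cache entry.
        key = ACCEPT_BY_CLIENT[client] if vary_on_accept else "single-entry"
        if key in seen:
            hit_count += 1
        else:
            seen.add(key)
    return hit_count


# Hypothetical stream: each client requests the same page twice.
stream = ["Chrome", "Firefox", "Safari", "curl"] * 2
print(hits(stream, vary_on_accept=True))   # 4 hits out of 8 requests
print(hits(stream, vary_on_accept=False))  # 7 hits out of 8 requests
```

The first visit from each browser is a miss even though the page is already cached for another browser.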

My rough estimate of the cache hit rate reduction: 30-50% on pages that should be highly cacheable, depending on how varied your visitors’ browsers are.

The correct approach would be to not use Vary: Accept at all. Instead:

  • Use different URLs for ActivityPub (e.g., /author/username?format=activitypub)
  • Or handle normalization at the cache layer (Varnish/NGINX can normalize Accept headers before lookup)
  • Or simply don’t cache ActivityPub responses (they’re low-traffic requests from Mastodon servers)

PHP can’t help here: once Vary: Accept is in the response, every cache in front of PHP will fragment on the raw header value, regardless of what PHP does internally.
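The normalization option can be sketched as follows. In production this logic would live in VCL or NGINX configuration; Python is used here only to show the idea. `normalize_accept` is a hypothetical helper, and the two-bucket split assumes the only non-HTML representation is the application/activity+json type mentioned earlier.

```python
ACTIVITYPUB_TYPE = "application/activity+json"


def normalize_accept(accept):
    """Collapse a raw Accept value into one of two buckets before cache lookup."""
    # Drop parameters such as ;q=0.9 and compare media types case-insensitively.
    media_types = [part.split(";")[0].strip().lower() for part in accept.split(",")]
    if ACTIVITYPUB_TYPE in media_types:
        return ACTIVITYPUB_TYPE
    return "text/html"  # browsers, curl's */*, and anything else share one entry


browser_accepts = [
    "text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8",
    "text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,*/*;q=0.8",
    "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
    "*/*",
]
print({normalize_accept(value) for value in browser_accepts})  # {'text/html'}
print(normalize_accept("application/activity+json"))  # application/activity+json
```

If the cache looks up on the normalized value instead of the raw one, all four clients share a single HTML entry while ActivityPub requests still get their own.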

Mistake #3: Applied to ALL Sites, Not Just ActivityPub Users

Here’s the most absurd part: these Vary headers are sent on every Jetpack site, whether or not ActivityPub is enabled.

My site doesn’t use ActivityPub. I have no Fediverse integration. Yet Jetpack is fragmenting my cache to support a feature I don’t use.

There’s no filter to disable it. The headers are sent before any WordPress filter runs:

// Headers sent on line 57
header( 'Vary: ' . implode( ', ', $vary_header_parts ) );

// Filter runs on line 69 — too late!
return (bool) apply_filters( 'jetpack_is_frontend', $is_frontend, $send_vary_headers );

The Pattern: Jetpack’s History of Cache Destruction

This isn’t an isolated incident. Jetpack has a long history of cache-hostile behavior.

WP Super Cache + ActivityPub Conflict (2023)

Issue #33968 documented how WP Super Cache served cached HTML to Fediverse clients requesting JSON. The “fix” was to add Vary headers everywhere — the nuclear option that impacts all users to solve a niche use case.

External Request Blocking

Jetpack’s stats module makes external requests to pixel.wp.com on every admin page load. Users report seeing “Waiting for pixel.wp.com” in their browser, with admin pages taking 3+ seconds to load.

The module phones home even when you’re just trying to edit a post. Every. Single. Page. Load.

Cron Job Bloat

Jetpack registers numerous wp-cron jobs (jetpack_sync_cron, jetpack_v2_heartbeat, etc.) that can bloat the cron table and consume server resources. Sites have reported hundreds of orphaned Jetpack cron events after years of use.

The Photon “Optimization”

Jetpack’s Site Accelerator (Photon) routes your images through WordPress.com’s CDN. Sounds helpful until:

  • Your images are now dependent on WordPress.com’s uptime
  • The CDN introduces latency for visitors geographically far from their edge nodes
  • You lose control over cache headers and image optimization parameters
  • Some security plugins block Photon’s user-agent, breaking your images silently

The AI Era Irony

We’re in 2026. Large language models can write code, review pull requests, and catch bugs. Every major AI assistant — ChatGPT, Claude, Gemini — knows HTTP semantics better than the developers who shipped this code.

A simple prompt catches the bug instantly:

“Review this PHP code for HTTP standards compliance:
$vary_header_parts = array( 'accept', 'content-type' );”

Any AI responds:

“Content-Type should not be included in the Vary header. Vary specifies which request headers affect the response. Content-Type is a response header describing what the server sends back, not what the client requested. This violates RFC 7231.”

Automattic is a billion-dollar company. They employ hundreds of engineers. They have access to every AI tool. They run one of the most popular WordPress plugins in existence.

And yet, basic HTTP violations ship to millions of sites and persist for years.

This isn’t a “skill issue.” It’s a process issue. It’s a culture that prioritizes shipping features over correctness, that treats web standards as suggestions, and that apparently never runs responses through a simple standards compliance check.

The Fix: What You Can Do

Since Jetpack doesn’t provide a filter to disable these headers, you have to strip them yourself to restore your cache efficiency.

Option 1: NGINX

In your NGINX server configuration, in the location that proxies to your backend:

location / {
    proxy_pass http://backend;

    # Strip Jetpack's cache-killing Vary header
    proxy_hide_header Vary;
    add_header Vary "Accept-Encoding" always;
}

Or if using FastCGI directly:

fastcgi_hide_header Vary;
add_header Vary "Accept-Encoding" always;

Option 2: Varnish VCL

If you’re running Varnish as a caching layer:

sub vcl_backend_response {
    # Normalize Vary header
    if (beresp.http.Vary) {
        set beresp.http.Vary = "Accept-Encoding";
    }
}

Option 3: Apache

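# PHP's headers can sit in either of Apache's two header tables, so unset Vary in both
Header unset Vary
Header always unset Vary
Header always set Vary "Accept-Encoding"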

Option 4: PHP (Hacky)

Create a must-use plugin at wp-content/mu-plugins/fix-vary-header.php:

<?php
add_action('send_headers', function() {
    header_remove('Vary');
    header('Vary: Accept-Encoding');
}, PHP_INT_MAX);

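This runs at the highest priority on send_headers, overwriting whatever garbage Jetpack added up to that point. If a Jetpack module calls Request::is_frontend() later in the request, its header() call replaces yours again, which is why the server-level options above are more reliable.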

The Proper Fix: Upstream

I’ve submitted a pull request to Jetpack to remove the incorrect content-type from the Vary header. It’s a minimal fix — just removing the RFC violation.

A complete fix would:

  1. Remove content-type entirely (done in my PR)
  2. Add a filter to disable Vary headers for sites not using ActivityPub
  3. Use different URLs for ActivityPub responses instead of content negotiation

Whether Automattic accepts the PR or lets it rot for months is another matter.

Conclusion: Trust, But Verify

Jetpack is installed on over 5 million WordPress sites. It’s the official plugin from Automattic, the company behind WordPress.com. You’d expect enterprise-grade code quality.

Instead, you get:

  • HTTP standard violations that any AI could catch
  • Performance-destroying defaults with no opt-out
  • Cache fragmentation affecting every site, not just those needing the feature
  • External dependencies that slow your admin and threaten uptime

The lesson? Trust no plugin blindly. Monitor your response headers. Check your cache hit rates. Profile your admin performance.

And maybe — just maybe — the era of “install Jetpack for everything” should end. There are alternatives for stats (Plausible, Fathom, self-hosted Matomo). There are better CDNs than Photon. There are simpler solutions than a 200MB monolith that violates HTTP specs.

Your cache efficiency returns when you take control of your headers.


Have you noticed Jetpack impacting your site’s performance or caching? Share your experience in the comments below.