NGINX Query String Normalization with sorted-args Module

NGINX query string normalization solves cache fragmentation – the silent performance killer that plagues high-traffic websites. When users, APIs, or frameworks send identical parameters in different orders, your cache treats them as separate requests. The sorted-args module fixes this problem at the edge, dramatically improving cache efficiency.

The Query String Order Problem

Consider these three URLs requesting the same content:

/api/products?category=electronics&sort=price&page=1
/api/products?page=1&category=electronics&sort=price
/api/products?sort=price&page=1&category=electronics

Without NGINX query string normalization, each URL creates a separate cache entry. This wastes storage, reduces hit rates, and forces unnecessary backend processing. The problem compounds quickly with tracking parameters like UTM codes, timestamps, and cache-busters.

The sorted-args module provides the $sorted_args variable, which holds a normalized copy of the query string. For all three URLs above, it evaluates to category=electronics&page=1&sort=price. Now they share one cache entry, eliminating redundant backend requests.

Why Parameter Order Matters for Caching

Modern web applications face parameter order chaos from multiple sources:

  • JavaScript frameworks construct URLs with unpredictable parameter ordering
  • Social media platforms append tracking codes when users share links
  • Analytics tools add UTM parameters in varying sequences
  • API clients may serialize objects differently each time
  • Browser extensions inject additional parameters
  • CDN edge nodes may reorder parameters during processing

Each variation creates a new cache entry for identical content. A single product page could have dozens or even hundreds of cache entries. NGINX query string normalization eliminates this waste entirely, consolidating all variations into one cached response.
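
To put a number on "dozens or even hundreds", here is a small arithmetic sketch. It assumes each URL carries n distinct parameters, each appearing once, and that clients may send them in any order. The function name is illustrative only.

```python
from math import factorial

def raw_variants(param_count):
    """Number of distinct orderings of param_count unique parameters."""
    return factorial(param_count)

# Each ordering is a separate cache entry when the raw query string is the key.
for n in range(2, 7):
    print(n, "parameters ->", raw_variants(n), "possible cache entries")
```

Three parameters give 6 orderings, five give 120. With a sorted key, every ordering maps to the same single entry.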

How the sorted-args Module Works

The module normalizes the query string in six steps:

  1. Parameter Extraction – Parses the query string on & and = delimiters
  2. Natural Sorting – Sorts alphabetically using natural order (item2 before item10)
  3. Empty Value Stripping – Drops parameters whose value is empty (?a=&b=2 becomes b=2)
  4. Filtering – Applies allowlist or blocklist patterns to include or exclude parameters
  5. Deduplication – Keeps first or last occurrence of duplicate keys (in sorted order)
  6. Reconstruction – Joins parameters into a clean normalized string

Processing happens once per request. Results are cached in the request context. This minimizes overhead when $sorted_args appears multiple times in your configuration.
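
The following Python sketch mirrors the extraction, empty-value stripping, sorting, and reconstruction steps described above. It is an approximation for illustration, not the module's C implementation. Filtering and deduplication are left out here. Breaking ties between equal names by value is an assumption inferred from the dedupe example later in this article.

```python
import re

def natural_key(text):
    """Case-insensitive natural-order key: digit runs compare as integers."""
    parts = re.split(r"([0-9]+)", text.lower())
    # re.split with a capture group puts digit runs at odd indexes.
    return [(0, int(p)) if i % 2 else (1, p) for i, p in enumerate(parts) if p]

def normalize(query):
    """Parse on & and =, drop empty values, sort naturally, and rejoin."""
    kept = []
    for item in query.split("&"):
        name, sep, value = item.partition("=")
        if not item or (sep and not value):
            continue  # skip empty segments and "name=" with no value
        # Assumption: equal names are ordered by value.
        kept.append((natural_key(name), natural_key(value), item))
    kept.sort(key=lambda entry: entry[:2])
    return "&".join(item for _, _, item in kept)

print(normalize("z=3&a=1&m=2"))               # a=1&m=2&z=3
print(normalize("item10=a&item2=b&item1=c"))  # item1=c&item2=b&item10=a
print(normalize("debug&a=&b=2"))              # b=2&debug
```

A plain string sort would place item10 before item2. The natural key compares the digit runs as numbers instead.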

Installation

RHEL, CentOS, AlmaLinux, Rocky Linux

Install from the GetPageSpeed repository:

sudo dnf install https://extras.getpagespeed.com/release-latest.rpm
sudo dnf install nginx-module-sorted-args

Load the module in /etc/nginx/nginx.conf:

load_module modules/ngx_http_sorted_args_module.so;

See the sorted-args module page for version details and changelog.

Configuration Directives

The $sorted_args Variable

The $sorted_args variable contains the normalized query string. Use it anywhere NGINX accepts variables:

# In cache keys
proxy_cache_key "$scheme$host$uri$sorted_args";

# In logging
log_format detailed '$request_uri sorted_args="$sorted_args"';

# In proxy requests
proxy_pass http://backend$uri?$sorted_args;

Behavior:
– Empty queries return empty strings
– Flags like ?debug stay as debug
– Empty values like ?debug= get stripped
– Sorting is case-insensitive for parameter names
– Natural sort puts p1, p2, p10 in correct numerical order

sorted_args_ignore_list

Syntax: sorted_args_ignore_list pattern [pattern ...];
Default: none
Context: http, server, location, if

Excludes parameters from $sorted_args. Use it to remove cache-busters, tracking codes, and session identifiers that should not affect caching.

Pattern Types:
name – exact match (case-insensitive)
name* – prefix match (matches name_anything)
*name – suffix match (matches anything_name)
*name* – contains match (matches parameters containing name)

location /api {
    sorted_args_ignore_list timestamp _ t utm_* fbclid gclid;

    proxy_cache api_cache;
    proxy_cache_key "$uri$sorted_args";
    proxy_pass http://backend;
}

A request to /api?user=123&timestamp=1234567890&utm_source=google outputs user=123.
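
The four pattern types can be modelled in a few lines of Python. This is a sketch of the matching rules as documented above, with hypothetical function names. Sorting is omitted so that the filtering stands out.

```python
def matches(pattern, name):
    """Case-insensitive match for name, name*, *name, and *name* patterns."""
    p, n = pattern.lower(), name.lower()
    if len(p) > 1 and p.startswith("*") and p.endswith("*"):
        return p[1:-1] in n           # contains match
    if p.endswith("*"):
        return n.startswith(p[:-1])   # prefix match
    if p.startswith("*"):
        return n.endswith(p[1:])      # suffix match
    return n == p                     # exact match

def apply_ignore_list(query, patterns):
    """Drop every parameter whose name matches any ignore pattern."""
    kept = []
    for item in query.split("&"):
        name = item.partition("=")[0]
        if not any(matches(p, name) for p in patterns):
            kept.append(item)
    return "&".join(kept)

ignore = ["timestamp", "_", "t", "utm_*", "fbclid", "gclid"]
print(apply_ignore_list("user=123&timestamp=1234567890&utm_source=google", ignore))
# user=123
```

Note that the exact pattern t does not remove timestamp. Only the exact pattern timestamp does.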

sorted_args_allow_list

Syntax: sorted_args_allow_list pattern [pattern ...];
Default: none
Context: http, server, location, if

Keeps only parameters matching patterns. All others are dropped. This provides strict control over cache keys.

location /search {
    sorted_args_allow_list q page limit sort category;

    proxy_cache search_cache;
    proxy_cache_key "$uri$sorted_args";
    proxy_pass http://search_backend;
}

A request to /search?q=nginx&page=1&debug=true&nocache=1 outputs page=1&q=nginx.

Combining Allowlist and Blocklist:

Both directives work together for fine-grained control. Allowlist applies first, then blocklist:

location /api {
    sorted_args_allow_list user_id action page timestamp;
    sorted_args_ignore_list timestamp;

    proxy_cache_key "$uri$sorted_args";
    proxy_pass http://api_backend;
}
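
As a rough model of the order of operations, the sketch below applies the allowlist first and the blocklist second. It handles exact names only for brevity, and uses a plain sort as a stand-in for the module's natural sort. The sample request is made up for illustration.

```python
def filter_params(query, allow, ignore):
    """Keep parameters on the allowlist, then drop those on the blocklist."""
    allow = {name.lower() for name in allow}
    ignore = {name.lower() for name in ignore}
    kept = []
    for item in query.split("&"):
        name = item.partition("=")[0].lower()
        if allow and name not in allow:
            continue  # not allowlisted
        if name in ignore:
            continue  # allowlisted, but explicitly ignored
        kept.append(item)
    return "&".join(sorted(kept))

print(filter_params(
    "user_id=7&timestamp=99&debug=1&action=view",
    allow=["user_id", "action", "page", "timestamp"],
    ignore=["timestamp"],
))
# action=view&user_id=7
```

Here debug is dropped because it is not allowlisted. timestamp passes the allowlist but is then removed by the blocklist.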

sorted_args_overwrite

Syntax: sorted_args_overwrite on | off;
Default: off
Context: http, server, location, if

Overwrites $args during the rewrite phase. All downstream processing sees normalized args automatically. This is useful when you want all logging, proxying, and redirects to use the normalized query string.

location /api {
    sorted_args_overwrite on;
    sorted_args_ignore_list timestamp;

    proxy_pass http://backend;
}

A request to /api?z=1&a=2&timestamp=123 proxies as /api?a=2&z=1. The timestamp is removed and parameters are sorted.

sorted_args_dedupe

Syntax: sorted_args_dedupe first | last | off;
Default: off
Context: http, server, location, if

Handles duplicate parameter keys. Deduplication happens after sorting, so “first” and “last” refer to sorted order:

  • first – keep first occurrence in sorted order
  • last – keep last occurrence in sorted order
  • off – keep all occurrences (default)

location /search {
    sorted_args_dedupe first;

    proxy_cache_key "$uri$sorted_args";
    proxy_pass http://backend;
}

For /search?q=foo&q=bar&q=baz:
– Without dedupe, sorted output is q=bar&q=baz&q=foo
– first returns q=bar (first in sorted order)
– last returns q=foo (last in sorted order)
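
The three modes can be sketched in Python as follows. The input is assumed to be sorted already, as the text above describes. Treating names case-insensitively when detecting duplicates is an assumption.

```python
def dedupe(sorted_items, mode="off"):
    """Collapse duplicate keys in an already-sorted list of name=value items."""
    if mode not in ("first", "last", "off"):
        raise ValueError("mode must be first, last, or off")
    if mode == "off":
        return list(sorted_items)
    chosen = {}
    for item in sorted_items:
        name = item.partition("=")[0].lower()
        if mode == "first":
            chosen.setdefault(name, item)  # keep the earliest item per name
        else:
            chosen[name] = item            # later items replace earlier ones
    return list(chosen.values())

items = sorted("q=foo&q=bar&q=baz".split("&"))
print(dedupe(items, "off"))    # ['q=bar', 'q=baz', 'q=foo']
print(dedupe(items, "first"))  # ['q=bar']
print(dedupe(items, "last"))   # ['q=foo']
```

Because deduplication runs after sorting, first does not mean the value the client sent first (foo). It means the value that sorts first (bar).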

Real-World Configuration Examples

Basic Cache Key Normalization

Normalize cache keys regardless of parameter order:

http {
    proxy_cache_path /var/cache/nginx levels=1:2 
                     keys_zone=content_cache:100m 
                     inactive=7d max_size=10g;

    server {
        listen 80;

        location / {
            proxy_cache content_cache;
            proxy_cache_key "$scheme$host$uri$sorted_args";
            proxy_cache_valid 200 1h;
            proxy_pass http://backend;
        }
    }
}

For comprehensive caching setup, see the NGINX Proxy Cache & Microcaching Guide.

E-commerce API with Tracking Removal

Remove tracking while preserving product filters:

location /api/products {
    sorted_args_ignore_list 
        _ t timestamp v version 
        utm_* fbclid gclid msclkid 
        *_ref *_source;

    proxy_cache product_cache;
    proxy_cache_key "$uri$sorted_args";
    proxy_cache_valid 200 5m;
    proxy_pass http://product_api;
}

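This configuration strips common tracking parameters from the cache key: UTM codes (utm_*), the Google, Facebook, and Microsoft click IDs (gclid, fbclid, msclkid), and any parameter ending in _ref or _source. Product filters such as category or sort remain part of the key, so requests that differ only in tracking parameters share one cache entry.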

Search API with Strict Control

Allow only specific parameters for search caching:

location /api/search {
    sorted_args_allow_list q query search page limit offset sort order;
    sorted_args_ignore_list debug verbose trace;

    proxy_cache search_cache;
    proxy_cache_key "$uri$sorted_args";
    proxy_cache_valid 200 30s;
    proxy_pass http://search_service;
}

CDN Origin Configuration

Use overwrite mode for CDN compatibility:

location /static {
    sorted_args_overwrite on;
    sorted_args_ignore_list 
        _ t timestamp nocache 
        utm_* fbclid gclid;
    sorted_args_dedupe first;

    add_header X-Cache-Key "$uri?$args";
    proxy_pass http://origin;
}

This ensures CDNs receive normalized URLs. Cloudflare, Fastly, and CloudFront all benefit from consistent cache keys.

Logging Both Query Strings

Track normalization in access logs for analysis:

http {
    log_format normalized '$remote_addr [$time_local] "$request" $status '
                          'original_args="$args" '
                          'sorted_args="$sorted_args" '
                          'cache_status="$upstream_cache_status"';

    server {
        access_log /var/log/nginx/access.log normalized;

        location / {
            sorted_args_ignore_list utm_* _ timestamp;
            proxy_cache_key "$uri$sorted_args";
            proxy_pass http://backend;
        }
    }
}

Combining with Cache Purge

For dynamic content that changes frequently, combine sorted-args with the ngx_cache_purge module. This powerful combination provides:

  • Normalized cache keys – Sorted-args ensures consistent cache entries
  • Selective purging – Cache-purge allows invalidating specific URLs
  • Wildcard purging – Clear entire URL patterns when content changes

http {
    proxy_cache_path /var/cache/nginx levels=1:2 
                     keys_zone=content:100m purger=on;

    server {
        location / {
            proxy_cache content;
            proxy_cache_key "$uri$sorted_args";
            proxy_pass http://backend;
        }

        location ~ /purge(/.*) {
            allow 127.0.0.1;
            deny all;
            proxy_cache_purge content "$1$sorted_args";
        }
    }
}

When purging, use the same normalized key format. This ensures purge requests match cached entries regardless of the original parameter order.

Performance Considerations

Cache Hit Rate Improvement

NGINX query string normalization can improve cache efficiency by 15-40% in production environments. The actual gain depends on traffic patterns and parameter diversity, and some sites see larger improvements.

Biggest gains occur when:
– Multiple frameworks generate URLs differently
– Users share socially-modified links
– APIs construct queries programmatically
– Analytics tools append various tracking codes

Processing Overhead

The module adds minimal overhead:

  • Memory: Uses NGINX’s request pool (no extra allocation needed)
  • CPU: Efficient queue-based sorting algorithm
  • Caching: One computation per request, results reused
  • Filtering: Fast case-insensitive pattern matching

Cache hit improvements far outweigh sorting costs. Even high-traffic sites see negligible CPU impact from the sorting operation.

CDN Compatibility

With upstream CDNs, set sorted_args_overwrite to on. This passes normalized query strings upstream, so CDN cache fragmentation is reduced in the same way as NGINX cache fragmentation.

For buffer tuning, see the NGINX Tuning Module.

Testing Your Configuration

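Use the NGINX Echo Module for testing, or verify directly with curl. The commands below assume a /test location that returns $sorted_args in the response body and has timestamp and utm_* in its ignore list: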

# Test basic sorting
curl -s "http://localhost/test?z=3&a=1&m=2"
# Returns: a=1&m=2&z=3

# Test ignore list
curl -s "http://localhost/test?user=123&timestamp=456&utm_source=test"
# Returns: user=123

# Test natural sorting
curl -s "http://localhost/test?item10=a&item2=b&item1=c"
# Returns: item1=c&item2=b&item10=a

Verify cache normalization works correctly. This assumes the location exposes $upstream_cache_status in an X-Cache-Status response header:

# First request - MISS
curl -I "http://localhost/api?b=2&a=1"
# X-Cache-Status: MISS

# Different order - should be HIT
curl -I "http://localhost/api?a=1&b=2"
# X-Cache-Status: HIT

Both requests produce identical cache keys. The second request hits the cache immediately.
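
For locations with more than two parameters, checking every ordering by hand gets tedious. The sketch below generates one curl command per ordering. The function name and URL are illustrative, and values are not URL-encoded, so use simple test values.

```python
from itertools import permutations

def curl_commands(base_url, params):
    """Build one curl command for every ordering of the given parameters."""
    commands = []
    for order in permutations(params.items()):
        query = "&".join(f"{key}={value}" for key, value in order)
        commands.append(f'curl -sI "{base_url}?{query}"')
    return commands

for command in curl_commands("http://localhost/api", {"a": "1", "b": "2"}):
    print(command)
```

After the first request warms the cache, every remaining command should report a HIT if all orderings share one cache key.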

Troubleshooting

Module Not Loading

If NGINX fails to start with unknown directive "sorted_args_ignore_list", the module is not loaded. Add this line to nginx.conf:

load_module modules/ngx_http_sorted_args_module.so;

Place this directive before the events block at the top of the configuration file.

Empty $sorted_args

The variable returns empty when:
– No query string exists
– All parameters were filtered out
– All values were empty (?a=&b=)

This is expected behavior. Empty cache key components are valid and work correctly.

Unexpected Filtering

Check pattern matching syntax:
– utm_* matches utm_source, utm_medium, utm_campaign
– *_id matches user_id, session_id, client_id
– *token* matches auth_token, tokenizer, access_token

All pattern matching is case-insensitive for parameter names.

Conclusion

The sorted-args module provides essential NGINX query string normalization for modern web infrastructure. It eliminates cache fragmentation from parameter order variations, tracking codes, and cache-busters.

For high-traffic sites, cache improvements easily justify the minimal overhead. Combined with cache purging and CDN normalization, the module ensures consistent caching behavior across your entire infrastructure stack.

Source code and documentation: GitHub.

Danila Vershinin

Founder & Lead Engineer

NGINX configuration and optimization • Linux system administration • Web performance engineering

10+ years NGINX experience • Maintainer of GetPageSpeed RPM repository • Contributor to open-source NGINX modules
