NGINX mod_zip: Dynamic ZIP Archives Guide

by Danila Vershinin



When users need to download multiple files from your server, the traditional approach involves pre-generating ZIP archives on disk or assembling them entirely in memory before sending. Both methods have significant drawbacks: disk-based archives consume storage and require background jobs, while memory-based assembly can exhaust server RAM when dealing with large files.

The NGINX mod_zip module solves this problem by streaming ZIP archives dynamically. It assembles archives on the fly, pulling component files from disk or upstream servers while keeping the memory footprint minimal: typically just a few kilobytes, regardless of archive size. This makes the module well suited to file download portals, document management systems, and any application that serves bundled files.

How mod_zip Works

Unlike traditional ZIP generation that loads entire files into memory, mod_zip operates as a filter module. Here is the workflow:

  1. Your backend application returns a special response with the X-Archive-Files: zip header
  2. The response body contains a manifest listing files to include in the archive
  3. NGINX intercepts this response and begins streaming a ZIP archive to the client
  4. Files are fetched via internal subrequests and streamed directly into the ZIP without buffering

This streaming architecture means you can serve gigabyte-sized archives while using only kilobytes of RAM. The NGINX mod_zip module supports “modern” ZIP features including:

  • ZIP64 extensions for archives larger than 4GB
  • UTF-8 filenames for international character support
  • UTC timestamps for consistent file dates
  • Range requests for resumable downloads (when CRC-32 checksums are provided)
  • Empty directories via the @directory marker

Installation on Rocky Linux, AlmaLinux, and RHEL

The easiest way to install the module is through the GetPageSpeed repository, which provides pre-built packages for all major Enterprise Linux distributions.

Enable the GetPageSpeed Repository

If you have not already enabled the repository:

sudo dnf install https://extras.getpagespeed.com/release-latest.rpm

Install the Module

sudo dnf install nginx-module-zip

This installs the module compatible with your NGINX version.

Load the Module

Add the following line to the top of your /etc/nginx/nginx.conf, before any http block:

load_module modules/ngx_http_zip_module.so;

Alternatively, you can enable all installed modules automatically:

include /usr/share/nginx/modules/*.conf;

Reload NGINX to apply changes:

sudo nginx -t && sudo systemctl reload nginx

Installation on Debian and Ubuntu

For Debian-based distributions, the module is available through the GetPageSpeed APT repository.

Configuration

mod_zip is a filter module that requires no configuration directives. It activates automatically when it detects the X-Archive-Files: zip header in upstream responses.

Basic Configuration Example

Here is a complete working configuration that uses NGINX reverse proxy capabilities:

load_module modules/ngx_http_zip_module.so;

# NGINX refuses to start without an events block, even an empty one
events {}

http {
    upstream backend {
        server 127.0.0.1:8080;
    }

    server {
        listen 80;
        server_name example.com;

        # Endpoint that triggers ZIP generation
        location /download-archive {
            proxy_pass http://backend;
            # Important: disable compression for backend responses
            proxy_set_header Accept-Encoding "";
        }

        # Internal location for serving individual files
        location /files/ {
            internal;
            alias /var/www/files/;
        }
    }
}

Backend Response Format

Your backend application must return two things:

  1. The X-Archive-Files: zip header
  2. A body containing the file manifest

The manifest format is one file per line with space-separated fields:

CRC32 SIZE PATH FILENAME

Example manifest:

f2efaf5e 15234 /files/document.pdf reports/Q1-Report.pdf
8a60ba74 892456 /files/image.jpg photos/vacation.jpg
0 0 @directory images/

Field descriptions:

  • CRC32: The file’s CRC-32 checksum in hexadecimal, or - to calculate on-the-fly
  • SIZE: File size in bytes
  • PATH: Internal NGINX location path (must be properly URL-encoded)
  • FILENAME: The name that appears inside the ZIP archive (can include directory paths)
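The field rules above can be sketched as a small Python helper. This is an illustration only: manifest_line and its parameters are hypothetical names, and it assumes the location you pass in is not yet URL-encoded (an already-encoded path would be encoded twice).

```python
from urllib.parse import quote


def manifest_line(location, archive_name, size, crc32=None):
    """Return one mod_zip manifest line: CRC32 SIZE PATH FILENAME."""
    # "-" asks the module to compute the checksum while streaming
    crc_field = "-" if crc32 is None else format(crc32 & 0xFFFFFFFF, "08x")
    # Percent-encode the internal location; "/" stays as the path separator
    encoded_location = quote(location, safe="/")
    # The manifest is line-based, so a line break would corrupt it
    if "\n" in archive_name or "\r" in archive_name:
        raise ValueError("archive name must not contain line breaks")
    return f"{crc_field} {size} {encoded_location} {archive_name}\n"
```

Calling manifest_line("/files/annual report.pdf", "reports/Annual Report.pdf", 15234, 0xF2EFAF5E) yields a line whose PATH field is /files/annual%20report.pdf, while the archive name keeps its space because it is the last field on the line.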

Dynamic CRC-32 Calculation

If you do not know the CRC-32 checksum beforehand, use - as a placeholder:

- 15234 /files/document.pdf reports/Q1-Report.pdf
- 892456 /files/image.jpg photos/vacation.jpg

However, note that using - disables Range request support, meaning clients cannot resume interrupted downloads.

Creating Empty Directories

To include empty directories in the archive, use the special @directory marker:

0 0 @directory images/
0 0 @directory documents/templates/

The CRC-32 and size must both be 0 for directory entries.
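If your backend builds the manifest in Python, directory entries can be produced by a helper along these lines. The function name directory_line is hypothetical, and the trailing slash simply mirrors the examples above.

```python
def directory_line(archive_dir):
    """Return a manifest line that creates an empty directory in the archive."""
    # Drop surrounding slashes, then add exactly one trailing slash
    name = archive_dir.strip("/") + "/"
    if name == "/":
        raise ValueError("directory name must not be empty")
    # CRC-32 and size are both 0 for directory entries
    return f"0 0 @directory {name}\n"
```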

PHP Backend Example

Here is a practical PHP example that generates the file manifest:

<?php
// download.php - Generates file manifest for mod_zip

$files = [
    ['path' => '/files/report.pdf', 'name' => 'Reports/Annual-Report-2024.pdf', 'size' => 245678],
    ['path' => '/files/data.xlsx', 'name' => 'Data/Financial-Summary.xlsx', 'size' => 89234],
    ['path' => '/files/readme.txt', 'name' => 'README.txt', 'size' => 1234],
];

// Set required headers
header('X-Archive-Files: zip');
header('Content-Type: application/zip');
header('Content-Disposition: attachment; filename="download-package.zip"');

// Generate manifest
foreach ($files as $file) {
    // Using - for CRC means the module calculates it on-the-fly
    echo "- {$file['size']} {$file['path']} {$file['name']}\n";
}

Python Backend Example with Flask

For Python applications using Flask:

from flask import Flask, Response

app = Flask(__name__)

@app.route('/download-archive')
def download_archive():
    files = [
        {'path': '/files/document.pdf', 'name': 'docs/document.pdf', 'size': 12345},
        {'path': '/files/image.png', 'name': 'images/photo.png', 'size': 67890},
    ]

    # Build manifest for mod_zip
    manifest = '\n'.join(
        f"- {f['size']} {f['path']} {f['name']}" for f in files
    )

    headers = {
        'X-Archive-Files': 'zip',
        'Content-Type': 'application/zip',
        'Content-Disposition': 'attachment; filename="archive.zip"',
    }

    return Response(manifest, headers=headers)

Proxying Files from Remote Servers

The module can fetch files from upstream servers, not just local disk. This is powerful for distributed file storage:

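http {
    # Application backend that returns the manifest (example address)
    upstream app_backend {
        server 127.0.0.1:8080;
    }
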
    upstream storage_backend {
        server storage1.internal:8080;
        server storage2.internal:8080;
    }

    server {
        listen 80;
        server_name downloads.example.com;

        location /download {
            proxy_pass http://app_backend;
            proxy_set_header Accept-Encoding "";
        }

        # Proxy file requests to storage cluster
        location /storage/ {
            internal;
            proxy_pass http://storage_backend/;
            proxy_set_header Accept-Encoding "";
        }
    }
}

Your manifest would then reference the storage location:

- 1048576 /storage/bucket1/file1.dat data/file1.dat
- 2097152 /storage/bucket2/file2.dat data/file2.dat

If you encounter issues with upstream responses, check out our guide on tuning proxy_buffer_size.

Enabling Resumable Downloads

To support HTTP Range requests (allowing clients to resume interrupted downloads), you must provide CRC-32 checksums and a Last-Modified header:

<?php
header('X-Archive-Files: zip');
header('Content-Disposition: attachment; filename="archive.zip"');
header('Last-Modified: ' . gmdate('D, d M Y H:i:s', filemtime('/var/www/files/latest')) . ' GMT');

// Pre-calculated CRC-32 values
$files = [
    ['crc' => 'f2efaf5e', 'size' => 15234, 'path' => '/files/doc.pdf', 'name' => 'doc.pdf'],
    ['crc' => '8a60ba74', 'size' => 89234, 'path' => '/files/img.jpg', 'name' => 'img.jpg'],
];

foreach ($files as $f) {
    echo "{$f['crc']} {$f['size']} {$f['path']} {$f['name']}\n";
}

The client can then use Range and If-Range headers to resume downloads.
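On the client side, resuming comes down to sending two request headers. The sketch below only builds those headers and leaves the HTTP call to whatever client library you use. The function name resume_headers is hypothetical, and it assumes you saved the Last-Modified value from the first response.

```python
import os


def resume_headers(partial_path, last_modified):
    """Build request headers that ask for the rest of a partial download."""
    bytes_on_disk = os.path.getsize(partial_path)
    return {
        # Open-ended range: everything from the first missing byte onward
        "Range": f"bytes={bytes_on_disk}-",
        # Only honour the range if the archive is unchanged; otherwise the
        # server is expected to send the whole archive again
        "If-Range": last_modified,
    }
```

A 206 Partial Content response means the bytes can be appended to the partial file. A 200 response means the archive changed and the download should start over.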

Handling UTF-8 Filenames and Character Encoding

The module supports UTF-8 filenames natively. For legacy systems that require specific character encodings, use the X-Archive-Charset header:

// Convert filenames to UTF-8 from another encoding
header('X-Archive-Charset: ISO-8859-1');

For native system charset (disabling UTF-8 flag):

header('X-Archive-Charset: native');

Forwarding Headers to Subrequests

If your file storage requires authentication headers, use X-Archive-Pass-Headers to forward them:

// Forward Authorization header to file subrequests
header('X-Archive-Pass-Headers: Authorization:X-Custom-Token');
header('X-Archive-Files: zip');

Headers are separated by colons in the header value.
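In a Python backend, the header value could be assembled from a list of names. This is a minimal sketch and pass_headers_value is a hypothetical helper, not part of the module.

```python
def pass_headers_value(header_names):
    """Join request header names into an X-Archive-Pass-Headers value."""
    cleaned = [name.strip() for name in header_names]
    for name in cleaned:
        # A colon is the separator, so it cannot appear inside a name
        if not name or ":" in name:
            raise ValueError(f"invalid header name: {name!r}")
    return ":".join(cleaned)
```

For example, pass_headers_value(["Authorization", "X-Custom-Token"]) produces the same value as the PHP snippet above.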

Performance Optimization Tips

To maximize performance when using this module:

1. Disable Gzip for Backend Responses

The module cannot process pre-compressed data. Always disable compression for backend responses:

location /download {
    proxy_pass http://backend;
    proxy_set_header Accept-Encoding "";
}

2. Use Direct File Access When Possible

For local files, use alias or root directives with internal locations:

location /files/ {
    internal;
    alias /var/www/files/;
}

3. Strip X-Archive-Files Header from Client Response

While not strictly necessary, you can use the headers_more module to remove the internal header from client responses:

more_clear_headers 'X-Archive-Files';

4. Pre-calculate CRC-32 Checksums

For better client experience with resumable downloads, pre-calculate and store CRC-32 values in your database:

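import binascii

def calculate_crc32(filepath, chunk_size=1024 * 1024):
    crc = 0
    with open(filepath, 'rb') as f:
        # Read in chunks so large files are never loaded fully into memory
        while chunk := f.read(chunk_size):
            crc = binascii.crc32(chunk, crc)
    return format(crc & 0xffffffff, '08x')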
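One way to store the checksums is a small cache keyed by path, size, and modification time, so a file is only re-read when it changes. The sketch below uses SQLite purely for illustration. The table name crc_cache and both function names are assumptions, and a production system would more likely reuse its existing database.

```python
import binascii
import os
import sqlite3


def file_crc32(path, chunk_size=1 << 20):
    """Compute a file's CRC-32 in chunks and return it as 8 hex digits."""
    crc = 0
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            crc = binascii.crc32(chunk, crc)
    return format(crc & 0xFFFFFFFF, "08x")


def cached_crc32(db, path):
    """Return the CRC-32 for path, recomputing only when the file changed."""
    db.execute(
        "CREATE TABLE IF NOT EXISTS crc_cache ("
        "path TEXT PRIMARY KEY, size INTEGER, mtime_ns INTEGER, crc TEXT)"
    )
    st = os.stat(path)
    row = db.execute(
        "SELECT crc FROM crc_cache WHERE path = ? AND size = ? AND mtime_ns = ?",
        (path, st.st_size, st.st_mtime_ns),
    ).fetchone()
    if row:
        return row[0]
    # Cache miss or stale entry: recompute and overwrite the stored row
    crc = file_crc32(path)
    db.execute(
        "INSERT OR REPLACE INTO crc_cache VALUES (?, ?, ?, ?)",
        (path, st.st_size, st.st_mtime_ns, crc),
    )
    db.commit()
    return crc
```

Usage would look like cached_crc32(sqlite3.connect("crc.db"), "/var/www/files/report.pdf"), with the result going straight into the first field of the manifest line.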

Testing Your Configuration

To verify everything is working correctly:

# Test the endpoint
curl -o archive.zip http://localhost/download-archive

# Verify the ZIP file
unzip -l archive.zip

# Extract and verify contents
unzip archive.zip -d extracted/

Check the NGINX error log for any issues:

tail -f /var/log/nginx/error.log

Common errors include:

  • Permission denied: Ensure NGINX can read the source files
  • 404 on subrequests: Verify your internal location paths match the manifest
  • Empty archive: Check that the backend is sending the correct headers
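For automated checks, the downloaded archive can also be verified from Python with the standard zipfile module. This is a sketch only, and check_archive is a hypothetical name.

```python
import zipfile


def check_archive(path):
    """Return (ok, names): ok is False if any member fails its CRC check."""
    with zipfile.ZipFile(path) as zf:
        # testzip() reads every member and returns the first bad name, or None
        first_bad = zf.testzip()
        return first_bad is None, zf.namelist()
```

Comparing the returned names against the FILENAME column of your manifest is a quick way to catch missing entries.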

You can also use our free NGINX config checker to validate your configuration syntax.

Security Considerations

When implementing dynamic ZIP archives, keep these security practices in mind:

Use Internal Locations

Always mark file-serving locations as internal to prevent direct access:

location /files/ {
    internal;
    alias /var/www/protected-files/;
}

Validate File Paths

Ensure your backend validates and sanitizes file paths to prevent directory traversal attacks:

$allowed_dir = '/var/www/files/';
$requested_file = realpath($allowed_dir . $user_input);

// realpath() returns false when the file does not exist
if ($requested_file === false || strpos($requested_file, $allowed_dir) !== 0) {
    http_response_code(403);
    exit('Access denied');
}
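A comparable check for a Python backend might look like the following. It is a sketch under the same assumptions as the PHP snippet, and resolve_inside is a hypothetical helper name.

```python
from pathlib import Path


def resolve_inside(base_dir, user_path):
    """Return the resolved path if it stays inside base_dir, else None."""
    base = Path(base_dir).resolve()
    # resolve() collapses ".." segments and follows symlinks
    candidate = (base / user_path).resolve()
    try:
        # Raises ValueError when candidate is not located under base
        candidate.relative_to(base)
    except ValueError:
        return None
    # Reject directories and files that do not exist
    return candidate if candidate.is_file() else None
```

Returning None for anything outside the base directory lets the caller answer with a 403 before any path reaches the manifest.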

Rate Limiting

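Large archive downloads can consume significant bandwidth. Consider limiting how often each client may request an archive with the built-in limit_req module (limit_req_zone belongs in the http block). To cap per-connection bandwidth as well, the limit_rate directive can be added to the same location: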

limit_req_zone $binary_remote_addr zone=downloads:10m rate=5r/m;

server {
    location /download-archive {
        limit_req zone=downloads burst=10 nodelay;
        proxy_pass http://backend;
    }
}

Troubleshooting Common Issues

Archive Downloads as Empty or Corrupted

Cause: Backend returning gzipped content.

Solution: Add proxy_set_header Accept-Encoding ""; to prevent compression.

Files Not Found in Subrequests

Cause: Incorrect paths in manifest or missing internal locations.

Solution: Verify paths match your NGINX configuration. Check error logs for 404 errors on subrequests.

SELinux Blocking File Access

Cause: SELinux denying NGINX access to files.

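Solution: Restore the default file contexts. If NGINX also proxies to a backend or storage upstream, allow it to make network connections with the boolean shown on the second line: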

restorecon -Rv /var/www/files/
setsebool -P httpd_can_network_connect 1

Range Requests Not Working

Cause: Using - for CRC-32 in manifest.

Solution: Pre-calculate and provide CRC-32 checksums for all files.

502 Bad Gateway Errors

If you see 502 errors, check that your backend is properly responding with the file manifest.

Conclusion

The NGINX mod_zip module provides an elegant solution for serving dynamic ZIP archives without the memory and storage overhead of traditional approaches. By streaming archives on-the-fly, you can serve files of any size while maintaining minimal resource usage.

Key benefits include:

  • Memory efficient: Uses only kilobytes of RAM regardless of archive size
  • No disk overhead: Archives are generated dynamically, no pre-generation needed
  • Resumable downloads: Supports Range requests when CRC-32 checksums are provided
  • Flexible architecture: Works with local files or remote upstream servers
  • Modern ZIP support: UTF-8 filenames, ZIP64, and empty directories

For production deployments, pre-calculating CRC-32 checksums enables the best user experience with resumable downloads. The module integrates seamlessly with any backend that can generate the simple manifest format.

The source code and additional documentation are available on GitHub.

Danila Vershinin

Founder & Lead Engineer

NGINX configuration and optimization • Linux system administration • Web performance engineering

10+ years NGINX experience • Maintainer of GetPageSpeed RPM repository • Contributor to open-source NGINX modules
