
NGINX OpenTelemetry Module: Distributed Tracing



We have by far the largest RPM repository with NGINX module packages and VMODs for Varnish. If you want to install NGINX, Varnish, and lots of useful performance/security software with smooth yum upgrades for production use, this is the repository for you.
Active subscription is required.

The NGINX OpenTelemetry module adds distributed tracing to your NGINX server. It makes NGINX a first-class participant in your OpenTelemetry pipeline. Every request gets a trace span with timing data, HTTP metadata, and context propagation headers — exported to your collector via OTLP/gRPC.

When a user reports that your application is slow, you need to know where time is spent. Is it the NGINX reverse proxy? The upstream app server? A database query three services deep? Without distributed tracing, answering this requires guesswork and log correlation across systems.

Unlike older third-party tracing modules, which can impose 50% or more overhead, this module is developed by F5/NGINX and optimized for NGINX internals. The performance impact is typically 10–15%, making it viable for production use. If you are looking for other ways to improve your NGINX server setup, distributed tracing is an excellent place to start.

How the NGINX OpenTelemetry Module Works

The module hooks into two phases of the NGINX request lifecycle:

  1. Rewrite phase: Extracts trace context from incoming traceparent and tracestate headers (W3C Trace Context standard), creates a new span, and injects updated headers into upstream requests
  2. Log phase: Collects HTTP attributes (method, status code, timing), finalizes the span, and adds it to an export batch
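The W3C traceparent header that the rewrite phase reads and writes has a fixed format: `version-traceid-spanid-flags` in lowercase hex. Here is a minimal Python sketch of the extract/inject logic — an illustration of the W3C Trace Context format, not the module's actual C implementation:

```python
import re
import secrets

# W3C Trace Context: version-traceid-spanid-flags, all lowercase hex
TRACEPARENT_RE = re.compile(
    r"^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$"
)

def extract(headers):
    """Parse an incoming traceparent header into (trace_id, parent_span_id, sampled)."""
    m = TRACEPARENT_RE.match(headers.get("traceparent", ""))
    if not m:
        return None  # no valid parent context; the module starts a new trace
    _version, trace_id, span_id, flags = m.groups()
    return trace_id, span_id, int(flags, 16) & 0x01 == 0x01

def inject(headers, trace_id, span_id, sampled):
    """Write the current span's context into the upstream request headers."""
    headers["traceparent"] = f"00-{trace_id}-{span_id}-{'01' if sampled else '00'}"

# Simulate the rewrite phase: extract the parent, mint a child span, inject.
incoming = {"traceparent": "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"}
parent = extract(incoming)
child_span_id = secrets.token_hex(8)  # fresh 16-char hex span ID
upstream = {}
inject(upstream, parent[0], child_span_id, parent[2])
```

The trace ID is preserved end to end; only the span ID changes at each hop, which is exactly what ties NGINX's span into the distributed trace.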

Spans are batched in memory and flushed to your collector at set intervals via OTLP/gRPC. The export runs in a background thread and does not block request processing.

Each NGINX worker process maintains its own batch buffer. This lock-free design avoids contention between workers and keeps overhead low under high concurrency.

Automatic Span Attributes

Every span includes these attributes per OpenTelemetry HTTP semantic conventions:

| Attribute | Example Value | Description |
|---|---|---|
| http.method | GET | HTTP request method |
| http.target | /api/users?page=2 | Request URI with query string |
| http.route | /api/users | Matched NGINX location |
| http.scheme | https | Request scheme |
| http.flavor | 2.0 | HTTP protocol version |
| http.user_agent | curl/8.5.0 | User-Agent header |
| http.status_code | 200 | Response status code |
| http.request_content_length | 1024 | Request body size |
| http.response_content_length | 8192 | Response body size |
| net.host.name | api.example.com | Server name |
| net.host.port | 443 | Port (omitted if default) |
| net.sock.peer.addr | 192.168.1.100 | Client IP address |
| net.sock.peer.port | 52431 | Client port |

Spans with HTTP status codes 500 or higher get an error status automatically.
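Per the OpenTelemetry semantic conventions for server spans, only 5xx responses mark the span as an error; 4xx responses leave the span status unset. A one-line sketch of the rule:

```python
def span_status(http_status_code: int) -> str:
    """Server-span status per OTel HTTP conventions: 5xx -> Error, else Unset."""
    return "ERROR" if http_status_code >= 500 else "UNSET"
```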

Installation

RHEL, CentOS, AlmaLinux, Rocky Linux

Install the GetPageSpeed repository and the module:

sudo dnf install https://extras.getpagespeed.com/release-latest.rpm
sudo dnf install nginx-module-otel

Load the module by adding this line at the top of /etc/nginx/nginx.conf, before any http block:

load_module modules/ngx_otel_module.so;

Verify the module loads correctly:

nginx -t

If the test passes and your configuration contains otel_* directives, the module is active.

For more details, see the RPM package page.

Configuration Directives

The NGINX OpenTelemetry module provides seven directives for controlling trace collection and export.

otel_exporter

Configures the collector endpoint and batching behavior.

Syntax: otel_exporter { ... }
Context: http
Required: Yes, if any otel_trace directive is enabled

This block accepts the following nested directives:

| Directive | Default | Description |
|---|---|---|
| endpoint | (required) | OTLP/gRPC endpoint, e.g., localhost:4317 |
| interval | 5s | Max time between batch exports |
| batch_size | 512 | Max spans per batch per worker |
| batch_count | 4 | Pending batches per worker before drops |

Example:

otel_exporter {
    endpoint localhost:4317;
    interval 5s;
    batch_size 512;
    batch_count 4;
}

The batch_size and batch_count values control memory use. Each worker holds up to batch_size × batch_count spans. With defaults, that is 2048 spans per worker. Spans exceeding this limit are dropped if the collector is slow.
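A quick back-of-the-envelope check of that buffering bound. The per-span byte size below is an assumption (actual size depends on how many attributes each span carries), but it shows why the defaults stay in the low-megabyte range per worker:

```python
# Defaults from the otel_exporter block
batch_size = 512
batch_count = 4
workers = 4  # assumption: typical worker_processes value

max_spans_per_worker = batch_size * batch_count  # hard cap before drops
approx_span_bytes = 1500  # assumption: span + HTTP attributes, rough order
per_worker_mb = max_spans_per_worker * approx_span_bytes / 1024 / 1024
total_buffered_spans = max_spans_per_worker * workers
```

With the defaults, each worker caps at 2048 buffered spans — roughly 3 MB under this assumption, which matches the "Performance Considerations" estimate later in the article.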

otel_service_name

Sets the service.name resource attribute for your NGINX instance.

Syntax: otel_service_name name;
Default: unknown_service:nginx
Context: http

Example:

otel_service_name "api-gateway";

Choose a name that matches your infrastructure conventions. This name appears in Jaeger, Grafana Tempo, and other tracing backends as the primary filter.

otel_resource_attr

Adds custom resource-level attributes to all spans. Resource attributes describe the entity producing telemetry, not individual requests. Use them for environment, host, and deployment metadata.

Syntax: otel_resource_attr name value;
Default: (none)
Context: http

This directive is repeatable:

otel_service_name "api-gateway";
otel_resource_attr "deployment.environment" "production";
otel_resource_attr "host.name" "web-01";
otel_resource_attr "service.version" "2.4.1";

Resource attributes appear on every span from this NGINX instance. They are ideal for filtering traces by environment, region, or deployment version in your tracing backend.

otel_trace

Enables or disables tracing. This is the main switch.

Syntax: otel_trace on | off | $variable;
Default: off
Context: http, server, location

When set to a variable, tracing is active when the variable resolves to 1, on, or true. This enables per-request sampling decisions.

Trace all requests:

server {
    otel_trace on;

    location / {
        proxy_pass http://backend;
    }
}

Disable tracing for health checks:

location /health {
    otel_trace off;
    return 200 "OK";
}

otel_trace_context

Controls W3C Trace Context header propagation.

Syntax: otel_trace_context extract | inject | propagate | ignore;
Default: ignore
Context: http, server, location

| Mode | Behavior |
|---|---|
| ignore | No trace context headers read or written |
| extract | Reads traceparent/tracestate from the client |
| inject | Writes trace context into upstream requests |
| propagate | Extract + inject (full participation) |

For reverse proxy setups, use propagate so NGINX joins the distributed trace:

location /api/ {
    otel_trace on;
    otel_trace_context propagate;
    proxy_pass http://backend;
}

Use inject when NGINX is the trace origin with no incoming context from clients.

otel_span_name

Sets a custom span name. Defaults to the matched NGINX location.

Syntax: otel_span_name name;
Default: location name
Context: http, server, location

Supports NGINX variables for dynamic names:

location /api/ {
    otel_span_name "api-$request_method";
    otel_trace on;
    proxy_pass http://backend;
}

This produces names like api-GET and api-POST in your tracing UI.

otel_span_attr

Adds custom key-value attributes to the span. Repeatable.

Syntax: otel_span_attr name value;
Default: (none)
Context: http, server, location

location /api/ {
    otel_trace on;
    otel_trace_context propagate;
    otel_span_attr "deployment.environment" "production";
    otel_span_attr "request.id" $request_id;
    otel_span_attr "upstream.addr" $upstream_addr;
    otel_span_attr "upstream.response_time" $upstream_response_time;
    proxy_pass http://backend;
}

Adding $upstream_addr and $upstream_response_time lets you pinpoint slow backends in your tracing UI. These attributes are invaluable for debugging latency.

Note the difference: otel_span_attr adds per-request attributes to individual spans, while otel_resource_attr adds static metadata to all spans from this NGINX instance.

Embedded Variables

The module exposes four variables for use in configs:

| Variable | Description | Example Value |
|---|---|---|
| $otel_trace_id | 32-char hex trace identifier | 4bf92f3577b34da6a3ce929d0e0e4736 |
| $otel_span_id | 16-char hex span identifier | 00f067aa0ba902b7 |
| $otel_parent_id | Parent span ID from incoming request | dc94d281b0f884ea |
| $otel_parent_sampled | Parent sampling flag (1 or 0) | 1 |

These variables help correlate NGINX access logs with traces:

log_format traced '$remote_addr - [$time_local] "$request" $status '
                  'trace=$otel_trace_id span=$otel_span_id';

access_log /var/log/nginx/access.log traced;

Here is what these variables look like in actual log output:

127.0.0.1 - [17/Mar/2026:16:02:30 +0800] "GET / HTTP/1.1" 200 trace=68d4ba8f38c8bb01929f30445e930130 span=5d2792c4acabfd31

When a request arrives with a traceparent header, the parent context is extracted. With the log format extended to also include parent=$otel_parent_id sampled=$otel_parent_sampled, the output looks like this:

127.0.0.1 - [17/Mar/2026:16:02:30 +0800] "GET / HTTP/1.1" 200 trace=4bf92f3577b34da6a3ce929d0e0e4736 span=7197bdc8bf3adeae parent=00f067aa0ba902b7 sampled=1

Notice that the trace value matches the trace ID from the incoming traceparent header, confirming context propagation works.
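Log-to-trace correlation then becomes a matter of pulling the IDs out of each access-log line. A minimal Python sketch — the regex matches the traced log format shown above:

```python
import re

# trace= and span= fields as emitted by the traced log_format
LOG_RE = re.compile(r"trace=(?P<trace_id>[0-9a-f]{32}) span=(?P<span_id>[0-9a-f]{16})")

def trace_ids(log_line):
    """Extract (trace_id, span_id) from an access-log line, or None if absent."""
    m = LOG_RE.search(log_line)
    return (m.group("trace_id"), m.group("span_id")) if m else None

line = ('127.0.0.1 - [17/Mar/2026:16:02:30 +0800] "GET / HTTP/1.1" 200 '
        'trace=68d4ba8f38c8bb01929f30445e930130 span=5d2792c4acabfd31')
```

Feed the extracted trace ID into your tracing backend's search to jump straight from a log line to the full distributed trace.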

You can also pass the trace ID to upstream services:

proxy_set_header X-Trace-Id $otel_trace_id;

Sampling Strategies

Tracing every request creates enormous data volumes. The NGINX OpenTelemetry module supports several sampling approaches.

Ratio-Based Sampling

Use NGINX’s split_clients to sample a percentage of requests:

split_clients "$otel_trace_id" $ratio_sampler {
    10%     on;
    *       off;
}

server {
    location / {
        otel_trace $ratio_sampler;
        otel_trace_context inject;
        proxy_pass http://backend;
    }
}

This traces about 10% of requests. The hash is based on $otel_trace_id, so decisions are consistent per trace.
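Under the hood, split_clients hashes its input string and maps the hash onto percentage buckets (NGINX uses MurmurHash2; the sketch below substitutes MD5 purely for illustration). The key property is determinism — the same trace ID always lands in the same bucket, so the decision is stable for the life of the trace:

```python
import hashlib

def ratio_sample(trace_id: str, percent: float) -> bool:
    """Deterministic percentage sampling keyed on the trace ID.

    NGINX's split_clients uses MurmurHash2 over a 32-bit space; MD5 here
    is an illustrative stand-in with the same determinism property.
    """
    bucket = int(hashlib.md5(trace_id.encode()).hexdigest(), 16) % 10000
    return bucket < percent * 100  # e.g. 10% -> buckets 0..999 of 10000

# Same trace ID -> same decision every time
decisions = [ratio_sample("4bf92f3577b34da6a3ce929d0e0e4736", 10.0) for _ in range(3)]
```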

Parent-Based Sampling

Honor the upstream caller’s sampling decision:

server {
    location / {
        otel_trace $otel_parent_sampled;
        otel_trace_context propagate;
        proxy_pass http://backend;
    }
}

This is ideal for microservice architectures. The gateway decides which requests to trace, and downstream services follow.

Combined Sampling

Trace sampled parents AND a percentage of unsampled traffic:

split_clients "$otel_trace_id" $ratio_sample {
    5%      on;
    *       off;
}

map "$otel_parent_sampled:$ratio_sample" $should_trace {
    "1:on"  on;
    "1:off" on;
    "0:on"  on;
    default off;
}

server {
    location / {
        otel_trace $should_trace;
        otel_trace_context propagate;
        proxy_pass http://backend;
    }
}

This traces all requests with a sampled parent, plus 5% of requests without one. In testing, this configuration correctly identified sampled parents and applied ratio-based fallback sampling to remaining traffic.
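The map above is just a small decision table. Modeled in Python (a hypothetical helper mirroring the nginx map keys, including the empty-variable default case):

```python
def should_trace(parent_sampled: str, ratio_sample: str) -> str:
    """Mirror the nginx map: trace if the parent was sampled OR the ratio hit."""
    decisions = {
        "1:on": "on",   # sampled parent always wins
        "1:off": "on",
        "0:on": "on",   # no sampled parent, but the ratio sampler hit
    }
    return decisions.get(f"{parent_sampled}:{ratio_sample}", "off")
```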

Complete Production Configuration

Here is a full config for a reverse proxy with OpenTelemetry tracing:

load_module modules/ngx_otel_module.so;

events {
    worker_connections 1024;
}

http {
    otel_exporter {
        endpoint localhost:4317;
        interval 5s;
        batch_size 512;
        batch_count 4;
    }

    otel_service_name "web-gateway";
    otel_resource_attr "deployment.environment" "production";
    otel_resource_attr "host.name" "web-01";

    split_clients "$otel_trace_id" $ratio_sampler {
        10%     on;
        *       off;
    }

    log_format traced '$remote_addr - [$time_local] "$request" $status '
                      '$body_bytes_sent "$http_referer" '
                      'trace=$otel_trace_id span=$otel_span_id '
                      'rt=$request_time';

    access_log /var/log/nginx/access.log traced;

    server {
        listen 80;
        server_name example.com;

        location / {
            otel_trace $ratio_sampler;
            otel_trace_context propagate;
            otel_span_attr "upstream.addr" $upstream_addr;
            proxy_pass http://127.0.0.1:8080;
        }

        location /health {
            otel_trace off;
            return 200 "OK\n";
        }

        location /static/ {
            otel_trace off;
            root /var/www;
        }
    }
}

This traces 10% of dynamic requests while skipping health checks and static files. Trace IDs appear in access logs for correlation. For more NGINX performance tips, see our guide on tuning proxy_buffer_size.

Integrating with OpenTelemetry Collectors

The module exports traces via OTLP/gRPC. You need a collector at the endpoint. Here are common setups.

Jaeger

Run Jaeger with OTLP ingestion:

docker run -d --name jaeger \
    -p 4317:4317 \
    -p 16686:16686 \
    jaegertracing/all-in-one:latest

Point NGINX to Jaeger’s OTLP endpoint:

otel_exporter {
    endpoint localhost:4317;
}

Open the Jaeger UI at `http://localhost:16686` to view traces.

Grafana Tempo with OpenTelemetry Collector

For production, run the OTel Collector as an intermediary:

# otel-collector-config.yaml
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317

exporters:
  otlp:
    endpoint: tempo:4317
    tls:
      insecure: true

service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [otlp]

The collector adds buffering, retry logic, and can fan out traces to multiple backends.

Performance Considerations

The NGINX OpenTelemetry module is built for production, but tracing has measurable impact:

  • CPU: 10–15% increase per worker when tracing is enabled
  • Memory: Up to batch_size × batch_count spans per worker (default ~2–4 MB)
  • Network: Batched OTLP/gRPC exports are efficient but add outbound traffic

Reducing Overhead

  1. Sample aggressively: Trace 1–10% of requests using split_clients
  2. Limit custom attributes: Each otel_span_attr adds cost
  3. Tune batching: Increase interval or decrease batch_size
  4. Skip static assets: Set otel_trace off on /static/ and /assets/

NGINX OpenTelemetry Module vs Angie Telemetry

Angie is an NGINX fork that markets itself on superior observability. How does its telemetry compare to the NGINX OpenTelemetry module?

The short answer: for distributed tracing, they are identical. Angie packages the exact same ngx_otel_module from the nginxinc/nginx-otel repository as its angie-module-otel. The directives, variables, and OTLP/gRPC export behavior are the same.

Where Angie genuinely differs is in built-in metrics and monitoring — features that complement tracing but serve a different purpose:

| Feature | NGINX + OTel Module | Angie + OTel Module |
|---|---|---|
| Distributed tracing | Same ngx_otel_module | Same ngx_otel_module |
| W3C Trace Context | Yes | Yes |
| OTLP/gRPC export | Yes | Yes |
| Sampling strategies | Yes (variables) | Yes (variables) |
| Built-in JSON status API | No (stub_status only) | Yes (detailed per-server, per-upstream stats) |
| Native Prometheus export | No (requires external exporter) | Yes (built-in module) |
| Custom metric aggregation | No | Yes (metric zones with counters, histograms) |
| Web monitoring dashboard | No | Yes (Console Light) |

Angie’s API, Prometheus, and Metric modules provide NGINX-Plus-level observability without a commercial license. However, these are metrics modules, not tracing enhancements. They answer questions like “how many 5xx errors per second?” and “what is the average response time for this upstream?”

The NGINX OpenTelemetry module answers a different question: “where did this specific request spend its time across all services?” Both are essential for production observability, but they solve distinct problems.

If you need both metrics and tracing, you can use NGINX with the OpenTelemetry module alongside an external Prometheus exporter like nginx-prometheus-exporter. Alternatively, the otel_span_attr directive with NGINX variables like $upstream_response_time can surface per-request performance data in your tracing backend.

Security Best Practices

Consider these points when deploying tracing in production:

  • Collector access: The default gRPC uses plaintext. Run your collector on localhost or in a private network
  • Sensitive data: Custom attributes may contain user IDs or tokens. Audit what you expose
  • Trace ID exposure: Do not expose $otel_trace_id in responses to end users. Use it in internal headers only:
proxy_set_header X-Trace-Id $otel_trace_id;

For more on securing your NGINX configuration, see our articles on NGINX security.

Troubleshooting

Module Not Loading

If nginx -t reports unknown directive "otel_exporter":

  1. Verify the module file exists:
    ls /usr/lib64/nginx/modules/ngx_otel_module.so
    
  2. Place load_module at the top of nginx.conf, before the http block:
    load_module modules/ngx_otel_module.so;
    
  3. Verify version compatibility:
    rpm -q nginx nginx-module-otel
    

No Traces in Collector

  1. Check the collector is running on the configured port:
    ss -tlnp | grep 4317
    
  2. Check NGINX error log for export issues:
    grep -i otel /var/log/nginx/error.log
    
  3. Verify tracing is on — the default is off. Set otel_trace on;

  4. Check context mode — using otel_trace $otel_parent_sampled with the default otel_trace_context ignore means no parent context is extracted. Set otel_trace_context extract or propagate first.

Spans Dropped

If the error log mentions dropped spans:

  • Increase batch_count for more buffer space
  • Decrease interval for faster flushing
  • Check collector health — a slow collector causes buildup

Comparison with Other Tracing Solutions

| Feature | ngx_otel_module | OpenTracing module | Lua-based tracing |
|---|---|---|---|
| Overhead | 10–15% | ~50% | 30–50% |
| Maintained by | F5/NGINX Inc. | Archived | Community |
| Protocol | OTLP/gRPC | Vendor-specific | Varies |
| W3C Trace Context | Yes | No | Depends |
| Dynamic sampling | Yes (variables) | Limited | Yes |
| Configuration | NGINX directives | Directives + plugin | Lua code |

The NGINX OpenTelemetry module is the recommended choice for new deployments. The older OpenTracing module is archived and no longer maintained.

Conclusion

The NGINX OpenTelemetry module brings production-grade distributed tracing to NGINX with minimal overhead. For SREs and DevOps engineers running microservices, it closes a critical gap by making the reverse proxy visible in trace data.

Key takeaways:

  • Use otel_trace_context propagate to join NGINX spans with upstream traces
  • Sample aggressively to control data volume and overhead
  • Add $otel_trace_id to access logs for log-to-trace correlation
  • Use otel_resource_attr to tag all spans with environment metadata
  • Disable tracing on health checks and static assets

Install from the GetPageSpeed repository and explore the source on GitHub. The official NGINX documentation has the full directive reference.


Danila Vershinin

Founder & Lead Engineer

NGINX configuration and optimization · Linux system administration · Web performance engineering

10+ years NGINX experience • Maintainer of GetPageSpeed RPM repository • Contributor to open-source NGINX modules
