The NGINX OpenTelemetry module adds distributed tracing to your NGINX server. It makes NGINX a first-class participant in your OpenTelemetry pipeline. Every request gets a trace span with timing data, HTTP metadata, and context propagation headers — exported to your collector via OTLP/gRPC.
When a user reports that your application is slow, you need to know where time is spent. Is it the NGINX reverse proxy? The upstream app server? A database query three services deep? Without distributed tracing, answering this requires guesswork and log correlation across systems.
Unlike third-party tracing solutions that can impose 50% or more overhead, this module is developed by F5/NGINX Inc. and optimized for NGINX internals. The performance impact is typically 10–15%, making it viable for production use. If you are looking for ways to improve visibility into your NGINX setup, distributed tracing is an excellent place to start.
How the NGINX OpenTelemetry Module Works
The module hooks into two phases of the NGINX request lifecycle:
- Rewrite phase: Extracts trace context from incoming `traceparent` and `tracestate` headers (W3C Trace Context standard), creates a new span, and injects updated headers into upstream requests
- Log phase: Collects HTTP attributes (method, status code, timing), finalizes the span, and adds it to an export batch
Spans are batched in memory and flushed to your collector at set intervals via OTLP/gRPC. The export runs in a background thread and does not block request processing.
Each NGINX worker process maintains its own batch buffer. This lock-free design avoids contention between workers and keeps overhead low under high concurrency.
Automatic Span Attributes
Every span includes these attributes per OpenTelemetry HTTP semantic conventions:
| Attribute | Example Value | Description |
|---|---|---|
| `http.method` | `GET` | HTTP request method |
| `http.target` | `/api/users?page=2` | Request URI with query string |
| `http.route` | `/api/users` | Matched NGINX location |
| `http.scheme` | `https` | Request scheme |
| `http.flavor` | `2.0` | HTTP protocol version |
| `http.user_agent` | `curl/8.5.0` | User-Agent header |
| `http.status_code` | `200` | Response status code |
| `http.request_content_length` | `1024` | Request body size |
| `http.response_content_length` | `8192` | Response body size |
| `net.host.name` | `api.example.com` | Server name |
| `net.host.port` | `443` | Port (omitted if default) |
| `net.sock.peer.addr` | `192.168.1.100` | Client IP address |
| `net.sock.peer.port` | `52431` | Client port |
Spans with HTTP status codes 500 or higher get an error status automatically.
Installation
RHEL, CentOS, AlmaLinux, Rocky Linux
Install the GetPageSpeed repository and the module:
sudo dnf install https://extras.getpagespeed.com/release-latest.rpm
sudo dnf install nginx-module-otel
Load the module by adding this line at the top of /etc/nginx/nginx.conf, before any http block:
load_module modules/ngx_otel_module.so;
Verify the module loads correctly:
nginx -t
If the test passes with otel_* directives in your config, the module is active.
For more details, see the RPM package page.
Configuration Directives
The NGINX OpenTelemetry module provides seven directives for controlling trace collection and export.
otel_exporter
Configures the collector endpoint and batching behavior.
Syntax: otel_exporter { ... }
Context: http
Required: Yes, if any otel_trace directive is enabled
This block accepts the following nested directives:
| Directive | Default | Description |
|---|---|---|
| `endpoint` | (required) | OTLP/gRPC endpoint, e.g., `localhost:4317` |
| `interval` | `5s` | Max time between batch exports |
| `batch_size` | `512` | Max spans per batch per worker |
| `batch_count` | `4` | Pending batches per worker before drops |
Example:
otel_exporter {
endpoint localhost:4317;
interval 5s;
batch_size 512;
batch_count 4;
}
The batch_size and batch_count values control memory use. Each worker holds up to batch_size × batch_count spans (512 × 4 = 2048 with the defaults). If the collector cannot keep up and a worker's buffer fills, new spans are dropped.
otel_service_name
Sets the service.name resource attribute for your NGINX instance.
Syntax: otel_service_name name;
Default: unknown_service:nginx
Context: http
Example:
otel_service_name "api-gateway";
Choose a name that matches your infrastructure conventions. This name appears in Jaeger, Grafana Tempo, and other tracing backends as the primary filter.
otel_resource_attr
Adds custom resource-level attributes to all spans. Resource attributes describe the entity producing telemetry, not individual requests. Use them for environment, host, and deployment metadata.
Syntax: otel_resource_attr name value;
Default: —
Context: http
This directive is repeatable:
otel_service_name "api-gateway";
otel_resource_attr "deployment.environment" "production";
otel_resource_attr "host.name" "web-01";
otel_resource_attr "service.version" "2.4.1";
Resource attributes appear on every span from this NGINX instance. They are ideal for filtering traces by environment, region, or deployment version in your tracing backend.
otel_trace
Enables or disables tracing. This is the main switch.
Syntax: otel_trace on | off | $variable;
Default: off
Context: http, server, location
When set to a variable, tracing is active when the variable resolves to 1, on, or true. This enables per-request sampling decisions.
Trace all requests:
server {
otel_trace on;
location / {
proxy_pass http://backend;
}
}
Disable tracing for health checks:
location /health {
otel_trace off;
return 200 "OK";
}
otel_trace_context
Controls W3C Trace Context header propagation.
Syntax: otel_trace_context extract | inject | propagate | ignore;
Default: ignore
Context: http, server, location
| Mode | Behavior |
|---|---|
| `ignore` | No trace context headers read or written |
| `extract` | Reads `traceparent`/`tracestate` from the client |
| `inject` | Writes trace context into upstream requests |
| `propagate` | Extract + inject (full participation) |
For reverse proxy setups, use propagate so NGINX joins the distributed trace:
location /api/ {
otel_trace on;
otel_trace_context propagate;
proxy_pass http://backend;
}
Use inject when NGINX is the trace origin with no incoming context from clients.
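As a sketch, an edge gateway that starts traces itself and deliberately ignores any client-supplied trace headers might look like this (the `http://backend` upstream is a placeholder):

```nginx
location /api/ {
    otel_trace on;
    # inject: NGINX creates a fresh trace and passes its context upstream,
    # without reading traceparent/tracestate from untrusted clients
    otel_trace_context inject;
    proxy_pass http://backend;
}
```

Compared with propagate, this prevents external clients from influencing your sampling decisions or trace IDs.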
otel_span_name
Sets a custom span name. Defaults to the matched NGINX location.
Syntax: otel_span_name name;
Default: location name
Context: http, server, location
Supports NGINX variables for dynamic names:
location /api/ {
otel_span_name "api-$request_method";
otel_trace on;
proxy_pass http://backend;
}
This produces names like api-GET and api-POST in your tracing UI.
otel_span_attr
Adds custom key-value attributes to the span. Repeatable.
Syntax: otel_span_attr name value;
Default: —
Context: http, server, location
location /api/ {
otel_trace on;
otel_trace_context propagate;
otel_span_attr "deployment.environment" "production";
otel_span_attr "request.id" $request_id;
otel_span_attr "upstream.addr" $upstream_addr;
otel_span_attr "upstream.response_time" $upstream_response_time;
proxy_pass http://backend;
}
Adding $upstream_addr and $upstream_response_time lets you pinpoint slow backends in your tracing UI. These attributes are invaluable for debugging latency.
Note the difference: otel_span_attr adds per-request attributes to individual spans, while otel_resource_attr adds static metadata to all spans from this NGINX instance.
Embedded Variables
The module exposes four variables for use in configs:
| Variable | Description | Example Value |
|---|---|---|
| `$otel_trace_id` | 32-char hex trace identifier | `4bf92f3577b34da6a3ce929d0e0e4736` |
| `$otel_span_id` | 16-char hex span identifier | `00f067aa0ba902b7` |
| `$otel_parent_id` | Parent span ID from incoming request | `dc94d281b0f884ea` |
| `$otel_parent_sampled` | Parent sampling flag (`1` or `0`) | `1` |
These variables help correlate NGINX access logs with traces:
log_format traced '$remote_addr - [$time_local] "$request" $status '
'trace=$otel_trace_id span=$otel_span_id';
access_log /var/log/nginx/access.log traced;
Here is what these variables look like in actual log output:
127.0.0.1 - [17/Mar/2026:16:02:30 +0800] "GET / HTTP/1.1" 200 trace=68d4ba8f38c8bb01929f30445e930130 span=5d2792c4acabfd31
When a request arrives with a traceparent header, the parent context is extracted:
127.0.0.1 - [17/Mar/2026:16:02:30 +0800] "GET / HTTP/1.1" 200 trace=4bf92f3577b34da6a3ce929d0e0e4736 span=7197bdc8bf3adeae parent=00f067aa0ba902b7 sampled=1
Notice that the trace value matches the trace ID from the incoming traceparent header, confirming context propagation works.
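To exercise propagation yourself, you can hand-build a W3C traceparent header (format: version-traceid-parentid-flags) and send it with curl. This is a sketch using the same sample trace ID as above; the curl step assumes NGINX is running on localhost with `otel_trace_context extract` or `propagate` set:

```shell
# Build a W3C traceparent header: <version>-<trace-id>-<parent-span-id>-<flags>
TRACE_ID=4bf92f3577b34da6a3ce929d0e0e4736   # 32 hex chars
PARENT_ID=00f067aa0ba902b7                  # 16 hex chars
TRACEPARENT="00-${TRACE_ID}-${PARENT_ID}-01"  # flags 01 = sampled
echo "$TRACEPARENT"
# Against a running server, the access log line should then show trace=$TRACE_ID:
#   curl -s -H "traceparent: $TRACEPARENT" http://localhost/
```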
You can also pass the trace ID to upstream services:
proxy_set_header X-Trace-Id $otel_trace_id;
Sampling Strategies
Tracing every request creates enormous data volumes. The NGINX OpenTelemetry module supports several sampling approaches.
Ratio-Based Sampling
Use NGINX’s split_clients to sample a percentage of requests:
split_clients "$otel_trace_id" $ratio_sampler {
10% on;
* off;
}
server {
location / {
otel_trace $ratio_sampler;
otel_trace_context inject;
proxy_pass http://backend;
}
}
This traces about 10% of requests. The hash is based on $otel_trace_id, so decisions are consistent per trace.
Parent-Based Sampling
Honor the upstream caller’s sampling decision:
server {
location / {
otel_trace $otel_parent_sampled;
otel_trace_context propagate;
proxy_pass http://backend;
}
}
This is ideal for microservice architectures. The gateway decides which requests to trace, and downstream services follow.
Combined Sampling
Trace sampled parents AND a percentage of unsampled traffic:
split_clients "$otel_trace_id" $ratio_sample {
5% on;
* off;
}
map "$otel_parent_sampled:$ratio_sample" $should_trace {
"1:on" on;
"1:off" on;
"0:on" on;
default off;
}
server {
location / {
otel_trace $should_trace;
otel_trace_context propagate;
proxy_pass http://backend;
}
}
This traces all requests with a sampled parent, plus 5% of requests without one. In testing, this configuration correctly identified sampled parents and applied ratio-based fallback sampling to remaining traffic.
Complete Production Configuration
Here is a full config for a reverse proxy with OpenTelemetry tracing:
load_module modules/ngx_otel_module.so;
events {
worker_connections 1024;
}
http {
otel_exporter {
endpoint localhost:4317;
interval 5s;
batch_size 512;
batch_count 4;
}
otel_service_name "web-gateway";
otel_resource_attr "deployment.environment" "production";
otel_resource_attr "host.name" "web-01";
split_clients "$otel_trace_id" $ratio_sampler {
10% on;
* off;
}
log_format traced '$remote_addr - [$time_local] "$request" $status '
'$body_bytes_sent "$http_referer" '
'trace=$otel_trace_id span=$otel_span_id '
'rt=$request_time';
access_log /var/log/nginx/access.log traced;
server {
listen 80;
server_name example.com;
location / {
otel_trace $ratio_sampler;
otel_trace_context propagate;
otel_span_attr "upstream.addr" $upstream_addr;
proxy_pass http://127.0.0.1:8080;
}
location /health {
otel_trace off;
return 200 "OK\n";
}
location /static/ {
otel_trace off;
root /var/www;
}
}
}
This traces 10% of dynamic requests while skipping health checks and static files. Trace IDs appear in access logs for correlation. For more NGINX performance tips, see our guide on tuning proxy_buffer_size.
Integrating with OpenTelemetry Collectors
The module exports traces via OTLP/gRPC. You need a collector at the endpoint. Here are common setups.
Jaeger
Run Jaeger with OTLP ingestion:
docker run -d --name jaeger \
-p 4317:4317 \
-p 16686:16686 \
jaegertracing/all-in-one:latest
Point NGINX to Jaeger’s OTLP endpoint:
otel_exporter {
endpoint localhost:4317;
}
Open the Jaeger UI at `http://localhost:16686` to view traces.
Grafana Tempo with OpenTelemetry Collector
For production, run the OTel Collector as an intermediary:
# otel-collector-config.yaml
receivers:
otlp:
protocols:
grpc:
endpoint: 0.0.0.0:4317
exporters:
otlp:
endpoint: tempo:4317
tls:
insecure: true
service:
pipelines:
traces:
receivers: [otlp]
exporters: [otlp]
The collector adds buffering, retry logic, and can fan out traces to multiple backends.
Performance Considerations
The NGINX OpenTelemetry module is built for production, but tracing has measurable impact:
- CPU: 10–15% increase per worker when tracing is enabled
- Memory: Up to `batch_size × batch_count` spans per worker (default ~2–4 MB)
- Network: Batched OTLP/gRPC exports are efficient but add outbound traffic
Reducing Overhead
- Sample aggressively: Trace 1–10% of requests using `split_clients`
- Limit custom attributes: Each `otel_span_attr` adds cost
- Tune batching: Increase `interval` or decrease `batch_size`
- Skip static assets: Set `otel_trace off` on `/static/` and `/assets/`
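Putting the batching advice into practice, a lower-overhead exporter block might look like this (a sketch; the interval and batch_size values are illustrative starting points, not benchmarked recommendations):

```nginx
otel_exporter {
    endpoint localhost:4317;
    interval 10s;      # flush less often to reduce export chatter
    batch_size 256;    # smaller batches cap per-worker memory
    batch_count 4;
}
```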
NGINX OpenTelemetry Module vs Angie Telemetry
Angie is an NGINX fork that markets itself on superior observability. How does its telemetry compare to the NGINX OpenTelemetry module?
The short answer: for distributed tracing, they are identical. Angie packages the exact same ngx_otel_module from the nginxinc/nginx-otel repository as its angie-module-otel. The directives, variables, and OTLP/gRPC export behavior are the same.
Where Angie genuinely differs is in built-in metrics and monitoring — features that complement tracing but serve a different purpose:
| Feature | NGINX + OTel Module | Angie + OTel Module |
|---|---|---|
| Distributed tracing | Same `ngx_otel_module` | Same `ngx_otel_module` |
| W3C Trace Context | Yes | Yes |
| OTLP/gRPC export | Yes | Yes |
| Sampling strategies | Yes (variables) | Yes (variables) |
| Built-in JSON status API | No (`stub_status` only) | Yes (detailed per-server, per-upstream stats) |
| Native Prometheus export | No (requires external exporter) | Yes (built-in module) |
| Custom metric aggregation | No | Yes (metric zones with counters, histograms) |
| Web monitoring dashboard | No | Yes (Console Light) |
Angie’s API, Prometheus, and Metric modules provide NGINX-Plus-level observability without a commercial license. However, these are metrics modules, not tracing enhancements. They answer questions like “how many 5xx errors per second?” and “what is the average response time for this upstream?”
The NGINX OpenTelemetry module answers a different question: “where did this specific request spend its time across all services?” Both are essential for production observability, but they solve distinct problems.
If you need both metrics and tracing, you can use NGINX with the OpenTelemetry module alongside an external Prometheus exporter like nginx-prometheus-exporter. Alternatively, the otel_span_attr directive with NGINX variables like $upstream_response_time can surface per-request performance data in your tracing backend.
Security Best Practices
Consider these points when deploying tracing in production:
- Collector access: The OTLP/gRPC export is plaintext by default. Run your collector on localhost or inside a private network
- Sensitive data: Custom attributes may contain user IDs or tokens. Audit what you expose
- Trace ID exposure: Do not expose `$otel_trace_id` in responses to end users. Use it in internal headers only:
proxy_set_header X-Trace-Id $otel_trace_id;
For more on securing your NGINX configuration, see our articles on NGINX security.
Troubleshooting
Module Not Loading
If nginx -t reports unknown directive "otel_exporter":
- Verify the module file exists: `ls /usr/lib64/nginx/modules/ngx_otel_module.so`
- Place `load_module` at the top of `nginx.conf`, before `http`: `load_module modules/ngx_otel_module.so;`
- Verify version compatibility: `rpm -q nginx nginx-module-otel`
No Traces in Collector
- Check the collector is running on the configured port: `ss -tlnp | grep 4317`
- Check the NGINX error log for export issues: `grep -i otel /var/log/nginx/error.log`
- Verify tracing is on — the default is `off`. Set `otel_trace on;`
- Check the context mode — using `otel_trace $otel_parent_sampled` with the default `otel_trace_context ignore` means no parent context is extracted. Set `otel_trace_context extract` or `propagate` first.
Spans Dropped
If the error log mentions dropped spans:
- Increase `batch_count` for more buffer space
- Decrease `interval` for faster flushing
- Check collector health — a slow collector causes buildup
Comparison with Other Tracing Solutions
| Feature | ngx_otel_module | OpenTracing module | Lua-based tracing |
|---|---|---|---|
| Overhead | 10–15% | ~50% | 30–50% |
| Maintained by | F5/NGINX Inc. | Archived | Community |
| Protocol | OTLP/gRPC | Vendor-specific | Varies |
| W3C Trace Context | Yes | No | Depends |
| Dynamic sampling | Yes (variables) | Limited | Yes |
| Configuration | NGINX directives | Directives + plugin | Lua code |
The NGINX OpenTelemetry module is the recommended choice for new deployments. The older OpenTracing module is archived and no longer maintained.
Conclusion
The NGINX OpenTelemetry module brings production-grade distributed tracing to NGINX with minimal overhead. For SREs and DevOps engineers running microservices, it closes a critical gap by making the reverse proxy visible in trace data.
Key takeaways:
- Use `otel_trace_context propagate` to join NGINX spans with upstream traces
- Sample aggressively to control data volume and overhead
- Add `$otel_trace_id` to access logs for log-to-trace correlation
- Use `otel_resource_attr` to tag all spans with environment metadata
- Disable tracing on health checks and static assets
Install from the GetPageSpeed repository and explore the source on GitHub. The official NGINX documentation has the full directive reference.

