Skip to content

fix(opentelemetry): validate x-request-id before using it as trace_id#12990

Open
prasunsrivastav123-lang wants to merge 14 commits intoapache:masterfrom
prasunsrivastav123-lang:fix-otel-invalid-trace-id
Open

fix(opentelemetry): validate x-request-id before using it as trace_id#12990
prasunsrivastav123-lang wants to merge 14 commits intoapache:masterfrom
prasunsrivastav123-lang:fix-otel-invalid-trace-id

Conversation

@prasunsrivastav123-lang
Copy link
Copy Markdown

@prasunsrivastav123-lang prasunsrivastav123-lang commented Feb 9, 2026

Description

This PR fixes a crash in the OpenTelemetry plugin when trace_id_source is set to x-request-id and the request ID is not a valid 32-hex trace ID (for example, a UUID).
The plugin now validates the header and gracefully falls back to the default trace ID generator instead of passing invalid values to the OpenTelemetry encoder.

Fixes #12976
Fixes #12976

Checklist

  • I have explained the need for this PR and the problem it solves
  • I have explained the changes or the new features added to this PR
  • I have added tests corresponding to this change
  • I have updated the documentation to reflect this change
  • I have verified that this change is backward compatible (If not, please discuss on the APISIX mailing list first)

@dosubot dosubot Bot added size:S This PR changes 10-29 lines, ignoring generated files. bug Something isn't working labels Feb 9, 2026
@prasunsrivastav123-lang
Copy link
Copy Markdown
Author

@indrekj @markokocic @huacnlee pls review this pr

@Baoyuantop
Copy link
Copy Markdown
Contributor

Hi @prasunsrivastav123-lang, please add a test case for this fix.

@dosubot dosubot Bot added size:M This PR changes 30-99 lines, ignoring generated files. and removed size:S This PR changes 10-29 lines, ignoring generated files. labels Feb 12, 2026
@prasunsrivastav123-lang
Copy link
Copy Markdown
Author

hey @Baoyuantop i added a test case pls review this pr thank you !

@Baoyuantop
Copy link
Copy Markdown
Contributor

Hi @prasunsrivastav123-lang, please fix failed CI

@Baoyuantop Baoyuantop added the wait for update wait for the author's response in this issue/PR label Feb 13, 2026
@prasunsrivastav123-lang
Copy link
Copy Markdown
Author

Hi @Baoyuantop I’ve added the requested test case to cover the invalid X-Request-Id scenario and ensured the route is properly recreated to avoid upstream issues.if any adjustment need pls tell .

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes a crash in the OpenTelemetry plugin when trace_id_source is set to x-request-id and the request ID is not a valid 32-character hexadecimal trace ID (such as a UUID with hyphens). The plugin now validates the X-Request-Id header and gracefully falls back to the default trace ID generator instead of passing invalid values to the OpenTelemetry encoder, which was causing a Lua runtime error.

Changes:

  • Added is_valid_trace_id() function to validate that trace IDs are exactly 32 hexadecimal characters
  • Modified create_tracer_obj() to validate the X-Request-Id header before using it as a trace_id, with fallback to the original generator for invalid values
  • Added test case (TEST 21) to verify that invalid X-Request-Id headers (UUID format) don't cause crashes

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.

File Description
apisix/plugins/opentelemetry.lua Added validation function and modified trace ID generation logic to validate and fallback for invalid request IDs
t/plugin/opentelemetry.t Added test case to verify invalid X-Request-Id doesn't crash the plugin

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

local function is_valid_trace_id(trace_id)
if not trace_id or #trace_id ~= 32 then
return false
end
Copy link

Copilot AI Feb 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The validation function should reject trace IDs that are all zeros (00000000000000000000000000000000) according to the W3C Trace Context specification. Currently, the function only checks length and hex format, but an all-zeros trace_id is invalid and should be rejected to ensure proper fallback to the default generator.

Suggested change
end
end
-- W3C Trace Context: trace-id of all zeros is invalid
if trace_id == "00000000000000000000000000000000" then
return false
end

Copilot uses AI. Check for mistakes.
Comment thread apisix/plugins/opentelemetry.lua Outdated
Comment on lines 240 to 253
if plugin_info.trace_id_source == "x-request-id" then
local original_new_ids = id_generator.new_ids

id_generator.new_ids = function()
local header_trace_id = core.request.headers()["x-request-id"]
or ngx_var.request_id

if is_valid_trace_id(header_trace_id) then
return header_trace_id, id_generator.new_span_id()
end

-- fallback to default generator for invalid values (e.g. UUID)
return original_new_ids()
end
Copy link

Copilot AI Feb 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The original_new_ids variable is saved inside the if block, which means it's saved every time create_tracer_obj is called when trace_id_source is 'x-request-id'. After the first call, id_generator.new_ids has already been overridden, so subsequent calls (e.g., after cache expiration after 24 hours) would save the already-overridden function, creating nested function wrappers. To fix this, save the original id_generator.new_ids once at module load time (outside the function), and check if it's already been overridden before overriding again, or restore it when trace_id_source changes.

Copilot uses AI. Check for mistakes.
Comment thread apisix/plugins/opentelemetry.lua Outdated
Comment on lines +240 to +254
if plugin_info.trace_id_source == "x-request-id" then
local original_new_ids = id_generator.new_ids

id_generator.new_ids = function()
local header_trace_id = core.request.headers()["x-request-id"]
or ngx_var.request_id

if is_valid_trace_id(header_trace_id) then
return header_trace_id, id_generator.new_span_id()
end

-- fallback to default generator for invalid values (e.g. UUID)
return original_new_ids()
end
end
Copy link

Copilot AI Feb 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Inconsistent indentation in the if statement and local variable declaration. The if statement uses 3 spaces while the local statement uses 2 spaces. According to the .editorconfig, the codebase uses space indentation consistently. Both lines should be indented with 4 spaces to match the function body indentation.

Suggested change
if plugin_info.trace_id_source == "x-request-id" then
local original_new_ids = id_generator.new_ids
id_generator.new_ids = function()
local header_trace_id = core.request.headers()["x-request-id"]
or ngx_var.request_id
if is_valid_trace_id(header_trace_id) then
return header_trace_id, id_generator.new_span_id()
end
-- fallback to default generator for invalid values (e.g. UUID)
return original_new_ids()
end
end
if plugin_info.trace_id_source == "x-request-id" then
local original_new_ids = id_generator.new_ids
id_generator.new_ids = function()
local header_trace_id = core.request.headers()["x-request-id"]
or ngx_var.request_id
if is_valid_trace_id(header_trace_id) then
return header_trace_id, id_generator.new_span_id()
end
-- fallback to default generator for invalid values (e.g. UUID)
return original_new_ids()
end
end

Copilot uses AI. Check for mistakes.
Comment thread t/plugin/opentelemetry.t
--- response_body
opentracing
--- no_error_log
[error]
Copy link

Copilot AI Feb 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test verifies that an invalid X-Request-Id (UUID format) doesn't cause a crash, but it doesn't verify that the plugin correctly falls back to generating a valid trace ID. Consider adding a check in a follow-up test to verify that the trace appears in the OpenTelemetry collector logs with a valid trace_id, similar to TEST 16 which checks the log output.

Suggested change
[error]
[error]
=== TEST 22: invalid x-request-id should still generate a trace_id
--- request
GET /opentracing
--- more_headers
X-Request-Id: 550e8400-e29b-41d4-a716-446655440000
--- wait: 2
--- grep_error_log eval
qr/trace_id/
--- grep_error_log_out
trace_id

Copilot uses AI. Check for mistakes.
- Add is_valid_trace_id() to validate 32-char hex trace IDs
- Reject all-zero trace IDs per W3C spec
- Prevent multiple id_generator overrides
- Add tests for invalid X-Request-Id fallback behavior

Signed-off-by: Prasun Srivastav <prasunsrivastav123@gmail.com>
@dosubot dosubot Bot added size:L This PR changes 100-499 lines, ignoring generated files. and removed size:M This PR changes 30-99 lines, ignoring generated files. labels Feb 13, 2026
@prasunsrivastav123-lang
Copy link
Copy Markdown
Author

@Baoyuantop Addressed the review feedback:
Added validation for all-zero trace IDs
Prevented multiple id_generator overrides
Added a test to verify valid fallback trace generation
Kindly requesting a re-review. Thanks!

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread apisix/plugins/opentelemetry.lua Outdated
Comment thread apisix/plugins/opentelemetry.lua Outdated
Comment on lines +253 to +268
if plugin_info.trace_id_source == "x-request-id"
and not id_generator_overridden then
id_generator.new_ids = function()

local header_trace_id = core.request.headers()["x-request-id"]
or ngx_var.request_id

if is_valid_trace_id(header_trace_id) then
return header_trace_id, id_generator.new_span_id()
end

-- fallback to default generator for invalid values (e.g. UUID)
return original_new_ids()
end
id_generator_overridden = true
end
Copy link

Copilot AI Feb 14, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The indentation is inconsistent in this block. Lines 253-254 have leading spaces instead of consistent indentation, and line 256 appears to have extra trailing spaces. This should be corrected to match the surrounding code style.

Copilot uses AI. Check for mistakes.
Comment thread apisix/plugins/opentelemetry.lua Outdated
Comment on lines +253 to +268
if plugin_info.trace_id_source == "x-request-id"
and not id_generator_overridden then
id_generator.new_ids = function()

local header_trace_id = core.request.headers()["x-request-id"]
or ngx_var.request_id

if is_valid_trace_id(header_trace_id) then
return header_trace_id, id_generator.new_span_id()
end

-- fallback to default generator for invalid values (e.g. UUID)
return original_new_ids()
end
id_generator_overridden = true
end
Copy link

Copilot AI Feb 14, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The id_generator override is applied globally and permanently once trace_id_source is set to "x-request-id". If the plugin metadata is later updated to set trace_id_source to "random", the override persists and all requests will still go through the x-request-id validation logic before falling back to the original generator. This is inefficient and not the intended behavior. Consider either: 1) checking the current trace_id_source value inside the overridden function and calling original_new_ids directly when it's "random", or 2) implementing a mechanism to restore the original id_generator.new_ids when the metadata changes back to "random".

Copilot uses AI. Check for mistakes.
Comment thread t/plugin/opentelemetry.t Outdated
Comment thread apisix/plugins/opentelemetry.lua Outdated
Comment thread apisix/plugins/opentelemetry.lua Outdated
@prasunsrivastav123-lang
Copy link
Copy Markdown
Author

@Baoyuantop i just Removed the global flag and simplified the id_generator override as suggested.
Could you please recheck? Thanks!

@prasunsrivastav123-lang
Copy link
Copy Markdown
Author

@Baoyuantop can you review this

Baoyuantop
Baoyuantop previously approved these changes Mar 6, 2026
@Baoyuantop
Copy link
Copy Markdown
Contributor

Hi @prasunsrivastav123-lang, please fix failed CI

@prasunsrivastav123-lang
Copy link
Copy Markdown
Author

Hi @Baoyuantop , I’ve fixed the CI issue by reindexing the tests and pushed the update. Could you please take another look? Thanks!

Copy link
Copy Markdown
Member

@moonming moonming left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @prasunsrivastav123-lang, thank you for fixing the x-request-id validation in OpenTelemetry (#12976)!

Validating the x-request-id format before using it as a trace_id is the right approach — invalid formats could cause issues with tracing backends.

The implementation with a fallback to the default generator looks clean. Could you confirm that the tests cover:

  1. Valid 32-character hex trace IDs
  2. Invalid formats (too short, non-hex characters, empty string)
  3. Missing x-request-id header (should use default generator)

If all tests pass, this should be ready. Thank you!

@prasunsrivastav123-lang
Copy link
Copy Markdown
Author

prasunsrivastav123-lang commented Mar 16, 2026

@moonming @Baoyuantop Thanks! Yes, I’ve added tests covering valid 32-character hex IDs, multiple invalid formats (UUID, all-zero, malformed), and the missing header fallback case. All tests are passing now. Please take another look.

Baoyuantop
Baoyuantop previously approved these changes Mar 18, 2026
@Baoyuantop Baoyuantop requested a review from moonming March 18, 2026 07:41
@prasunsrivastav123-lang
Copy link
Copy Markdown
Author

@moonming if there need any other changes?

@Baoyuantop Baoyuantop requested a review from membphis April 1, 2026 09:32
Comment thread apisix/plugins/opentelemetry.lua Outdated
Comment on lines +252 to +262
local _original_new_ids = id_generator.new_ids

id_generator.new_ids = function()
local trace_id = core.request.headers()["x-request-id"] or ngx_var.request_id
return trace_id, id_generator.new_span_id()
local trace_id = core.request.headers()["x-request-id"]
or ngx_var.request_id

if is_valid_trace_id(trace_id) then
return trace_id, id_generator.new_span_id()
end

return _original_new_ids()
Copy link
Copy Markdown
Contributor

@AlinsRan AlinsRan Apr 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The lrucache has a 24-hour TTL After expiry, create_tracer_obj is called again. At that point id_generator.new_ids is already the closure from the previous call, so _original_new_ids captures the already-overridden function. Each 24h expiry adds another wrapper layer.

The pre-PR code was idempotent (it overwrote the same closure). This PR makes it strictly worse.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right — the current implementation wraps id_generator.new_ids multiple times when the tracer is recreated, which breaks idempotency and can lead to stacked wrappers over time.

My intention was to validate the x-request-id and fallback safely, but I see that the current approach needs adjustment to avoid repeated wrapping.

I’ll update the implementation to ensure the override happens only once (idempotent behavior) and push a fix shortly

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @prasunsrivastav123-lang, has this comment been resolved?

@Baoyuantop
Copy link
Copy Markdown
Contributor

Hi @prasunsrivastav123-lang, there's a conflict in a code file. Could you help me resolve it?

@prasunsrivastav123-lang
Copy link
Copy Markdown
Author

sure 😃

@prasunsrivastav123-lang
Copy link
Copy Markdown
Author

@Baoyuantop is it ok now ?

@Baoyuantop
Copy link
Copy Markdown
Contributor

Hi @prasunsrivastav123-lang, please fix the failed CI

@prasunsrivastav123-lang
Copy link
Copy Markdown
Author

@Baoyuantop Updated based on feedback. Added full validation and test coverage.

- Fix mixed indentation (0/3/4/8 spaces) in is_valid_trace_id() and
  create_tracer_obj() to use consistent 4-space indentation
- Restore master's original TEST 20-22 (additional_attributes with
  numeric nginx variables) that were corrupted during merge
- Renumber PR's new test cases as TEST 23-31 to avoid duplicate
  test numbers

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@Baoyuantop
Copy link
Copy Markdown
Contributor

Hi @prasunsrivastav123-lang, there were some formatting issues when the code was merged; I’ve corrected them.

Replace direct access to Lua global 'string' with a local alias
'string_lower' to satisfy APISIX lint rules.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@prasunsrivastav123-lang
Copy link
Copy Markdown
Author

@Baoyuantop Thanks for fixing the formatting!

I’ve updated the implementation to ensure trace_id validation is handled cleanly:

  • Only valid 32-character hex trace IDs are used
  • Invalid formats (non-hex, incorrect length, all-zero) fall back to the default generator
  • Missing x-request-id also correctly falls back

Test cases have been added to cover all these scenarios.

Please let me know if any further changes are needed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

awaiting review bug Something isn't working size:L This PR changes 100-499 lines, ignoring generated files.

Projects

Status: 👀 In review

Development

Successfully merging this pull request may close these issues.

bug: opentelemetry trace_id_source=x-request-id crashes with request-id plugin

5 participants