Add safe extraction for malformed base64 kafka headers#11472

Open: amarziali wants to merge 1 commit into andrea.marziali/propagators-transform from andrea.marziali/kafkaheaders

@amarziali
Contributor

What does this do

When DD_KAFKA_CLIENT_BASE64_DECODING_ENABLED=true, the Kafka producer instrumentation calls extractContextAndGetSpanContext on the outgoing ProducerRecord headers. If any header value is not valid Base64 (e.g. a header produced by a non-DD service, a URL-safe encoded value, or a plain string), Base64.getDecoder().decode() throws IllegalArgumentException. That exception escapes @OnMethodEnter, where ByteBuddy's suppress = Throwable.class silently swallows it, so the advice returns a null AgentScope. The null scope then causes an NPE in BaseDecorator.beforeFinish at scope.context().
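
To illustrate the failure mode described above, here is a minimal standalone sketch using only the JDK. The class name StrictBase64Demo and the sample values are hypothetical and not taken from the PR; the sketch only shows that the basic decoder rejects URL-safe and plain-text input.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical demo class, not part of dd-trace-java.
class StrictBase64Demo {
  /** Returns true if the JDK's basic Base64 decoder accepts the value. */
  static boolean decodes(byte[] value) {
    try {
      Base64.getDecoder().decode(value);
      return true;
    } catch (IllegalArgumentException e) {
      // Thrown for any byte outside the basic Base64 alphabet.
      return false;
    }
  }

  public static void main(String[] args) {
    byte[] valid = Base64.getEncoder().encode("12345".getBytes(StandardCharsets.UTF_8));
    // URL-safe encoding of these bytes contains '-' and '_', which the basic decoder rejects.
    byte[] urlSafe = Base64.getUrlEncoder().encode(new byte[] {(byte) 0xfb, (byte) 0xff});
    byte[] plain = "not base64!".getBytes(StandardCharsets.UTF_8);

    System.out.println(decodes(valid));   // prints "true"
    System.out.println(decodes(urlSafe)); // prints "false"
    System.out.println(decodes(plain));   // prints "false"
  }
}
```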

The fix moves the Base64 decode into a safe Function<byte[], String> (Functions.base64Decode) that catches any Exception and returns null. TextMapExtractAdapter.forEachKey checks for null, logs the failing header key with EXCLUDE_TELEMETRY (since the failure is not actionable), and skips that header. Valid headers continue to be processed normally.
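
The decode-and-skip behaviour might look roughly like the sketch below. This is an assumption-laden illustration, not the PR's code: SafeHeaderDecoding, BASE64_DECODE, and decodeHeaders are hypothetical names, a plain Map stands in for the Kafka Headers type, and a list of skipped keys stands in for the EXCLUDE_TELEMETRY log call.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch; the real Functions.base64Decode and forEachKey may differ.
class SafeHeaderDecoding {
  /** Never throws: returns the decoded string, or null if the value is not valid Base64. */
  static final Function<byte[], String> BASE64_DECODE =
      value -> {
        try {
          return new String(Base64.getDecoder().decode(value), StandardCharsets.UTF_8);
        } catch (Exception e) {
          return null;
        }
      };

  /** Decodes every header, recording (instead of logging) the keys that had to be skipped. */
  static Map<String, String> decodeHeaders(Map<String, byte[]> headers, List<String> skipped) {
    Map<String, String> decoded = new LinkedHashMap<>();
    for (Map.Entry<String, byte[]> header : headers.entrySet()) {
      String value = BASE64_DECODE.apply(header.getValue());
      if (value == null) {
        // Malformed header: note the key and move on rather than failing the whole extraction.
        skipped.add(header.getKey());
        continue;
      }
      decoded.put(header.getKey(), value);
    }
    return decoded;
  }
}
```

With this shape, a record carrying one valid header and one plain-string header yields a single decoded entry and one skipped key, and no exception reaches the caller.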

Depends on #11466

Motivation

Additional Notes

Contributor Checklist

  • Format the title according to the contribution guidelines
  • Assign the type: and (comp: or inst:) labels in addition to any other useful labels
  • Avoid using close, fix, or any linking keywords when referencing an issue
    Use solves instead, and assign the PR milestone to the issue
  • Update the CODEOWNERS file on source file addition, migration, or deletion
  • Update public documentation with any new configuration flags or behaviors
  • Add your completed PR to the merge queue by commenting /merge. You can also:
    • Customize the commit message associated with the merge with /merge --commit-message "..."
    • Remove your PR from the merge queue with /merge -c
    • Skip all merge queue checks with /merge -f --reason "reason"; please use this judiciously, as some checks do not run at the PR-level
    • Get more information in this doc

Jira ticket: [PROJ-IDENT]

@amarziali amarziali requested review from a team as code owners May 27, 2026 15:45
@amarziali amarziali added the type: bug Bug report and fix label May 27, 2026
@amarziali amarziali requested a review from a team as a code owner May 27, 2026 15:45
@amarziali amarziali added inst: kafka Kafka instrumentation tag: telemetry error reported Reported by error telemetry labels May 27, 2026
@amarziali amarziali requested review from mcculls and ygree and removed request for a team May 27, 2026 15:45

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 1435039fe3

@amarziali amarziali force-pushed the andrea.marziali/kafkaheaders branch from 1435039 to e90e93e Compare May 27, 2026 15:54

@dd-octo-sts
Contributor

dd-octo-sts Bot commented May 27, 2026

🟢 Java Benchmark SLOs — All performance SLOs passed

| Suite | Status |
| --- | --- |
| Startup | 🟢 pass |

SLO thresholds are defined here based on automatically generated metrics. A warning is raised when results are within 5% of the threshold.

PR vs. master results

Startup Time

| Scenario | This PR | master | Change |
| --- | --- | --- | --- |
| insecure-bank / iast | 14,025 ms | 14,029 ms | -0.0% |
| insecure-bank / tracing | 12,919 ms | 12,972 ms | -0.4% |
| petclinic / appsec | 17,137 ms | 16,984 ms | +0.9% |
| petclinic / iast | 17,013 ms | 17,125 ms | -0.6% |
| petclinic / profiling | 17,075 ms | 16,970 ms | +0.6% |
| petclinic / tracing | 15,486 ms | 16,500 ms | -6.1% |

Commit: e90e93e6 · CI Pipeline · Benchmarking Platform UI


Load and DaCapo benchmarks can be triggered manually in the GitLab pipeline. Results will appear in the Benchmarking Platform UI after completion.

@pr-commenter

pr-commenter Bot commented May 27, 2026

Kafka / producer-benchmark

Parameters

| Parameter | Baseline | Candidate |
| --- | --- | --- |
| baseline_or_candidate | baseline | candidate |
| git_branch | andrea.marziali/propagators-transform | andrea.marziali/kafkaheaders |
| git_commit_date | 1779875100 | 1779897233 |
| git_commit_sha | 94b73e5 | e90e93e |

Matching parameters

| Parameter | Baseline | Candidate |
| --- | --- | --- |
| ci_job_date | 1779898363 | 1779898363 |
| ci_job_id | 1717035362 | 1717035362 |
| ci_pipeline_id | 115430674 | 115430674 |
| cpu_model | Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz | Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz |
| jdkVersion | 11.0.25 | 11.0.25 |
| jmhVersion | 1.36 | 1.36 |
| jvm | /usr/lib/jvm/java-11-openjdk-amd64/bin/java | /usr/lib/jvm/java-11-openjdk-amd64/bin/java |
| jvmArgs | -Dfile.encoding=UTF-8 -Djava.io.tmpdir=/go/src/github.com/DataDog/apm-reliability/dd-trace-java/platform/src/producer-benchmark/build/tmp/jmh -Duser.country=US -Duser.language=en -Duser.variant | -Dfile.encoding=UTF-8 -Djava.io.tmpdir=/go/src/github.com/DataDog/apm-reliability/dd-trace-java/platform/src/producer-benchmark/build/tmp/jmh -Duser.country=US -Duser.language=en -Duser.variant |
| vmName | OpenJDK 64-Bit Server VM | OpenJDK 64-Bit Server VM |
| vmVersion | 11.0.25+9-post-Ubuntu-1ubuntu122.04 | 11.0.25+9-post-Ubuntu-1ubuntu122.04 |

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 3 metrics, 0 unstable metrics.

Unchanged results

| scenario | Δ mean throughput |
| --- | --- |
| scenario:not-instrumented/KafkaProduceBenchmark.benchProduce | same |
| scenario:only-tracing-dsm-disabled-benchmarks/KafkaProduceBenchmark.benchProduce | same |
| scenario:only-tracing-dsm-enabled-benchmarks/KafkaProduceBenchmark.benchProduce | same |

@pr-commenter

pr-commenter Bot commented May 27, 2026

Kafka / consumer-benchmark

Parameters

| Parameter | Baseline | Candidate |
| --- | --- | --- |
| baseline_or_candidate | baseline | candidate |
| git_branch | andrea.marziali/propagators-transform | andrea.marziali/kafkaheaders |
| git_commit_date | 1779875100 | 1779897233 |
| git_commit_sha | 94b73e5 | e90e93e |

Matching parameters

| Parameter | Baseline | Candidate |
| --- | --- | --- |
| ci_job_date | 1779898402 | 1779898402 |
| ci_job_id | 1717035364 | 1717035364 |
| ci_pipeline_id | 115430674 | 115430674 |
| cpu_model | Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz | Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz |
| jdkVersion | 11.0.25 | 11.0.25 |
| jmhVersion | 1.36 | 1.36 |
| jvm | /usr/lib/jvm/java-11-openjdk-amd64/bin/java | /usr/lib/jvm/java-11-openjdk-amd64/bin/java |
| jvmArgs | -Dfile.encoding=UTF-8 -Djava.io.tmpdir=/go/src/github.com/DataDog/apm-reliability/dd-trace-java/platform/src/consumer-benchmark/build/tmp/jmh -Duser.country=US -Duser.language=en -Duser.variant | -Dfile.encoding=UTF-8 -Djava.io.tmpdir=/go/src/github.com/DataDog/apm-reliability/dd-trace-java/platform/src/consumer-benchmark/build/tmp/jmh -Duser.country=US -Duser.language=en -Duser.variant |
| vmName | OpenJDK 64-Bit Server VM | OpenJDK 64-Bit Server VM |
| vmVersion | 11.0.25+9-post-Ubuntu-1ubuntu122.04 | 11.0.25+9-post-Ubuntu-1ubuntu122.04 |

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 3 metrics, 0 unstable metrics.

Unchanged results

| scenario | Δ mean throughput |
| --- | --- |
| scenario:not-instrumented/KafkaConsumerBenchmark.benchConsume | same |
| scenario:only-tracing-dsm-disabled-benchmarks/KafkaConsumerBenchmark.benchConsume | same |
| scenario:only-tracing-dsm-enabled-benchmarks/KafkaConsumerBenchmark.benchConsume | unsure: [+348.303op/s; +8513.285op/s] or [+0.184%; +4.495%] |
