[SLES-2666] handle_end_invocation race condition #1024
pablomartinezbernardo wants to merge 3 commits into main from
Conversation
```rust
            Ok(r) => r,
            Err(e) => {
                error!("Failed to extract request body: {e}");
                return (StatusCode::OK, json!({}).to_string()).into_response();
```
handle_end_invocation returns StatusCode::OK on error but handle_start_invocation returns StatusCode::BAD_REQUEST (line 129). Should these be consistent?
If this was intentional, consider leaving a comment.
This was intentional, in the sense that it is the behavior that does not introduce regressions: handle_end_invocation was never able to return anything other than 200, and consumers of this endpoint may not be expecting something other than 200. But yeah, a comment explaining why (or better yet, making consumers aware that this may not return 200) is a good idea.
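For reference, a minimal sketch of the two error responses side by side; the helper names and JSON bodies here are illustrative, not the repo's actual code:

```rust
use axum::{
    http::StatusCode,
    response::{IntoResponse, Response},
};
use serde_json::json;

// handle_start_invocation rejects a malformed request outright...
fn start_invocation_error_response() -> Response {
    (StatusCode::BAD_REQUEST, json!({}).to_string()).into_response()
}

// ...while handle_end_invocation deliberately keeps answering 200: this endpoint
// has historically never returned anything else, so existing consumers may not
// expect (or handle) a non-200 status.
fn end_invocation_error_response() -> Response {
    (StatusCode::OK, json!({}).to_string()).into_response()
}
```

Either way, documenting the choice in the handler seems like the low-risk option.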
```rust
    State((invocation_processor_handle, _, tasks)): State<ListenerState>,
    request: Request,
) -> Response {
    let (parts, body) = match extract_request_body(request).await {
```
Consider leaving a comment to explain why this should not be async (so somebody doesn't come and try to "optimize" it again later).
Something like:
extract_request_body must complete BEFORE returning 200 OK
to avoid a race condition. See SLES-2666.
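For context, a sketch of the handler shape this suggestion points at, assuming axum 0.7; `extract_request_body` is a simplified stand-in here and the real handler also takes the `State<ListenerState>` shown above:

```rust
use axum::{
    body::{to_bytes, Body, Bytes},
    http::{request::Parts, Request, StatusCode},
    response::{IntoResponse, Response},
};
use serde_json::json;
use tracing::error;

// Simplified stand-in for the repo's extractor.
async fn extract_request_body(request: Request<Body>) -> Result<(Parts, Bytes), String> {
    let (parts, body) = request.into_parts();
    let bytes = to_bytes(body, usize::MAX).await.map_err(|e| e.to_string())?;
    Ok((parts, bytes))
}

async fn handle_end_invocation(request: Request<Body>) -> Response {
    // extract_request_body must complete BEFORE returning 200 OK
    // to avoid a race condition. See SLES-2666.
    let (_parts, _body) = match extract_request_body(request).await {
        Ok(r) => r,
        Err(e) => {
            error!("Failed to extract request body: {e}");
            return (StatusCode::OK, json!({}).to_string()).into_response();
        }
    };
    // ... the body is processed and universal_instrumentation_end is sent here,
    // before the 200 below hands control back to the tracer ...
    (StatusCode::OK, json!({}).to_string()).into_response()
}
```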
Updating branch and creating an RC from it
Overview
- `handle_end_invocation` offloads the body to an `anonymousTask`, then immediately returns 200 so the tracer can continue
- the `anonymousTask` is busy with a complex body in `extract_request_body` for the time being
- `PlatformRuntimeDone` is processed (`initialization_type: SnapStart`), then `PlatformRuntimeDone` tries to pair `_platform_runtime_done_event`, which is `None` because the `anonymousTask` is still busy with the body
- the `anonymousTask` finally completes, but that's irrelevant because `send_ctx_spans` is only run on `PlatformRuntimeDone`, which assumes `universal_instrumentation_end` has already been sent (a toy reproduction is sketched right after this list)
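To make the sequence concrete, here is a toy tokio reproduction of the ordering bug; all names are illustrative stand-ins, not the extension's actual code, and the "complex body" is simulated with a sleep:

```rust
use std::sync::Arc;
use tokio::sync::Mutex;
use tokio::time::{sleep, Duration};

#[tokio::main]
async fn main() {
    // Stand-in for "universal_instrumentation_end has been sent for this invocation".
    let end_invocation_seen: Arc<Mutex<Option<String>>> = Arc::new(Mutex::new(None));

    // handle_end_invocation (pre-fix shape): offload the slow body read to an
    // anonymous task and answer 200 right away.
    let state = end_invocation_seen.clone();
    tokio::spawn(async move {
        sleep(Duration::from_millis(50)).await; // complex body in extract_request_body
        *state.lock().await = Some("universal_instrumentation_end".to_string());
    });
    println!("200 OK returned to the tracer immediately");

    // The runtime finishes quickly, so PlatformRuntimeDone is processed next.
    sleep(Duration::from_millis(5)).await;
    if end_invocation_seen.lock().await.is_some() {
        println!("PlatformRuntimeDone: paired with the end invocation, spans can be flushed");
    } else {
        println!("PlatformRuntimeDone: nothing to pair with (None) -- the spans are not sent");
    }
    // main (the "lambda") can even exit here before the spawned task ever finishes,
    // mirroring the customer run described below.
}
```

Awaiting the body read before replying, as this PR does, removes the window between the 200 and the state update.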
Why this looks likely

In the customer's logs we can see:
- `05:11:48.463` datadog.trace.agent.core.DDSpan - Finished span (WRITTEN): DDSpan [ t_id=2742542901019652192
- `05:11:48.489` PlatformRuntimeDone received
- `05:11:48.630` REPORT RequestId 1db22159-7200-43c8-bec1-11b89df4f099 (last log emitted in an execution)
- `05:11:53.784` START RequestId: 8c801767-e21b-43f7-bd11-078bb64bc430 (new request id, 5s later)
- `05:11:53.789` Received end invocation request from headers: {"x-datadog-trace-id": "2742542901019652192"... -> we are now trying to finish the span after the request is long gone 🙃

In this specific run, the lambda even had time to stop before continuing with the anonymous task from `handle_end_invocation`.

Performance
This PR makes the reading of the body synchronous with the response. This will delay handing over execution to outside the extension until the body is read. But that is irrelevant because it is a requirement to read the body and send `universal_instrumentation_end` before relinquishing control.

Testing
Suggestions very welcome
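If it helps, one possible shape for a regression test over the ordering guarantee; everything here (route, flag, body) is a self-contained stand-in rather than the repo's actual handler or test helpers, and it assumes axum 0.7 plus tower and tokio as dev-dependencies:

```rust
use std::sync::{
    atomic::{AtomicBool, Ordering},
    Arc,
};

use axum::{
    body::{to_bytes, Body},
    http::{Request, StatusCode},
    routing::post,
    Router,
};
use tower::ServiceExt; // for `oneshot`

#[tokio::test]
async fn body_is_read_before_200_is_returned() {
    // Stand-in for "universal_instrumentation_end was sent".
    let end_sent = Arc::new(AtomicBool::new(false));
    let flag = end_sent.clone();

    let app = Router::new().route(
        "/lambda/end-invocation",
        post(move |request: Request<Body>| {
            let flag = flag.clone();
            async move {
                // Equivalent of extract_request_body + sending the end event.
                let _body = to_bytes(request.into_body(), usize::MAX).await.unwrap();
                flag.store(true, Ordering::SeqCst);
                StatusCode::OK
            }
        }),
    );

    let response = app
        .oneshot(
            Request::post("/lambda/end-invocation")
                .body(Body::from(r#"{"big":"payload"}"#))
                .unwrap(),
        )
        .await
        .unwrap();

    assert_eq!(response.status(), StatusCode::OK);
    // The handler awaited the body before responding, so by the time the
    // response future resolves the end event has been recorded.
    assert!(end_sent.load(Ordering::SeqCst));
}
```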