
[react-server][dev] Clean up pendingOperations on request close to prevent dev memory leak #35801

Open
alubbe wants to merge 3 commits into facebook:main from alubbe:dev-mem-leak

Conversation

@alubbe

@alubbe alubbe commented Feb 16, 2026

Summary

This PR fixes unbounded growth of async debug tracking state in RSC dev mode by making cleanup deterministic and owner-aware, while keeping the debugging experience much closer to the original behavior.

Context

ReactFlightServerConfigDebugNode stores async lineage in a module-level pendingOperations: Map<asyncId, AsyncSequence>.
Historically, entries were mostly removed by async_hooks.destroy, which is GC-timed and non-deterministic. In long-lived dev sessions, this can retain large amounts of async debug state and drive memory growth.
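
For orientation, a minimal sketch of that kind of async_hooks-based bookkeeping (simplified names and shapes, not the actual React source):

```ts
import { createHook } from 'node:async_hooks';

// Simplified stand-in for React's internal AsyncSequence shape (hypothetical).
type AsyncSequence = { asyncId: number; trigger: AsyncSequence | null };

// Module-level map: an entry accumulates for every tracked async resource.
const pendingOperations: Map<number, AsyncSequence> = new Map();

createHook({
  init(asyncId, _type, triggerAsyncId) {
    // Record lineage so later reads can walk back to the originating I/O.
    pendingOperations.set(asyncId, {
      asyncId,
      trigger: pendingOperations.get(triggerAsyncId) ?? null,
    });
  },
  destroy(asyncId) {
    // For promise-backed resources this fires only after garbage collection,
    // which is why removal is GC-timed and non-deterministic.
    pendingOperations.delete(asyncId);
  },
}).enable();
```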

In the application we're developing, we see multiple GB of memory usage after 10-15 minutes. A typical heap dump comparison looks like this:
[screenshot: heap dump comparison]

Goals

  • Preserve async debug diagnostics for active renders, including cross-request/triggered async chains.
  • Deterministically release request-owned debug state at request end.
  • Avoid read-path rewrites and keep the fix focused on tracking/cleanup lifecycle.

Design

  • Add ownership tracking for async IDs (a simplified sketch follows this list):
    • requestAsyncIds: WeakMap<Request, Set<number>> (request -> async IDs)
    • asyncIdToRequests: Map<number, Set<Request>> (async ID -> owners)
    • cleanedRequests: WeakSet<Request> (prevents re-tracking after cleanup)
  • Keep pendingOperations globally tracked, then attach ownership:
    • If resolveRequest() exists: add request ownership.
    • If no request context: inherit owners from triggerAsyncId when applicable.
  • Guard init from creating new tracking nodes for requests already CLOSING/CLOSED or already cleaned.
  • Make cleanup owner-aware:
    • cleanupAsyncDebugInfo(request) marks request as cleaned, clears lastRanAwait, removes request ownership, and deletes a pendingOperations entry only when the last owner is removed.
  • Keep ownership maps in sync when async entries are removed:
    • In destroy(asyncId)
    • In markAsyncSequenceRootTask()
  • Call cleanup on terminal request paths in ReactFlightServer:
    • fatalError
    • flushCompletedChunks completion/close branches (including “main closed, debug stream still open”)
    • startFlowing / startFlowingDebug when CLOSING -> CLOSED
    • finishHalt, finishAbort, and abort fast-path with no abortable tasks
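
A simplified sketch of this ownership bookkeeping (hypothetical shapes; the real Request type and pendingOperations live in ReactFlightServer / ReactFlightServerConfigDebugNode):

```ts
type Request = object; // stand-in for React's Request
type AsyncSequence = { asyncId: number };

const pendingOperations: Map<number, AsyncSequence> = new Map();
const requestAsyncIds: WeakMap<Request, Set<number>> = new WeakMap();
const asyncIdToRequests: Map<number, Set<Request>> = new Map();
const cleanedRequests: WeakSet<Request> = new WeakSet();

// Attach ownership when a tracked async ID can be associated with a request.
function addOwner(request: Request, asyncId: number): void {
  if (cleanedRequests.has(request)) return; // request already cleaned up
  let ids = requestAsyncIds.get(request);
  if (ids === undefined) requestAsyncIds.set(request, (ids = new Set()));
  ids.add(asyncId);
  let owners = asyncIdToRequests.get(asyncId);
  if (owners === undefined) asyncIdToRequests.set(asyncId, (owners = new Set()));
  owners.add(request);
}

// Called on terminal request paths (fatalError, close, abort, halt, ...).
function cleanupAsyncDebugInfo(request: Request): void {
  if (cleanedRequests.has(request)) return; // idempotent
  cleanedRequests.add(request);
  const ids = requestAsyncIds.get(request);
  if (ids === undefined) return;
  for (const asyncId of ids) {
    const owners = asyncIdToRequests.get(asyncId);
    if (owners === undefined) continue;
    owners.delete(request);
    if (owners.size === 0) {
      // Only drop the entry once the last owning request is gone, so
      // overlapping requests keep their shared async lineage.
      asyncIdToRequests.delete(asyncId);
      pendingOperations.delete(asyncId);
    }
  }
  requestAsyncIds.delete(request);
}

// Mirror of removal in destroy()/markAsyncSequenceRootTask(): keep the
// ownership maps in sync when the global entry goes away.
function untrack(asyncId: number): void {
  pendingOperations.delete(asyncId);
  const owners = asyncIdToRequests.get(asyncId);
  if (owners !== undefined) {
    for (const request of owners) requestAsyncIds.get(request)?.delete(asyncId);
    asyncIdToRequests.delete(asyncId);
  }
}
```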

Behavior and Trade-offs

  • Request-end cleanup is deterministic and idempotent.
  • Shared async lineage across overlapping requests is preserved until the final owner closes.
  • Main stream is now marked CLOSED immediately when main output is done but debug output remains open.
  • Trade-off: bookkeeping is more complex (ownership graph), but bounded and explicit; truly unowned external async origins still follow baseline best-effort behavior.

Why this approach

  • Targets write-side tracking and lifecycle cleanup only.
  • Preserves existing read paths (pendingOperations.get(...) consumers).
  • Eliminates reliance on GC timing for correctness.

@meta-cla

meta-cla bot commented Feb 16, 2026

Hi @alubbe!

Thank you for your pull request and welcome to our community.

Action Required

In order to merge any pull request (code, docs, etc.), we require contributors to sign our Contributor License Agreement, and we don't seem to have one on file for you.

Process

In order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (e.g. your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA.

Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged with CLA signed. The tagging process may take up to 1 hour after signing. Please give it that time before contacting us about it.

If you have received this in error or have any questions, please contact us at cla@meta.com. Thanks!

@alubbe
Author

alubbe commented Feb 18, 2026

Dear React team, is there anything else you require on this PR? I have also created a minimal reproduction (using Next.js) that you can use to verify the leak and the fix: https://github.com/alubbe/nextjs-dev-pending-operations-repro

@eps1lon
Collaborator

eps1lon commented Feb 18, 2026

Thank you for opening this. We discussed this PR internally. We believe this fix is masking another memory leak. A heap snapshot only shows reachable objects, i.e. it excludes objects that are garbage-collectable. If you're still seeing objects in pendingOperations in the heap snapshot, they'd point at async debug info that's not garbage-collectable. So the problem isn't relying on non-deterministic GC behavior for correctness, but GC not applying at all.
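
A minimal illustration of that reachability point (hypothetical names, assuming Node is started with --expose-gc):

```ts
// Anything visible in a heap snapshot after a forced GC is still strongly
// reachable; an entry removed from its retaining Map becomes collectable.
declare const gc: () => void; // provided by the --expose-gc flag

const retaining = new Map<number, string>();
retaining.set(1, 'x'.repeat(1_000_000));

gc(); // a snapshot taken now shows the 1 MB string, reachable via `retaining`

retaining.delete(1);
gc(); // a snapshot taken now no longer contains it: it was collectable
```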

What would help us is a repro for the memory leak. There's a possibility that we're referencing objects in the pending operations in a circular fashion that may prevent garbage collection. It's important to fix that root cause first.
Just saw the comment. I'll check out the repro.

The reason we're not merging this as-is is that we're not sure clearing pendingOperations after the request is the right timing. Frameworks may handle multiple requests with I/O that's shared between those requests, so a blunt clearing after one request may cause relevant I/O of another request to get dropped.
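
A minimal sketch of that shared-I/O pattern, with hypothetical names — one cached promise awaited by two overlapping requests:

```ts
// Hypothetical stand-ins for framework-level I/O and rendering.
async function fetchConfigOnce(): Promise<string> {
  return 'config';
}
function render(config: string): string {
  return `<html data-config="${config}"></html>`;
}

// One promise, created once, shared by two overlapping requests.
const configPromise: Promise<string> = fetchConfigOnce();

async function renderRequestA(): Promise<string> {
  // Request A awaits the shared fetch; its async lineage points at that fetch.
  return render(await configPromise);
}

async function renderRequestB(): Promise<string> {
  // Request B awaits the very same promise, so it shares that async origin.
  return render(await configPromise);
}

// If cleanup after request A bluntly deleted the debug entry for the shared
// fetch, request B would lose the async debug info for its own await.
void Promise.all([renderRequestA(), renderRequestB()]);
```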

@eps1lon
Collaborator

eps1lon commented Feb 18, 2026

@alubbe Can you clean up the repro to not include any manual memory tracking? I can use a memory profiler to validate. I want to make sure that the memory tracking isn't actually causing a leak.

@alubbe
Author

alubbe commented Feb 18, 2026

Thanks for the update; I had no idea it had been seen. I'm on my phone right now, but I can share two things:

  1. I'm using 3 forced GC cycles before recording the heap or dumping it to ensure it's as accurate as possible (sketched below)
  2. I also started with the profiler, but my approach automates it, giving me a chance to quickly compare different approaches/patches, or even have an LLM iterate on other potential approaches

But when I get back I will clean up the repro as you suggested; I just wanted to reply ASAP.
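
For reference, the forced-GC measurement mentioned in point 1 looks roughly like this (a sketch assuming Node is started with --expose-gc; writeHeapSnapshot is from node:v8):

```ts
import { writeHeapSnapshot } from 'node:v8';

declare const gc: () => void; // provided by the --expose-gc flag

function dumpHeap(label: string): string {
  // Run a few GC cycles first so the snapshot contains as little
  // collectable garbage as possible, then write it to disk.
  for (let i = 0; i < 3; i++) gc();
  return writeHeapSnapshot(`${label}.heapsnapshot`);
}
```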

@alubbe
Author

alubbe commented Feb 20, 2026

I've spent a few days trying to reproduce what we're seeing in our own app, and just couldn't. Here's what devtools shows: this async debug state just doesn't seem to get cleared up, but I don't know how to debug it further. Any tips?
[screenshot: devtools heap snapshot]

In trying to reproduce the above in https://github.com/alubbe/nextjs-dev-pending-operations-repro, I did, however, run into a few potential memory leaks. I say potential because they may be known/intended, or my repro code may be what leaks. Either way, I wanted to share them (tested with Node 24.13.1 and pnpm 10.29.3):

Potential memory leak 1

Here it seems as if memory usage grows just from calling the same action over and over, adding a lot of strings and compiled code to the heap. This may be intended for a dev server, though I don't understand why.

To reproduce: Go to branch mem-leak-1, run pnpm dev somewhere, connect dev tools, run pnpm repro:server-action once, wait a bit, force GC/dump heap, then run pnpm repro:server-action a few times, wait a bit, force GC/dump heap and you should see this:
[screenshot: heap comparison after repeated server actions]

Potential memory leak 2

Here we get a lot of retained async debug state, and it also doesn't get freed up by waiting or forcing GC; but when we call another action, the old state does get cleared up, new state gets created, and now that hangs around.

To reproduce: Go to branch mem-leak-2, run pnpm dev somewhere, connect dev tools, run pnpm repro:server-action once, wait a bit, force GC/dump heap (you will see around 60MB of non-GCable stuff), then run pnpm repro:server-action a few times, wait a bit, force GC/dump heap and you should see this (the old non-GCable stuff has disappeared, but 60MB of new stuff is now there):
[screenshot: heap comparison]

Potential memory leak 3

Here we seem to accumulate async debug state over time that doesn't get reclaimed. It might be the benchmark/repro code, but I can't tell.

To reproduce: Go to branch mem-leak-3, run pnpm dev somewhere, connect dev tools, run pnpm repro:server-action once, wait a bit, force GC/dump heap, then run pnpm repro:server-action a few times, wait a bit, force GC/dump heap and you should see a couple of additional MB of RAM usage - the more often you run pnpm repro:server-action, the more this grows:
[screenshot: heap comparison]

Conclusion

Please let me know what the best way of proceeding is.

  • Should I open three new issues for the above, or leave it here? Do you need any more info?
  • Regarding the issue we're facing in our application, is there any way of debugging things further in devtools to find out why the async debug state hangs around forever?
  • I can offer to find time to pair with anyone knowledgeable to explore this together, or to run commands / enhance React's own logging / etc.

@alubbe
Author

alubbe commented Feb 20, 2026

I forgot to mention: all 3 branches (and the issue in our application) were also tested in production mode, i.e. replacing pnpm dev with pnpm build && NODE_OPTIONS='--inspect' pnpm start, and there was no memory growth whatsoever in any of those cases.

