-
-
Notifications
You must be signed in to change notification settings - Fork 0
Add reproduction for sentry-javascript#19367 #21
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Open
chargome
wants to merge
2
commits into
main
Choose a base branch
from
repro/sentry-javascript-19367
base: main
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Open
Changes from all commits
Commits
Show all changes
2 commits
Select commit
Hold shift + click to select a range
File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -7,6 +7,8 @@ node_modules/ | |
| dist/ | ||
| out/ | ||
| build/ | ||
| .next/ | ||
| next-env.d.ts | ||
| *.tsbuildinfo | ||
|
|
||
| # Environment files | ||
|
|
||
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,227 @@ | ||
| # Reproduction for sentry-javascript#19367 | ||
|
|
||
| **Issue:** https://github.com/getsentry/sentry-javascript/issues/19367 | ||
|
|
||
| ## Description | ||
|
|
||
| Next.js 16 Turbopack bundles `@opentelemetry/api` into **separate chunks** for the | ||
| instrumentation hook (Sentry SDK init) and each route handler. This creates multiple | ||
| module instances that share the global context manager via `Symbol.for("opentelemetry.js.api.1")`, | ||
| but have separate `ContextAPI` singletons and module closures. | ||
|
|
||
| In `@sentry/nextjs` 10.38.0, `_startSpan()` nests 3 `context.with()` calls: | ||
| ``` | ||
| context.with(suppressedCtx) <- copy A's ContextAPI | ||
| -> tracer.startActiveSpan() | ||
| -> api.context.with(ctxWithSpan) <- copy B's ContextAPI | ||
| -> context.with(activeCtx) <- copy A's ContextAPI | ||
| -> callback | ||
| ``` | ||
|
|
||
| With duplicated module instances, the `SentryContextManager`'s scope cloning and | ||
| re-entry through the async context strategy can spiral into infinite recursion, | ||
| causing `RangeError: Maximum call stack size exceeded`. | ||
|
|
||
| This did not happen with `@sentry/nextjs` 10.8.0 because `startSpan()` only had | ||
| **1** `context.with()` call (inside `tracer.startActiveSpan()`), keeping the | ||
| re-entry depth bounded. | ||
|
|
||
| ## Steps to Reproduce | ||
|
|
||
| 1. Install dependencies: | ||
| ```bash | ||
| npm install | ||
| ``` | ||
|
|
||
| 2. Build with Turbopack (the default for Next.js 16): | ||
| ```bash | ||
| npm run build | ||
| ``` | ||
|
|
||
| 3. **Verify the duplication in build output:** | ||
| ```bash | ||
| npm run check-otel-dedup | ||
| ``` | ||
| This analyzes which chunks each route and the instrumentation hook load, | ||
| confirming they get **different** copies of `@opentelemetry/api`. | ||
|
|
||
| 4. Start the production server: | ||
| ```bash | ||
| npm start | ||
| ``` | ||
|
|
||
| 5. Test the diagnostic endpoint: | ||
| ```bash | ||
| curl http://localhost:3000/api/otel-check | python3 -m json.tool | ||
| ``` | ||
|
|
||
| 6. Test the realistic route with nested Sentry spans: | ||
| ```bash | ||
| curl http://localhost:3000/api/test | python3 -m json.tool | ||
| ``` | ||
|
|
||
| 7. Load test to attempt triggering the crash (intermittent): | ||
| ```bash | ||
| for i in $(seq 1 500); do curl -s http://localhost:3000/api/test > /dev/null & done; wait | ||
| ``` | ||
|
|
||
| ## What the check script shows | ||
|
|
||
| `npm run check-otel-dedup` maps the exact chunk loading for each entry point: | ||
|
|
||
| ``` | ||
| === @opentelemetry/api duplication analysis === | ||
|
|
||
| Instrumentation OTel chunks: | ||
| [root-of-the-server]__963c8309._.js (8KB) -- Symbol.for: true, ContextAPI: false | ||
| _0066dbbb._.js (4684KB) -- Symbol.for: false, ContextAPI: true | ||
|
|
||
| /api/test OTel chunks: | ||
| [root-of-the-server]__3f199e61._.js (9KB) -- Symbol.for: true, ContextAPI: false | ||
| [root-of-the-server]__8f25289d._.js (161KB) -- Symbol.for: false, ContextAPI: true | ||
|
|
||
| *** DUPLICATION DETECTED for /api/test *** | ||
| Instrumentation-only OTel chunks: __963c8309, _0066dbbb | ||
| Route-only OTel chunks: __3f199e61, __8f25289d | ||
| -> Different module instances with separate ContextAPI singletons | ||
| ``` | ||
|
|
||
| ## Routes | ||
|
|
||
| | Route | Purpose | | ||
| |---|---| | ||
| | `/api/otel-check` | Diagnostic: tests `context.with()` (single + 3-deep nested), reports global OTel registry state | | ||
| | `/api/test` | Realistic: nested `Sentry.startSpan()` calls simulating DB queries + external API calls | | ||
|
|
||
| ## Investigation Findings | ||
|
|
||
| ### Structural Evidence (Confirmed) | ||
|
|
||
| After `npm run build`, the `check-otel-dedup` script confirms that the instrumentation | ||
| hook and each route handler load **different chunks** containing `@opentelemetry/api`. | ||
| Every route gets its own copy. The instrumentation hook and route handlers never share | ||
| OTel chunks. | ||
|
|
||
| **Two ContextAPI class definitions in the same chunk:** The route handler chunk contains | ||
| two separate `ContextAPI` class definitions -- one Turbopack ESM-style (used by route | ||
| handler code) and one CJS-style (bundled from `@prisma/instrumentation`'s nested | ||
| `@opentelemetry/instrumentation` dependency). Both call `_getContextManager()` which | ||
| reads from `globalThis[Symbol.for("opentelemetry.js.api.1")]`, so they share the same | ||
| context manager but are separate JavaScript objects with separate closures, singletons, | ||
| and prototype chains. | ||
|
|
||
| **Version-independent:** The duplication occurs identically with both `@sentry/nextjs` | ||
| 10.8.0 and 10.38.0 (8 chunks with OTel `Symbol.for`, 2 with `ContextAPI` in both). | ||
|
|
||
| **`prismaIntegration()` makes no difference:** Removing `Sentry.prismaIntegration()` | ||
| produces identical chunk structure. The `@prisma/instrumentation` package is bundled | ||
| because it is a direct dependency of `@sentry/node`, not because of the integration call. | ||
|
|
||
| ### Runtime Behavior (Crash Not Reproduced) | ||
|
|
||
| | Test | Result | | ||
| |---|---| | ||
| | Single `context.with()` call | Passes | | ||
| | 3-deep nested `context.with()` (mimicking `_startSpan`) | Passes | | ||
| | Alternating Copy A / Copy B `context.with()` | Passes | | ||
| | 1000 concurrent mixed A/B `context.with()` calls | Passes | | ||
| | `Sentry.startSpan()` with nested spans + Copy B `context.with()` | Passes | | ||
| | 500 concurrent HTTP requests to `/api/test` | Server survives | | ||
| | `--stack-size=128` with concurrent load | Server survives | | ||
| | Monkey-patching instrumentation's `ContextAPI.with()` | Never triggered by routes (confirming separate singletons) | | ||
|
|
||
| **Why the crash doesn't reproduce locally:** The `SentryContextManager.with()` calls | ||
| `super.with()` which ends at `asyncLocalStorage.run(ctx, fn)` -- it does **not** call | ||
| back to any `ContextAPI.with()`. So there is no direct recursive loop between the two | ||
| ContextAPI singletons. | ||
|
|
||
| **Remaining hypotheses for the crash:** | ||
| - Node.js v24 runtime differences (different ALS behavior or stack limits) | ||
| - Actual Prisma DB queries creating deeper instrumentation chains | ||
| - pnpm workspace resolution causing different module nesting | ||
| - `next-intl` middleware adding wrapping to the Next.js config | ||
| - Sustained production traffic ("minutes to hours" under load) | ||
| - Duplicate copies of `require-in-the-middle`/`import-in-the-middle` both patching | ||
| the same modules, creating re-entrant wrapping chains | ||
|
|
||
| ### Turbopack's Behavior Is By Design | ||
|
|
||
| Per-route self-contained bundles are intentional. A Turbopack contributor (lukesandberg) | ||
| confirmed on [vercel/next.js#89192](https://github.com/vercel/next.js/issues/89192): | ||
|
|
||
| > "Bundling the module into multiple different output chunks is expected. The different | ||
| > routes may end up in different lambdas in production and so need to be self-contained." | ||
|
|
||
| However, the runtime deduplication layer that is supposed to compensate is not working | ||
| correctly for `@opentelemetry/api`. The Turbopack runtime uses a shared `moduleCache` | ||
| but the build output shows separate module definitions with different module IDs across | ||
| chunks. | ||
|
|
||
| **Known gaps:** | ||
| - [vercel/next.js#89192](https://github.com/vercel/next.js/issues/89192) -- Duplicate | ||
| class definitions across route chunks, breaking `instanceof` (same root cause) | ||
| - [vercel/next.js#89252](https://github.com/vercel/next.js/issues/89252) -- Related | ||
| chunking issue where unused CSS is included across routes | ||
| - Turbopack status page lists "Production Optimized JS Chunking" as still in progress | ||
| - Webpack does not have this problem (SplitChunksPlugin deduplicates shared deps) | ||
|
|
||
| ### The `serverExternalPackages` Workaround | ||
|
|
||
| Adding `@opentelemetry/api` to `serverExternalPackages` in `next.config.js` forces | ||
| Node.js native `require()` resolution at runtime, bypassing Turbopack's bundling: | ||
|
|
||
| ```js | ||
| const nextConfig = { | ||
| serverExternalPackages: ["@opentelemetry/api"], | ||
| }; | ||
| ``` | ||
|
|
||
| This guarantees a single module instance because Node.js `require()` caches modules. | ||
| `@opentelemetry/api` is **not** on Next.js's built-in auto-externalized packages list, | ||
| though `@sentry/profiling-node` and `import-in-the-middle` are. Adding it upstream | ||
| would be a reasonable fix. | ||
|
|
||
| ### OTel's Singleton Design vs Bundler Reality | ||
|
|
||
| `@opentelemetry/api` uses `Symbol.for("opentelemetry.js.api.1")` to store a global | ||
| registry on `globalThis`. This design was intended for the npm/yarn flat `node_modules` | ||
| world where multiple _versions_ might coexist. It was not designed for bundler-created | ||
| duplicate instances of the _same version_ in the same process. | ||
|
|
||
| **What the `Symbol.for` pattern protects:** global context manager registration, | ||
| global tracer provider registration, version compatibility checks. | ||
|
|
||
| **What it does NOT protect:** class identity (`instanceof` fails across copies), | ||
| module closures (each copy has its own), singleton instances (`ContextAPI.getInstance()` | ||
| returns a different object per copy), instrumentation registration (two copies may | ||
| both try to hook into the same Node.js modules). | ||
|
|
||
| ## Conclusions | ||
|
|
||
| **Confirmed:** | ||
| 1. Turbopack creates separate `@opentelemetry/api` module instances -- reproducible and deterministic | ||
| 2. This is a known Turbopack limitation -- runtime deduplication does not work correctly here | ||
| 3. Webpack does not have this problem | ||
| 4. The duplication is SDK-version-independent (10.8.0 and 10.38.0 identical) | ||
| 5. `prismaIntegration()` is irrelevant -- duplication comes from `@sentry/node`'s bundled `@prisma/instrumentation` | ||
|
|
||
| **Not confirmed:** | ||
| 1. The actual infinite recursion mechanism -- could not trigger `RangeError` in any test | ||
| 2. Why 10.38.0 crashes but 10.8.0 doesn't -- structural duplication is identical | ||
| 3. The exact conditions -- likely requires Node.js v24, real Prisma queries, pnpm, or sustained production traffic | ||
|
|
||
| **Recommended actions:** | ||
|
|
||
| | Action | Owner | Priority | | ||
| |---|---|---| | ||
| | Add `@opentelemetry/api` to `serverExternalPackages` | Users (workaround) | Immediate | | ||
| | Add `@opentelemetry/api` to Next.js built-in externals list | Next.js team | Short-term | | ||
| | Fix Turbopack runtime module deduplication | Turbopack team | Medium-term | | ||
| | Track [vercel/next.js#89192](https://github.com/vercel/next.js/issues/89192) | All | Ongoing | | ||
|
|
||
| ## Environment | ||
|
|
||
| - `@sentry/nextjs`: 10.38.0 | ||
| - `next`: 16.1.6 (Turbopack) | ||
| - `@prisma/instrumentation`: ^7.4.0 | ||
| - Node.js: v20+ (reporter uses v24) |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,91 @@ | ||
| import { NextResponse } from "next/server"; | ||
| import { context, trace } from "@opentelemetry/api"; | ||
|
|
||
| /** | ||
| * Diagnostic route that proves @opentelemetry/api is duplicated at runtime. | ||
| * | ||
| * When Turbopack bundles this route, it gets its OWN copy of @opentelemetry/api | ||
| * (separate module instance from the one Sentry loaded in the instrumentation hook). | ||
| * Both copies share the same globalThis[Symbol.for("opentelemetry.js.api.1")] | ||
| * registry, but they are different JavaScript objects with different closures. | ||
| * | ||
| * This is the root cause of sentry-javascript#19367: the duplicated ContextAPI | ||
| * creates a re-entrant loop through the shared global context manager. | ||
| */ | ||
| export async function GET() { | ||
| const diagnostics: Record<string, unknown> = {}; | ||
|
|
||
| // 1. Check the global OTel registry (shared across all copies via Symbol.for) | ||
| const globalKey = Symbol.for("opentelemetry.js.api.1"); | ||
| const globalRegistry = (globalThis as any)[globalKey]; | ||
| diagnostics.globalOtelRegistry = { | ||
| exists: !!globalRegistry, | ||
| version: globalRegistry?.version, | ||
| hasContextManager: !!globalRegistry?.context, | ||
| hasTracerProvider: !!globalRegistry?.trace, | ||
| hasDiag: !!globalRegistry?.diag, | ||
| }; | ||
|
|
||
| // 2. Check the context manager registered by Sentry | ||
| const ctxManager = globalRegistry?.context; | ||
| if (ctxManager) { | ||
| diagnostics.contextManager = { | ||
| constructor: ctxManager.constructor.name, | ||
| hasWithMethod: typeof ctxManager.with === "function", | ||
| }; | ||
| } | ||
|
|
||
| // 3. Test context.with() — the method that causes the infinite recursion | ||
| let contextWithResult: string; | ||
| try { | ||
| const result = context.with(context.active(), () => { | ||
| return "context.with() succeeded (single call)"; | ||
| }); | ||
| contextWithResult = result; | ||
| } catch (err: any) { | ||
| contextWithResult = `CRASH: ${err.message}`; | ||
| } | ||
| diagnostics.singleContextWith = contextWithResult; | ||
|
|
||
| // 4. Test nested context.with() — mimics what _startSpan() does in SDK 10.38.0 | ||
| // | ||
| // In 10.38.0, _startSpan() wraps the call like: | ||
| // context.with(suppressedCtx, () => // call 1 | ||
| // tracer.startActiveSpan(name, ctx, span => // call 2 (inside startActiveSpan) | ||
| // context.with(activeCtx, () => // call 3 | ||
| // callback(span) | ||
| // ) | ||
| // ) | ||
| // ) | ||
| // | ||
| // With duplicated modules, each context.with() may go through a different | ||
| // ContextAPI singleton, and the Sentry context manager's scope cloning | ||
| // triggers re-entry through the async context strategy. | ||
| let nestedResult: string; | ||
| try { | ||
| context.with(context.active(), () => { | ||
| return context.with(context.active(), () => { | ||
| const tracer = trace.getTracer("repro-test"); | ||
| return tracer.startActiveSpan("nested-test", (span) => { | ||
| span.end(); | ||
| return "nested context.with() succeeded (3-deep, same as _startSpan in 10.38.0)"; | ||
| }); | ||
| }); | ||
| }); | ||
| nestedResult = "3-deep nested context.with() succeeded"; | ||
| } catch (err: any) { | ||
| nestedResult = `CRASH: ${err.message}`; | ||
| } | ||
| diagnostics.nestedContextWith = nestedResult; | ||
|
|
||
| // 5. The build-time evidence | ||
| diagnostics.buildTimeEvidence = | ||
| "Run `npm run check-otel-dedup` on the build output to confirm that " + | ||
| "the instrumentation hook and route handlers load DIFFERENT chunks " + | ||
| "containing @opentelemetry/api. Turbopack creates separate module " + | ||
| "instances instead of deduplicating into a single shared chunk."; | ||
|
|
||
| return NextResponse.json(diagnostics, { | ||
| status: contextWithResult.startsWith("CRASH") ? 500 : 200, | ||
| }); | ||
| } | ||
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,60 @@ | ||
| import { NextResponse } from "next/server"; | ||
| import * as Sentry from "@sentry/nextjs"; | ||
|
|
||
| /** | ||
| * Simulates a realistic API route that creates nested Sentry spans. | ||
| * | ||
| * In production, auto-instrumentation (HTTP, Prisma, etc.) creates spans | ||
| * around every request and DB query. Each Sentry.startSpan() call in 10.38.0 | ||
| * internally calls context.with() THREE times (suppressed ctx → startActiveSpan | ||
| * → active ctx). With Turbopack's duplicate @opentelemetry/api modules, this | ||
| * 3-deep nesting × N nested spans can spiral into infinite recursion. | ||
| * | ||
| * The crash is intermittent because it depends on Node.js event loop timing | ||
| * and which module copy's ContextAPI handles each context.with() call. | ||
| */ | ||
| export async function GET() { | ||
| // Outer span — simulates the HTTP instrumentation auto-span | ||
| return Sentry.startSpan({ name: "GET /api/test" }, async () => { | ||
| // Inner span — simulates a DB query (e.g., Prisma) | ||
| const dbResult = await Sentry.startSpan( | ||
| { name: "prisma:query SELECT" }, | ||
| async () => { | ||
| await new Promise((resolve) => setTimeout(resolve, 2)); | ||
| return { users: 42 }; | ||
| } | ||
| ); | ||
|
|
||
| // Another inner span — simulates a second DB query | ||
| const cacheResult = await Sentry.startSpan( | ||
| { name: "redis:get session" }, | ||
| async () => { | ||
| await new Promise((resolve) => setTimeout(resolve, 1)); | ||
| return { cached: true }; | ||
| } | ||
| ); | ||
|
|
||
| // Deeply nested span — simulates calling an external API | ||
| const apiResult = await Sentry.startSpan( | ||
| { name: "http:POST /external-api" }, | ||
| async () => { | ||
| // Nested span inside the external call | ||
| return Sentry.startSpan( | ||
| { name: "serialize:response" }, | ||
| async () => { | ||
| await new Promise((resolve) => setTimeout(resolve, 1)); | ||
| return { ok: true }; | ||
| } | ||
| ); | ||
| } | ||
| ); | ||
|
|
||
| return NextResponse.json({ | ||
| status: "ok", | ||
| dbResult, | ||
| cacheResult, | ||
| apiResult, | ||
| timestamp: Date.now(), | ||
| }); | ||
| }); | ||
| } |
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Status code ignores the nested crash test result
Medium Severity
The HTTP status code is determined solely by
contextWithResult(the singlecontext.with()test), but ignoresnestedResult(the 3-deep nested test). The nested test is the one that mimics_startSpan()'s pattern from SDK 10.38.0 — the exact scenario that triggers the infinite recursion crash. If the simple test passes but the nested test crashes, the endpoint returns 200 instead of 500, making automated detection of the bug's primary failure mode impossible.Additional Locations (1)
sentry-javascript/19367/app/api/otel-check/route.ts#L63-L79