Skip to content

perf(test): disable Lambda bundling in CDK unit-test synths (~15x per synth) (#366)#371

Merged
scottschreckengaust merged 9 commits into
mainfrom
feat/366-synth-reuse
Jun 17, 2026
Merged

perf(test): disable Lambda bundling in CDK unit-test synths (~15x per synth) (#366)#371
scottschreckengaust merged 9 commits into
mainfrom
feat/366-synth-reuse

Conversation

@scottschreckengaust

Copy link
Copy Markdown
Contributor

What

Disables Lambda asset bundling during CDK unit-test synthesis. Template.fromStack() does a full CDK synth, which bundles every NodejsFunction via esbuild — ~28s for the 413-resource AgentStack. Unit tests assert on CloudFormation structure/properties, not bundled Lambda code, so the bundling is pure overhead.

A jest setupFiles hook (cdk/test/setup/disable-bundling.ts) sets aws:cdk:bundling-stacks: [] via CDK_CONTEXT_JSON. A bare new App() — the dominant pattern across our tests (87 call sites) — picks this up with zero call-site changes.

Closes #366. Part of #363.

Results (8-core local)

Metric Before After
Single AgentStack synth 28.7s 1.9s (~15×)
github-tags.test.ts ~135s 9s
Full //cdk:test ~155s 25s (~6×)
  • All 2177 CDK tests pass. Coverage thresholds unchanged and met (lines 92.36 ≥ 91, branches 82.81 ≥ 82).
  • Real 4-core CI before/after will post on this PR's build run.

Why this over sharding (#365)

Profiling under #366 showed the CDK suite's cost is per-synth esbuild bundling, not test count or redundant synths. This single change beats 4-way sharding (~14%) and synth-dedup (~0%) by a wide margin, with no CI complexity. #365 (sharding) is likely now moot — to be re-evaluated once this lands and we see the new CI baseline.

Durable regression guards

  • AGENTS.md "Common mistakes": do not re-enable bundling in tests unless asserting on a bundled asset; minimize full-stack synths (synth once in beforeAll).
  • .abca/commands/review_pr.md: a Stage-3 checklist item so PR review catches regressions.

Safety / blast radius

  • Audited all 31 stack-synth suites — all green. The one asset-hash reference (test/bootstrap/version.test.ts) is a bootstrap policy hash, not a CDK asset hash, and synthesizes no stack — unaffected.
  • A test that genuinely needs bundled assets can opt out by setting aws:cdk:bundling-stacks: ['**'] in its own App context (documented in the setup file).

Dependency note

The AGENTS.md / review_pr.md links point to docs/design/CI_BUILD_PERFORMANCE.md, which is added by PR #364 (the design doc, currently draft). Merge #364 first (or together) so the links resolve.

Draft

Opened as draft while under review per WIP discipline. Will mark ready once the 4-core CI build posts green and #364 ordering is settled.


🤖 Generated with Claude Code

Template.fromStack() triggers a full CDK synth that bundles every
NodejsFunction via esbuild — ~28s for the 413-resource AgentStack. Unit
tests assert on CloudFormation structure, not bundled Lambda code, so the
bundling is pure overhead. A jest setupFiles hook sets
aws:cdk:bundling-stacks: [] via CDK_CONTEXT_JSON, which a bare new App()
picks up with no call-site changes.

Measured: single AgentStack synth 28.7s -> 1.9s (~15x); github-tags.test.ts
~135s -> 9s; full //cdk:test 155s -> 25s (8-core local). All 2177 tests
pass; coverage thresholds unchanged (lines 92.36 >= 91).

Durable regression guards added: AGENTS.md "Common mistakes" entry and a
.abca/commands/review_pr.md checklist item — do not re-enable bundling in
tests unless asserting on a bundled asset, and minimize full-stack synths
(synth once in beforeAll). Both link docs/design/CI_BUILD_PERFORMANCE.md
(PR #364).

Refs #366. Part of #363. Likely supersedes the sharding approach (#365) —
to be re-evaluated once this lands.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

@isadeks isadeks left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Genuinely good, well-scoped optimization. I verified the core mechanism empirically against aws-cdk-lib@2.257.0: bundlingRequired reads context key aws:cdk:bundling-stacks defaulting to ["**"], so setting it to [] makes [].some(...) return false and a bare new App() (the dominant test pattern) picks it up via CDK_CONTEXT_JSON with zero call-site changes. The ~15× synth speedup is real and the safety audit checks out.

✅ What I verified

  • Mechanism is correctnew App() reads CDK_CONTEXT_JSON; aws:cdk:bundling-stacks: [] disables bundling (bundlingRequired === false). Confirmed empirically.
  • Safety claim holdsversion.test.ts computes a bootstrap-policy hash and synthesizes no stack, so it's unaffected by disabling bundling.
  • No collateral — no existing CDK test asserts on bundled asset hashes / S3 keys, and none relies on a constructor-context bundling toggle, so the "all 2177 tests pass" claim is consistent.
  • Setup-file internal merge is self-consistent — when CDK_CONTEXT_JSON is pre-set, the existing value wins, matching the file's comment.

🚫 Blockers

None.

🔧 Should fix before merge

1. The documented per-test opt-out does not work (latent bug).
The escape hatch documented in all three places (disable-bundling.ts comment, AGENTS.md, review_pr.md) —

new App({ context: { 'aws:cdk:bundling-stacks': ['**'] } })

— does not re-enable bundling. Verified empirically against aws-cdk-lib@2.257.0:

App construction stack.bundlingRequired
bare new App() false (disabled ✅ — intended)
new App({ context: { 'aws:cdk:bundling-stacks': ['**'] } }) false ❌ (PR claims true)
new App({ postCliContext: { 'aws:cdk:bundling-stacks': ['**'] } }) true

Root cause: CDK's App.loadContext(props.context, props.postCliContext) sets props.context first, then overwrites it with CDK_CONTEXT_JSON, then applies postCliContext last. So the env var beats constructor context.

Impact: a future author who needs a real bundled asset (asset-hash / S3-key assertion) follows the docs, gets a silently-unbundled synth, and a wrong/empty asset hash with no error. Not a current-test failure — no test uses this path today — so it's latent.

Fix: document postCliContext as the opt-out (it actually overrides), and correct the "Tests that pass their own { context } still merge on top" comment, which is wrong for this key.

💬 Other (self-disclosed)

2. Dangling doc link. AGENTS.md./docs/design/CI_BUILD_PERFORMANCE.md doesn't exist on this branch or main (added by draft PR #364). The PR body already flags the merge-order dependency. Note it intersects with the link-checker PR #300: AGENTS.md is a root .md within that checker's scope, so landing #371 + #300 without #364 would turn the dead link into a red CI check. (review_pr.md has the same link but .abca/ is outside the checker's scope, so only AGENTS.md would trip CI.)


Reviewed at xhigh effort. Mechanism and precedence claims verified empirically against aws-cdk-lib@2.257.0, not just read from source.

@scottschreckengaust scottschreckengaust marked this pull request as draft June 17, 2026 13:29
scottschreckengaust and others added 3 commits June 17, 2026 13:55
The documented per-test opt-out used constructor `context`, which CDK
silently overwrites with CDK_CONTEXT_JSON in App.loadContext — so
`new App({ context: { 'aws:cdk:bundling-stacks': ['**'] } })` does NOT
re-enable bundling. Only `postCliContext` (applied last) overrides the
global disable.

Corrects the opt-out instruction and precedence comment in all three
documented locations (disable-bundling.ts, AGENTS.md, review_pr.md).
Verified empirically against aws-cdk-lib@2.257.0. Comment/doc-only;
runtime logic unchanged.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…366)

Disabling bundling does not stop tests from synthesizing —
`Template.fromStack()` still runs a full synth; it only skips the
esbuild asset-bundling step. The `postCliContext` opt-out exists solely
for the rare test that needs real bundled-asset output (asset hash /
S3 key), where an unbundled synth would silently yield a placeholder.
Makes this explicit in disable-bundling.ts and AGENTS.md. Doc-only.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The doc lives in draft PR #364 and exists on neither this branch nor
main, so the three links (disable-bundling.ts, AGENTS.md, review_pr.md)
are dead. Drop them and keep the stable #366 reference, removing the
merge-order dependency on #364 and the link-checker (#300) risk.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@scottschreckengaust

Copy link
Copy Markdown
Contributor Author

@isadeks Thanks for the empirical verification — you're exactly right, and I confirmed it. I've corrected the opt-out to postCliContext when tests need the bundled asset and rewrote the precedence comment in all three documented locations.

@scottschreckengaust scottschreckengaust marked this pull request as ready for review June 17, 2026 14:33
@isadeks

isadeks commented Jun 17, 2026

Copy link
Copy Markdown
Contributor

Re-review: ✅ Both findings resolved — recommend approve

Re-reviewed at head 95ebf79. The three new commits map cleanly onto both findings from the prior pass:

Finding Fix Status
1. Broken opt-out — docs said use constructor context, which the env var clobbers 82831aa correct opt-out to postCliContext + c1872c0 clarify it's for asset output, not synth Fixed & verified empirically
2. Dead doc linkCI_BUILD_PERFORMANCE.md (draft #364) 95ebf79 drop dead links Fixed

Finding 1 — verified against aws-cdk-lib@2.257.0 with the setup file's env var active:

App construction stack.bundlingRequired
new App({ postCliContext: { 'aws:cdk:bundling-stacks': ['**'] } }) true ✅ (the documented opt-out — now works)
new App({ context: { 'aws:cdk:bundling-stacks': ['**'] } }) false ✅ (docs now correctly warn this does NOT work)

The corrected AGENTS.md / disable-bundling.ts prose is technically accurate: the loadContext(props.context, props.postCliContext) precedence explanation matches CDK's actual behavior (env var overwrites constructor context; postCliContext is applied last and wins). The added "this doesn't stop the synth, only skips the esbuild step" clarification is correct and heads off a likely misconception.

Finding 2 — zero CI_BUILD_PERFORMANCE.md references remain in any of the three touched files, so the hard dependency on #364 is severed. I also confirmed all 8 remaining .md links in the touched docs resolve on the PR head — no new dead links introduced.

Core mechanism, the version.test.ts safety claim, and the "no test relies on bundled-asset assertions" audit were confirmed in the prior pass and are unchanged.

Nothing left blocking. One cosmetic residual: the PR description's dependency note still mentions the #364 merge ordering, which is now stale since the links are gone — worth trimming the description, but it's not a code issue. Good to mark ready once the 4-core CI build posts green.

isadeks
isadeks previously approved these changes Jun 17, 2026

@theagenticguy theagenticguy left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Third-pass review — follow-up items

Independent pass after @isadeks's two reviews. The approve holds; the inline items are for a follow-up commit. I re-verified the load-bearing claims against the CDK version this branch actually resolves (aws-cdk-lib@2.259.0 per the lockfile — the prior pass verified 2.257.0; both satisfy ^2.257.0 and loadContext precedence is identical at 2.259.0).

integ-smoke is excluded as out of scope — its failure is an environmental Could not load credentials from any providers on the integ environment OIDC gate, unrelated to this diff.

Inline findings

  1. Add an executable regression test that bundling is off (highest value — it's the point of the PR). Guards today are honor-system docs.
  2. Cover the documented postCliContext opt-out — currently zero test coverage.
  3. Reframe "pure overhead" — bundling also smoke-tests Lambda entry resolution, which unit tests now cede.

Finding 4 (PR body, no line anchor)

Post the 4-core CI before/after. ~15× is single-synth on 8-core local and ~6× full-suite is also local; the figure that justifies this for everyone is 4-core CI wall-clock, which the description promises and hasn't posted yet. The mechanism earns the approve; the aggregate CI win is still unmeasured on CI hardware.

Confirmed during review (skip on re-check)

Claim Verification Result
All synth tests covered by the single setupFiles All 73 Template.fromStack sites live under cdk/ no orphans
Coverage thresholds unaffected Disabling bundling skips esbuild transpilation, never executes handler lines; coverage measures executed lines not gamed
No test asserts on bundled-asset output grep assetHash|S3Key|.zip across cdk/test/ → 0 hits; sole snapshot is the bootstrap-policy hash, which synthesizes no stack safety audit holds

Net: approve stands once #1 lands; #2#4 are polish and can ride the same commit.

Comment thread cdk/test/setup/disable-bundling.ts Outdated
Comment thread cdk/test/setup/disable-bundling.ts
Comment thread AGENTS.md Outdated
scottschreckengaust and others added 4 commits June 17, 2026 23:00
…366)

Two robustness fixes to the global bundling-disable setup file, both from
PR #371 review (finding 1 hardening nits):

- Spread BUNDLING_DISABLED_CONTEXT LAST when merging a pre-set
  CDK_CONTEXT_JSON, so an ambient `aws:cdk:bundling-stacks` value can no
  longer silently win and re-enable bundling. This is safe precisely because
  the documented per-test opt-out is `postCliContext` (applied after the env
  var regardless), so the escape hatch still works.
- Guard `JSON.parse` of the pre-set var with try/catch. A malformed value
  previously threw inside setupFiles and failed every test in every file with
  an opaque parse error; now the blame is localized to the offending value.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
PR #371 review finding 1: the ~15× synth speedup rested entirely on the
`disable-bundling.ts` setupFiles entry, but the only regression guards were
honor-system prose (AGENTS.md) and a review_pr.md checklist item. Dropping the
setupFiles wiring, reordering Jest setup, or a CDK rename of the context key
would silently revert the suite to full-bundling synths with no failing check.

Adds two assertions: the synth context carries `aws:cdk:bundling-stacks: []`,
and a bare-App stack reports `bundlingRequired === false`. Verified empirically
against aws-cdk-lib@2.259.0 that without the disable a bare App yields
`bundlingRequired === true`, so these discriminate a real regression.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
PR #371 review finding 2: the documented per-test opt-out (`postCliContext`)
and the `loadContext` precedence this PR corrected had zero test coverage, so a
future CDK version reordering context application could silently break the one
supported escape hatch.

Locks the contract with two assertions verified against aws-cdk-lib@2.259.0:
`postCliContext: { 'aws:cdk:bundling-stacks': ['**'] }` re-enables bundling
(`bundlingRequired === true`), while constructor `context` does NOT (the env
var set by the global disable clobbers it). Uses a bare `Stack` rather than a
heavy AgentStack since the precedence is a property of CDK context loading, not
of any particular stack.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…verhead" (#366)

PR #371 review finding 3: "pure overhead" undersold the tradeoff. esbuild
bundling also validates that every Lambda `entry` resolves and the handler
compiles, so disabling it in unit tests means a broken entry path or handler TS
error stops surfacing at `Template.fromStack()` and only fails at real
synth/deploy or in the `agent/` build. The tradeoff is still acceptable — real
synth/deploy bundles, and handler code has its own compile/test under `agent/`
— but the wording now names what unit tests cede. Updated in both places the
phrase lived: the disable-bundling.ts header and AGENTS.md.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@scottschreckengaust

scottschreckengaust commented Jun 17, 2026

Copy link
Copy Markdown
Contributor Author

Follow-up commits for @theagenticguy's third-pass review

Findings 1–3 addressed in four commits on feat/366-synth-reuse:

  • 94c604e fix(test) — harden the merge against ambient CDK_CONTEXT_JSON (spread the disable last; guard JSON.parse). (finding 1 hardening nits)
  • 09b1f24 test(cdk) — executable guard that the global disable is on. (finding 1)
  • 77b0c7a test(cdk) — cover the postCliContext opt-out precedence (+ the negative: constructor context does not work). (finding 2)
  • 3f6ff1f docs(test) — reframe "pure overhead" as overhead-plus-entry-smoke-test in both disable-bundling.ts and AGENTS.md. (finding 3)

Finding 4 — real CI numbers (now measured, not estimated)

Pulled from build workflow runs straddling the change, same runner, same 2177 tests — so the only variable is the bundling toggle:

Run Commit Bundling cdk:test (jest Time:) build step job wall-clock
27650386806 86fda643 (parent of perf commit) on 329.2s 376s 502s
27661401811 6cf7a984 (first branch commit) off 92.8s 140s 260s

Realized CI speedup: cdk:test 3.55× (329.2s → 92.8s); full build step 2.69× (376s → 140s); whole job 1.93× (502s → 260s).

Two corrections to the original framing:

  1. CI runs on ubuntu-latest (2-core), not 4-core — there's no DEFAULT_RUNNER_LABEL repo var and no ubuntu-latest-4-cores label on this PR (confirmed via the runs' runner labels). So the relevant "everyone" figure is the 2-core column above.
  2. The local ~6× full-suite figure doesn't fully transfer to fewer cores — on 2-core CI the realized cdk:test win is 3.55× (the fixed esbuild cost amortizes differently across core counts). Still well clear of the noise: a ~236s reduction on the test step and ~4 min off the build job's wall-clock, every PR.

For reference, the latest run with my four follow-up commits (27725228001, 3f6ff1f) holds at cdk:test 106.4s / build step 142s for 2193 tests — the +16 new tests add ~14s and the win is intact.

@theagenticguy theagenticguy left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approve

Re-reviewed at head 3f6ff1ff. All three inline findings from my prior pass are addressed, and two of them went past what I asked. Verified each against the new commits, not the commit messages.

Finding Commit Verified
1. Executable bundling-off guard 09b1f24d disable-bundling.test.ts asserts aws:cdk:bundling-stacks === [] and probes bundlingRequired === false on a real stack — exceeds the ask
2. Opt-out coverage 77b0c7a9 Tests the positive postCliContext path (bundlingRequired === true) and a negative case proving constructor context does not work — locks the exact footgun the docs warn about
3. "Pure overhead" reframe 3f6ff1ff AGENTS.md now reads "overhead — plus a smoke-test of Lambda entry resolution and handler compilation that unit tests intentionally cede to the agent/ build." Honest framing
Hardening (the two optional nits) 94c604ed Merge spread flipped to { ...ambient, ...BUNDLING_DISABLED_CONTEXT } so the disable wins unconditionally, and JSON.parse is wrapped in a try/catch that throws a localized "malformed CDK_CONTEXT_JSON" error instead of failing every test opaquely

The precedence the opt-out test asserts matches aws-cdk-lib@2.259.0's loadContext (the version the lockfile resolves), which I read directly in the earlier pass — props.context first, CDK_CONTEXT_JSON overwrites, postCliContext last. build (agentcore) is green at this head, consistent with the new tests passing.

Two non-blocking residuals (not gating this approval)

  • Finding 4 — the 4-core CI number is still a placeholder in the PR description ("Real 4-core CI before/after will post"). It's the figure that justifies the change for everyone, so worth dropping the real before/after into the body once the build posts. Body-only, gated on the CI run — not a code issue.
  • integ-smoke is red and it's a required check, so it will block merge until cleared. The failure is environmental (Could not load credentials from any providers on the integ environment OIDC gate), unrelated to this diff — it needs the admin approval on the integ environment to go green, it won't clear on its own.

Code is correct and the regression floor is now machine-enforced. Approving.

@scottschreckengaust scottschreckengaust added this pull request to the merge queue Jun 17, 2026
Merged via the queue into main with commit 17eba2f Jun 17, 2026
7 of 8 checks passed
@scottschreckengaust scottschreckengaust deleted the feat/366-synth-reuse branch June 17, 2026 23:27
isadeks pushed a commit to isadeks/sample-autonomous-cloud-coding-agents that referenced this pull request Jun 18, 2026
The fork's main (isadeks/main) was 64 commits stale, making GitHub report
linear-vercel as '60 behind'. Against real upstream (aws-samples/main),
linear-vercel was only 5 commits behind. Merge those in to be fully current:
- ephemeral stack cleanup script (aws-samples#109)
- dependency upgrade (aws-samples#367)
- ADR governance: in-place refinement (aws-samples#382)
- test-perf: disable Lambda bundling in unit-test synths (aws-samples#366/aws-samples#371)
- jira: reactive token refresh + retry on 401 (aws-samples#370/aws-samples#375)

Clean merge, zero conflicts (linear-vercel's foundation was already current —
merge-base 0e2806a, 2026-06-17).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

perf(test): disable Lambda bundling in CDK unit-test synths (~15x per synth) (#363)

3 participants