feat(cli): standalone per-shard e2e launcher by sarayev · Pull Request #14938 · aws-amplify/amplify-cli

sarayev · 2026-06-23T12:16:40Z

Description of changes

The e2e batch orchestrator (StartBuildBatch) becomes unreliable when a large
number of child builds are simultaneously in-progress. To run e2e reliably
without depending on the batch orchestrator, this adds a standalone per-shard
launcher that runs each e2e shard as an individual StartBuild (not
StartBuildBatch).

Every build it starts has buildBatchArn === null, so the batch orchestrator is
fully bypassed. Builds reuse the existing prep S3 cache keyed on the resolved
commit SHA, so no rebuild is required.

Standalone launcher

scripts/run-e2e-standalone.ts (+ yarn cloud-e2e-standalone) parses the shard
list from e2e_workflow_generated.yml, assigns each shard a CLI_REGION
round-robin from AWS_REGIONS_TO_RUN_TESTS, and runs Linux and Windows as two
independent concurrency-capped pools in parallel. Each build injects
TEST_SUITE and CLI_REGION overrides (the latter short-circuits
select-region-for-e2e-test.ts); Windows shards additionally get the
WINDOWS_IMAGE_2019 image and WINDOWS_SERVER_2022_CONTAINER env-type
overrides. The pool keeps at most --max-concurrency builds in flight, polling
BatchGetBuilds and topping up as builds finish; it prints a per-shard summary,
supports --retry-failed, and exits non-zero on any failure.
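
A minimal sketch of the top-up behavior described above, not the code in scripts/run-e2e-standalone.ts. It isolates a single polling step as a synchronous function so the cap invariant is easy to check. `PoolState`, `step`, and `isDone` are hypothetical names, and the `statuses` map stands in for whatever a BatchGetBuilds poll returns.

```typescript
// Hypothetical sketch: one polling step of a concurrency-capped pool.
type BuildStatus = "IN_PROGRESS" | "SUCCEEDED" | "FAILED";

interface PoolState {
  queue: string[];                   // shards not yet launched, in launch order
  inFlight: Map<string, string>;     // build id -> shard
  results: Map<string, BuildStatus>; // shard -> final status
}

// Record finished builds, then launch queued shards until the cap is reached.
function step(
  state: PoolState,
  statuses: Map<string, BuildStatus>,    // latest status per in-flight build id
  maxConcurrency: number,
  startBuild: (shard: string) => string, // assumed to return the new build id
): void {
  for (const [buildId, shard] of Array.from(state.inFlight)) {
    const status = statuses.get(buildId) ?? "IN_PROGRESS";
    if (status !== "IN_PROGRESS") {
      state.inFlight.delete(buildId);
      state.results.set(shard, status);
    }
  }
  while (state.inFlight.size < maxConcurrency && state.queue.length > 0) {
    const shard = state.queue.shift()!;
    state.inFlight.set(startBuild(shard), shard);
  }
}

// The pool is finished when nothing is queued and nothing is running.
function isDone(state: PoolState): boolean {
  return state.queue.length === 0 && state.inFlight.size === 0;
}
```

A real loop would presumably poll BatchGetBuilds for the in-flight build IDs, pass the result to a step like this, sleep, and repeat until the pool is done, with one such pool per platform.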

Because every build bypasses the batch orchestrator, the real ceiling is the
account concurrency quota (Linux/Medium 1,200; Windows/Medium 300), so
per-platform caps can be set well above the batch orchestrator's safe limit.

Duration-aware launch ordering

--durations <path> (default .e2e-shard-durations.json) and
--order longest-first|shortest-first|file (default longest-first) order
the launch queue by real per-shard wall-clock durations from a prior green
batch (LPT, longest-processing-time-first, heuristic). The longest shards
(the 8 l_gen2_migration_* shards at 117–194 min) start in the first pool
slots so they are never the makespan tail. Shards missing from the dataset
are assigned the median duration; if the file is absent, ordering falls back
to file order (no hard dependency). The dataset itself is not committed.
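
The ordering rules above can be sketched as follows. This is an illustration under stated assumptions, not the launcher's code: `orderShards` and `median` are hypothetical names, the durations are assumed to arrive as a map from shard identifier to minutes, the median is assumed to be taken over every value in the dataset, and ties are assumed to keep file order.

```typescript
// Hypothetical sketch of duration-aware launch ordering.
type Order = "longest-first" | "shortest-first" | "file";

// Median of a non-empty list of numbers.
function median(values: number[]): number {
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

function orderShards(
  shards: string[],                              // shards in file order
  durations: Record<string, number> | undefined, // minutes; undefined if the file is absent
  order: Order,
): string[] {
  // No dataset (or --order file): keep file order.
  if (order === "file" || !durations) return [...shards];
  const known = Object.values(durations);
  if (known.length === 0) return [...shards];

  // Shards missing from the dataset are treated as median-length.
  const fallback = median(known);
  const minutes = (shard: string): number => durations[shard] ?? fallback;

  // Array.prototype.sort is stable, so equal durations keep file order.
  const sign = order === "longest-first" ? -1 : 1;
  return [...shards].sort((a, b) => sign * (minutes(a) - minutes(b)));
}
```

With made-up durations `{ a: 10, b: 194, c: 117 }` and file order `a, x, b, c`, the unknown shard `x` is treated as 117 minutes, so longest-first yields `b, x, c, a`.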

Follow-up note for reviewers: this same duration dataset should later replace
the stale 2022 scripts/cci-test-timings.data.json used for split-balancing.
That file is not changed in this PR.

How did you test these changes?

Validated a small bounded run against a cached prep from a prior batch:

yarn cloud-e2e-standalone --source-sha <resolvedSHA> \
  --platform linux --max-concurrency 25 --limit 20 --order longest-first

Confirmed on live builds:

- buildBatchArn === null: batch orchestrator fully bypassed.
- Prep cache reused: builds load the prep artifacts from the resolved-SHA S3
  prefix ({repo,.cache,verdaccio-cache,all-binaries}).
- CLI_REGION set per shard (round-robin): no select-region error.
- TEST_SUITE set to the shard's jest filter; builds reached the BUILD phase
  running tests.
- The number of simultaneously in-progress builds never exceeded the cap.
- Longest-first ordering verified: all 8 l_gen2_migration_* shards launched
  in the first slots.
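
For reference, the per-shard overrides confirmed above could be assembled roughly like this. It is a sketch, not the launcher's code: `Shard`, `StartBuildInput`, and `buildInputs` are hypothetical names, the field names mirror the CodeBuild StartBuild request (`environmentVariablesOverride`, `imageOverride`, `environmentTypeOverride`), the Windows image is passed in as a parameter because its actual value is resolved elsewhere, and the round-robin index is assumed to run over the whole shard list.

```typescript
// Hypothetical sketch: build one StartBuild request per shard.
interface EnvVar {
  name: string;
  value: string;
  type: "PLAINTEXT";
}

interface StartBuildInput {
  projectName: string;
  environmentVariablesOverride: EnvVar[];
  imageOverride?: string;
  environmentTypeOverride?: string;
}

interface Shard {
  identifier: string;
  testSuite: string; // the shard's jest filter
  platform: "linux" | "windows";
}

function buildInputs(
  shards: Shard[],
  regions: string[],    // e.g. the parsed AWS_REGIONS_TO_RUN_TESTS list
  projectName: string,
  windowsImage: string, // assumed to be resolved by the caller
): StartBuildInput[] {
  if (regions.length === 0) throw new Error("at least one region is required");
  return shards.map((shard, i) => {
    const input: StartBuildInput = {
      projectName,
      environmentVariablesOverride: [
        { name: "TEST_SUITE", value: shard.testSuite, type: "PLAINTEXT" },
        // Round-robin: shard i gets region i mod (number of regions).
        { name: "CLI_REGION", value: regions[i % regions.length], type: "PLAINTEXT" },
      ],
    };
    if (shard.platform === "windows") {
      input.imageOverride = windowsImage;
      input.environmentTypeOverride = "WINDOWS_SERVER_2022_CONTAINER";
    }
    return input;
  });
}
```

Each returned object would then be passed to a StartBuild call, which is what keeps buildBatchArn null on the resulting build.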

Run the full suite when ready with, e.g.:

yarn cloud-e2e-standalone --source-sha <resolvedSHA> --max-concurrency 75

Projected makespan at cap-75 longest-first ≈ prep (~30m) + longest shard
(~194m) ≈ 224m ≈ 3.7h.
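
The projection above holds only when the cap is high enough that the longest shard, rather than the queue behind it, sets the finish time. A small simulator can check that for a given dataset. This is a sketch under simplifying assumptions (no queueing or provisioning delay, no polling granularity), `makespan` is a hypothetical helper, and the durations in the usage note are made up.

```typescript
// Hypothetical sketch: greedy list scheduling over `cap` pool slots.
// Each shard, in launch order, starts on the slot that frees up earliest.
function makespan(durationsInLaunchOrder: number[], cap: number): number {
  if (!Number.isInteger(cap) || cap < 1) throw new Error("cap must be a positive integer");
  const slotCount = Math.min(cap, durationsInLaunchOrder.length);
  const slots: number[] = new Array<number>(slotCount).fill(0); // time each slot becomes free
  let end = 0;
  for (const minutes of durationsInLaunchOrder) {
    // Find the slot that becomes free first.
    let idx = 0;
    for (let i = 1; i < slots.length; i++) {
      if (slots[i] < slots[idx]) idx = i;
    }
    slots[idx] += minutes;
    end = Math.max(end, slots[idx]);
  }
  return end;
}
```

For the made-up durations 10, 8, 5, 4, 3 with a cap of 2, longest-first gives `makespan([10, 8, 5, 4, 3], 2) === 16` while shortest-first gives `makespan([3, 4, 5, 8, 10], 2) === 18`. When the cap is at least the number of shards, the result is simply the longest duration, which is the case the 30m + 194m projection relies on.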

Checklist

- PR description included
- yarn test passes
- Tests are changed or added
- Relevant documentation is changed or added

By submitting this pull request, I confirm that my contribution is made under
the terms of the Apache 2.0 license.

Add scripts/run-e2e-standalone.ts (yarn cloud-e2e-standalone) that runs each e2e shard from e2e_workflow_generated.yml as an individual CodeBuild build via StartBuild instead of StartBuildBatch. Every build has buildBatchArn null, so it fully bypasses the batch orchestrator, which becomes unreliable when too many child builds are simultaneously in-progress. Because there is no batch, the batch orchestrator's simultaneous-in-progress limit does not apply; the only ceiling is the account concurrency quota (Linux/Medium 1200, Windows/Medium 300).

Linux and Windows shards run as two independent, parallel concurrency pools with separate caps (--max-concurrency-linux / --max-concurrency-windows, default 75 each; --max-concurrency sets both). Each shard reuses the prep S3 cache keyed on the resolved commit SHA (--source-sha), injects TEST_SUITE and CLI_REGION (round-robin from AWS_REGIONS_TO_RUN_TESTS), and applies Windows image/environment-type overrides resolved from the project environment. Builds are polled with BatchGetBuilds; failures can be retried with --retry-failed.

Validated with a small run of 20 Linux shards against a cached prep from a prior batch: all builds started with buildBatchArn null, reused the prep S3 cache, had CLI_REGION set with no select-region error, and proceeded into the BUILD phase executing jest.

---

Prompt: Build a standalone per-shard e2e launcher that runs shards as individual CodeBuild builds with per-platform concurrency caps, bypassing the batch orchestrator, and validate it with a small Linux run.

Add `--durations <path>` (default .e2e-shard-durations.json) and `--order longest-first|shortest-first|file` (default longest-first) to the standalone launcher. When real per-shard durations from a prior green batch are available, the launch queue is sorted by duration descending (LPT heuristic) so the longest shards (the 8 l_gen2_migration_* shards at 117-194 min) start in the first pool slots and never become the makespan tail. Shards missing from the dataset assume the median; if the file is absent, ordering falls back to file order with no hard dependency. The dataset file itself is not committed.

Tested by launching 20 linux shards longest-first against a cached prep from a prior batch: all 8 gen2-migration shards launched in the first slots, builds had buildBatchArn null, reused the S3 prep cache, had CLI_REGION set, and reached the BUILD phase running tests, never exceeding the concurrency cap.

---

Prompt: FOLLOW-UP ENHANCEMENT: wire real duration ordering into the launcher. A per-shard duration dataset exists at .e2e-shard-durations.json (untracked) with shard_durations (identifier -> minutes). Add --durations and --order (default longest-first); sort the launch queue by duration DESC by default (LPT) so the 8 l_gen2_migration_* shards (117-194 min) start first. Shards missing -> median. If the file is absent, fall back to file order. Note in the PR body that this dataset should later replace the stale 2022 scripts/cci-test-timings.data.json (do NOT modify it here). Commit as a follow-up, conventional message, no --no-verify.

sarayev changed the title ~~feat(cli): standalone per-shard e2e launcher (cap-25 orchestrator bypass)~~ feat(cli): standalone per-shard e2e launcher Jun 23, 2026

sarayev added 2 commits June 23, 2026 13:00

sarayev force-pushed the feat/e2e-standalone-launcher branch from 8cf87d5 to 786b464 Compare June 23, 2026 13:01

sarayev wants to merge 2 commits into dev from feat/e2e-standalone-launcher
