
fix: stabilize Android Maestro replay interactions #805

Merged
thymikee merged 4 commits into main from codex/maestro-replay-settling on Jun 14, 2026

@thymikee

@thymikee thymikee commented Jun 14, 2026

Member

Summary

Improve Android Maestro replay stability for React Navigation CI flakes around tap-triggered transitions and horizontal tab swipes.

  • Route Android horizontal swipeScreen: direction left/right through content-lane coordinates instead of the native gesture preset.
  • Confirm Android native assertVisible wait successes with exact Maestro snapshot matching so Albums does not pass on Push albums.
  • Add failure-only Android recovery for immediate tap->assert and swipe->assert misses: retry the last relevant interaction once, cap the follow-up wait at 5s, and clear stale tap/swipe recovery state when another interaction supersedes it.
  • Dedupe the Android assertVisible recovery confirmation path, model recoverable tap/swipe state as one discriminated union slot, and relax provider-scenario snapshot-count assertions so tests cover behavior instead of exact internal capture counts.
  • Scope: 7 files, Maestro runtime/tests plus one Android provider scenario test; non-Maestro commands do not enter this runtime path.
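The exact-match confirmation described in the second bullet can be illustrated with a minimal sketch. It assumes the native wait matches labels loosely (here, a case-insensitive substring), which would explain Albums passing on Push albums. The `SnapshotNode` shape and both helper names are hypothetical and are not the runtime's actual types.

```typescript
// Hypothetical snapshot node shape; the real Maestro snapshot type is not shown in this PR.
interface SnapshotNode {
  text?: string;
  children?: SnapshotNode[];
}

// Collect every text label in the tree, depth-first.
function collectTexts(node: SnapshotNode, out: string[] = []): string[] {
  if (node.text !== undefined) out.push(node.text);
  for (const child of node.children ?? []) collectTexts(child, out);
  return out;
}

// Loose check (assumed native behavior): any label containing the target.
function hasSubstringMatch(root: SnapshotNode, target: string): boolean {
  const needle = target.toLowerCase();
  return collectTexts(root).some((t) => t.toLowerCase().includes(needle));
}

// Exact check: a label must equal the target after trimming whitespace.
function hasExactMatch(root: SnapshotNode, target: string): boolean {
  const wanted = target.trim();
  return collectTexts(root).some((t) => t.trim() === wanted);
}

// A screen that shows "Push albums" but not the "Albums" screen itself.
const screen: SnapshotNode = {
  children: [{ text: 'Push albums' }, { text: 'Go back' }],
};

const looseResult = hasSubstringMatch(screen, 'Albums'); // true: the false positive
const exactResult = hasExactMatch(screen, 'Albums'); // false: confirmation rejects it
```

Running the exact check only after the native wait reports success keeps the added cost to one snapshot on the passing path, which is the trade-off the performance note below describes.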

Validation

Local checks:

  • ./node_modules/.bin/vitest run --project provider-integration passed: 19 files, 48 tests.
  • ./node_modules/.bin/vitest run src/compat/maestro/__tests__/runtime-interactions.test.ts src/compat/maestro/__tests__/runtime-assertions.test.ts src/daemon/handlers/__tests__/session-replay-vars.test.ts passed outside the sandbox: 3 files, 111 tests.
  • ./node_modules/.bin/oxfmt --check test/integration/provider-scenarios/android-test-suite.test.ts passed.
  • ./node_modules/.bin/oxlint . --deny-warnings passed.
  • ./node_modules/.bin/tsc -p tsconfig.json passed.
  • Earlier on this branch: pnpm build:all passed.
  • Local React Navigation Android full suite under CI-like Pixel 5 dimensions passed: 38/38, 0 retries, 420.1s.

Device/CI evidence:

Performance/scope note:

  • Normal passing Android Maestro assertions pay one extra exact snapshot confirmation after native wait success. Tap/swipe recovery only runs after assertion failure. Non-Maestro runs are performance-neutral because the changed runtime code is under src/compat/maestro/* and is only used by test --maestro replay.

@github-actions

github-actions Bot commented Jun 14, 2026


Size Report

| Metric | Base | Current | Diff |
| --- | --- | --- | --- |
| JS raw | 1.3 MB | 1.3 MB | +2.7 kB |
| JS gzip | 413.6 kB | 414.3 kB | +725 B |
| npm tarball | 547.9 kB | 548.6 kB | +709 B |
| npm unpacked | 1.9 MB | 1.9 MB | +2.7 kB |

Startup median (7 runs, lower is better):

| Scenario | Base | Current | Diff |
| --- | --- | --- | --- |
| CLI --version | 26.8 ms | 26.1 ms | -0.7 ms |
| CLI --help | 54.0 ms | 51.2 ms | -2.8 ms |

Top changed chunks:

| Chunk | Raw diff | Gzip diff |
| --- | --- | --- |
| dist/src/session.js | +2.7 kB | +725 B |

@thymikee thymikee force-pushed the codex/maestro-replay-settling branch 3 times, most recently from c17a48b to 6385eb7 Compare June 14, 2026 10:24
@thymikee thymikee force-pushed the codex/maestro-replay-settling branch from 6385eb7 to a518ea5 Compare June 14, 2026 10:40
@thymikee

Member Author

Addressed the Android Maestro replay investigation with the branch now at a518ea564.

What changed:

  • Android horizontal Maestro screen swipes use content-lane coordinates.
  • Android native assertVisible successes are confirmed with exact Maestro snapshot matching, which prevents Albums from passing on Push albums.
  • Failure-only Android recovery now handles immediate tap->assert and swipe->assert misses, with stale tap/swipe recovery state cleared when another interaction supersedes it.
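The failure-only recovery in the last bullet can be sketched as follows. This is a simplified, synchronous illustration under stated assumptions: the callback types and the function name are hypothetical, and capping the follow-up wait is modeled as `Math.min(timeoutMs, 5000)`, which may differ from how the runtime computes it.

```typescript
// Cap for the follow-up wait after a retry (the 5s figure comes from the PR summary).
const RECOVERY_WAIT_CAP_MS = 5_000;

// Hypothetical callbacks standing in for the native wait and the replayed interaction.
type WaitVisible = (timeoutMs: number) => boolean;
type RetryInteraction = () => void;

function assertVisibleWithRecovery(
  waitVisible: WaitVisible,
  takeLastInteraction: () => RetryInteraction | undefined,
  timeoutMs: number,
): boolean {
  // Passing path: no recovery work at all.
  if (waitVisible(timeoutMs)) return true;

  // Taking the interaction also clears it, so it cannot be retried twice.
  const retry = takeLastInteraction();
  if (retry === undefined) return false;

  // Retry the last tap or swipe once, then wait again with a capped timeout.
  retry();
  return waitVisible(Math.min(timeoutMs, RECOVERY_WAIT_CAP_MS));
}
```

Because the interaction is taken out of its slot before the retry runs, a second miss returns a plain failure instead of looping.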

Validation:

  • Focused unit bundle passed: runtime-interactions.test.ts, runtime-assertions.test.ts, session-replay-vars.test.ts -> 110 tests.
  • oxfmt on touched files, oxlint --deny-warnings, tsc -p tsconfig.json, and pnpm build:all passed.
  • React Navigation focused Android CI probe passed all 3 targeted flaky flows first attempt: Native Stack - Card + Modal, Stack - Card + Modal, Tab View - Custom Tab Bar: https://github.com/react-navigation/react-navigation/actions/runs/27496272320

Residual risk:

  • Full 38-file React Navigation Android suite has not been rerun after the final patch, so I left this as draft rather than calling it merge-ready. The focused CI now covers the previously failing Native Stack and Tab View patterns cleanly.

Member Author

Reviewed the change. The design is sound: recovery state is consumed before any recursion, and remembering a tap clears the remembered swipe (and vice versa), so there is no double recovery and no retry loop. One non-regressive cleanup suggestion:

Dedupe the Android assertVisible recovery paths. retryRecentAndroidTapAfterVisibleMiss and retryRecentAndroidSwipeAfterVisibleMiss share a ~30-line identical tail (cap the follow-up wait, re-run native wait, return on success else fall back to single-snapshot), differing only in the retryTap/retrySwipe flag, and the two call sites both duplicate the tap-then-swipe sequence. Extracting a shared confirmVisibleAfterAndroidRecovery(params, args, recovery) and a recoverFromAndroidVisibleMiss(...) caller helper removes the duplication and the platform === 'android' guard repetition. With the file now at ~794 LOC (past the 500-LOC "extract before adding" guidance in AGENTS.md), this trims it by ~17 lines.

Behavior is unchanged — the recovery kind is preserved via a computed [recovery]: true key, so the existing tests pass without edits. I pushed the change to claude/pr-805-review-xy3ewd (stacked on this PR's head) for reference; cherry-pick if useful.
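As a rough illustration of the computed-key idea, the shared tail can take the recovery kind as a parameter and stamp it onto the result. The result shape, the function name, and the callback are hypothetical; only the `retryTap`/`retrySwipe` flag names and the 5s cap come from this thread.

```typescript
// The two recovery flags named in the review.
type RecoveryKind = 'retryTap' | 'retrySwipe';

// Hypothetical result shape: visibility plus whichever recovery flag applied.
interface VisibleResult {
  visible: boolean;
  retryTap?: true;
  retrySwipe?: true;
}

// Shared tail for both recovery paths: cap the wait, re-run it, tag the result.
function confirmVisibleAfterRecoverySketch(
  waitVisible: (timeoutMs: number) => boolean,
  timeoutMs: number,
  recovery: RecoveryKind,
): VisibleResult {
  const visible = waitVisible(Math.min(timeoutMs, 5_000));
  // The computed key preserves which recovery ran without a branch per kind.
  return { visible, [recovery]: true };
}

const tapResult = confirmVisibleAfterRecoverySketch(() => true, 17_000, 'retryTap');
const swipeResult = confirmVisibleAfterRecoverySketch(() => false, 2_000, 'retrySwipe');
```

Callers that previously checked `result.retryTap` or `result.retrySwipe` see the same keys as before, which is why existing tests would not need edits under this approach.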

Validation on that branch: focused bundle 110/110, and oxfmt / tsc -p tsconfig.json / oxlint --deny-warnings all clean.

Also rename assertVisibleRetryTapTimeoutMs to assertVisibleRetryTimeoutMs, since it now covers both tap and swipe recovery.

Nit (skip if not worth the churn): MaestroRetryTapTarget.point is typed { x: number; y: number } where the imported Point would do.


Generated by Claude Code

@thymikee

Member Author

Addressed the cleanup review in commit 8aaaaa6df:

  • Deduped Android assertVisible tap/swipe recovery through a shared recovery dispatcher and confirmation helper.
  • Renamed the retry timeout helper from tap-specific to generic.
  • Reused the shared Point type for retry tap coordinates.

Checks after the change: focused Maestro replay bundle 110/110, oxfmt on the touched file, oxlint --deny-warnings, and tsc -p tsconfig.json all pass. I am rerunning the full React Navigation Android suite against this new head now.

@thymikee thymikee marked this pull request as ready for review June 14, 2026 11:36
@thymikee

Member Author

Final validation update for 8aaaaa6df:

  • Addressed the cleanup review: shared Android assertVisible recovery confirmation for tap/swipe, renamed the retry timeout to generic naming, and reused the shared Point type.
  • Static/local checks still pass: focused Maestro replay bundle 110/110, oxfmt on touched file(s), oxlint --deny-warnings, tsc -p tsconfig.json, and pnpm build:all.
  • Local React Navigation Android full suite under CI-like Pixel 5 dimensions passed 38/38 with 0 retries in 420.1s. A wider local emulator run was 37/38 because Drawer - Master Detail can hit a layout-specific ambiguous Go back; the same drawer flow passes under Pixel 5 dimensions.
  • Full React Navigation Android CI probe passed: 38 passed, 0 failed, 1 flaky in 1335.5s: https://github.com/react-navigation/react-navigation/actions/runs/27496812112/job/81272293307
  • The one CI flaky case was Drawer - Master Detail, passing on attempt 3 after two Article tap misses. The originally targeted Native Stack/Card Modal, Stack/Card Modal, Material Top Tabs, and Tab View cases passed in the full CI run.

Scope/perf: changes remain under Maestro replay runtime/tests. Non-Maestro runs do not enter this code path. Passing Android Maestro assertions add one exact snapshot confirmation after native wait success; tap/swipe recovery is failure-only. I marked the PR ready for review with the drawer suite flake called out as residual risk.

@thymikee

Member Author

Merge-readiness cleanup pushed in 64809fec0.

What changed:

  • Updated the Android provider-scenario tests that failed CI to assert snapshot-count ranges instead of exact internal capture counts. This matches the adjacent existing range assertion and keeps the behavior checks intact while allowing the extra exact-match snapshot introduced by Android Maestro assertion verification/recovery.
  • No runtime behavior changes in this commit. Branch scope is now 7 files: Maestro runtime/tests plus one provider scenario test.
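A range assertion of the kind described in the first bullet might look like the sketch below. The helper name and the numbers are illustrative only and do not come from the test file. In Vitest the same check is typically written with `expect(n).toBeGreaterThanOrEqual(min)` and `expect(n).toBeLessThanOrEqual(max)`; a plain helper is used here so the snippet runs without a test runner.

```typescript
// Hypothetical helper: accept any capture count inside an inclusive range.
function assertCountInRange(label: string, actual: number, min: number, max: number): void {
  if (actual < min || actual > max) {
    throw new Error(`${label}: expected ${min}..${max}, got ${actual}`);
  }
}

// Illustrative numbers only: the behavior needs at least one snapshot, and the
// exact-match confirmation may add one more.
assertCountInRange('snapshot captures', 1, 1, 2); // without the extra confirmation
assertCountInRange('snapshot captures', 2, 1, 2); // with the extra confirmation
```

The lower bound still fails the test if the behavior stops capturing snapshots, while the upper bound catches runaway capture loops.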

Validation after the cleanup:

  • ./node_modules/.bin/vitest run --project provider-integration passed: 19 files, 48 tests.
  • Focused Maestro/runtime bundle passed outside the sandbox: 3 files, 110 tests.
  • ./node_modules/.bin/oxfmt --check test/integration/provider-scenarios/android-test-suite.test.ts passed.
  • ./node_modules/.bin/oxlint . --deny-warnings passed.
  • ./node_modules/.bin/tsc -p tsconfig.json passed.

Notes:

  • pnpm format could not run locally because the pnpm shim tried to verify/download pnpm 11.1.2 and the registry is unavailable in this sandbox; I ran the local oxfmt binary directly for the touched file.
  • A local full vitest run --coverage attempt progressed past the fixed provider assertions but timed out three unrelated Android fill tests under coverage. Those tests passed in the previous PR CI log, so I am relying on the restarted PR Coverage job for the final signal.

@thymikee

Member Author

Zoom-out refactor pass pushed in 1b1c89b3c.

What I changed:

  • Replaced the separate recentTap and recentSwipe WeakMaps with one MaestroRecoverableInteraction discriminated union slot. That matches the runtime invariant better: only the last recoverable Maestro interaction should be eligible for assertion recovery.
  • Recovery now consumes one union value and dispatches on kind, so tap/swipe mutual exclusion is enforced by the data shape instead of by paired clear calls.
  • Successful fuzzy taps and percentage-point taps now clear the recoverable interaction slot, preventing an older tap from being retried after a newer non-recoverable tap path.
  • Added a regression test for stale tap recovery after a fuzzy tap.
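The single-slot model above can be sketched roughly as follows. The type name `MaestroRecoverableInteraction` comes from this comment; the field names, the helper functions, and the use of a plain object as the WeakMap key are assumptions made for illustration.

```typescript
interface Point {
  x: number;
  y: number;
}

// One union value replaces the separate tap and swipe records.
type MaestroRecoverableInteraction =
  | { kind: 'tap'; point: Point }
  | { kind: 'swipe'; direction: 'left' | 'right' };

// One slot per session object (hypothetical key), so only the last
// recoverable interaction can ever be eligible for recovery.
const recoverable = new WeakMap<object, MaestroRecoverableInteraction>();

function rememberInteraction(session: object, interaction: MaestroRecoverableInteraction): void {
  recoverable.set(session, interaction); // overwrites any earlier tap or swipe
}

// Called by non-recoverable paths (for example a fuzzy tap) to drop stale state.
function clearInteraction(session: object): void {
  recoverable.delete(session);
}

// Read and clear in one step so a recovery attempt cannot repeat.
function consumeInteraction(session: object): MaestroRecoverableInteraction | undefined {
  const interaction = recoverable.get(session);
  recoverable.delete(session);
  return interaction;
}

// Dispatch on the discriminant; the compiler narrows each branch.
function describeRecovery(interaction: MaestroRecoverableInteraction): string {
  switch (interaction.kind) {
    case 'tap':
      return `retry tap at ${interaction.point.x},${interaction.point.y}`;
    case 'swipe':
      return `retry swipe ${interaction.direction}`;
  }
}
```

With two separate maps, every writer had to remember to clear the other map. With one slot, a later interaction replaces the earlier one by construction.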

Validation:

  • ./node_modules/.bin/vitest run src/compat/maestro/__tests__/runtime-interactions.test.ts src/compat/maestro/__tests__/runtime-assertions.test.ts src/daemon/handlers/__tests__/session-replay-vars.test.ts passed outside the sandbox: 3 files, 111 tests.
  • ./node_modules/.bin/vitest run --project provider-integration passed: 19 files, 48 tests.
  • ./node_modules/.bin/oxlint . --deny-warnings passed.
  • ./node_modules/.bin/tsc -p tsconfig.json passed.

I did not split a new module in this pass. The next architecture step, if we keep growing this area, should be extracting Android assertion recovery from runtime-assertions.ts into a focused runtime-recovery/runtime-android-recovery module.

@thymikee thymikee merged commit a6bb865 into main Jun 14, 2026
19 checks passed
@thymikee thymikee deleted the codex/maestro-replay-settling branch June 14, 2026 13:46
@github-actions

PR Preview Action v1.8.1
Preview removed because the pull request was closed.
2026-06-14 13:46 UTC
