Verify deployment correctness across remaining ESP32 family chips after DIO bootloader fix #19

@zackees

Background

Commit 809fca0 (fix(deploy): force DIO bootloader, add verify-flash skip, harden daemon port recovery) changed how every ESP32-family chip resolves its bootloader flash mode and how BoardConfig reports flash_mode. The fix unblocked ESP32-S3 (which was watchdog-resetting in a loop with Saved PC pointing into ROM 0x400454d5 whenever the bootloader was built as QIO), but the change is global:

  • crates/fbuild-build/src/esp32/orchestrator.rs: the second-stage bootloader is now forced to dio for every ESP32 chip whose app flash_mode != "opi".
  • crates/fbuild-config/src/board.rs: for any MCU starting with esp32, the JSON-shipped flash_mode is intentionally dropped; only an explicit env override (board_build.flash_mode = qio) survives. Downstream code falls back to mcu_config.default_flash_mode() (currently "dio" for the entire family).
  • crates/fbuild-deploy/src/esp32.rs: the new try_verify_deployment() / build_verify_flash_args() skip the re-flash when the candidate image already matches.
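The first two rules in the list above can be summarised in a short Rust sketch. The function names and signatures are hypothetical and are not taken from orchestrator.rs or board.rs. The sketch also assumes that an OPI app keeps an OPI bootloader, and that non-ESP32 MCUs keep their JSON-shipped value; neither point is stated in the commit description.

```rust
/// Hypothetical helper: which flash mode the second-stage bootloader is built for.
/// Assumption: an "opi" app keeps an "opi" bootloader; everything else is forced to "dio".
fn bootloader_flash_mode(app_flash_mode: &str) -> &'static str {
    if app_flash_mode == "opi" {
        "opi"
    } else {
        "dio"
    }
}

/// Hypothetical helper: which flash mode the application image ends up with.
/// `family_default` stands in for mcu_config.default_flash_mode().
fn effective_flash_mode(
    mcu: &str,
    json_flash_mode: Option<&str>,
    env_override: Option<&str>,
    family_default: &str,
) -> String {
    // An explicit env override (board_build.flash_mode = ...) always survives.
    if let Some(mode) = env_override {
        return mode.to_string();
    }
    // For the ESP32 family the JSON-shipped value is intentionally dropped.
    if mcu.starts_with("esp32") {
        return family_default.to_string();
    }
    // Assumption: other MCUs keep whatever their board JSON ships.
    json_flash_mode.unwrap_or(family_default).to_string()
}
```

Under these assumptions, an esp32s3 board whose JSON ships qio resolves to dio unless the env section sets board_build.flash_mode = qio, and its bootloader is built as dio in both cases.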

The only hardware-gated test (try_verify_deployment_real_esp32s3, --ignored) targets a single ESP32-S3 on COM13. We have zero physical confirmation that the other eight ESP32 MCUs still boot, flash, and verify correctly under the new pipeline.

Goal

Confirm that a clean build + deploy + verify cycle succeeds end-to-end on each ESP32-family MCU we ship a config for, and that verify-flash correctly returns Match immediately after a fresh deploy.

MCUs to cover

One representative dev board per MCU. Suggestions in parentheses are common boards from crates/fbuild-config/data/boards/ — substitute whatever hardware is on hand.

  • esp32 (e.g. esp32dev, ESP32-DevKitC)
  • esp32s2 (e.g. esp32-s2-saola-1)
  • esp32s3 ✅ already covered by try_verify_deployment_real_esp32s3 on COM13 — re-run as a regression baseline
  • esp32c2 (e.g. esp32-c2-devkitm-1)
  • esp32c3 (e.g. esp32-c3-devkitm-1)
  • esp32c5 (if hardware available — newer part, may be skipped)
  • esp32c6 (e.g. esp32-c6-devkitc-1, also covers the NightDriverStrip demo_c6 env)
  • esp32h2 (e.g. esp32-h2-devkitm-1)
  • esp32p4 (e.g. esp32-p4-function-ev): the only OPI-flash family member; exercises the app_flash_mode == "opi" branch in orchestrator.rs

Per-board checklist

For each MCU above:

  1. Clean build — wipe .fbuild cache for the target, run uv run cargo run -p fbuild-cli -- build <board> against the FastLED reference sketch.
  2. Inspect the bootloader BIN header — byte 0x02 of bootloader.bin encodes the flash mode (0x00=qio, 0x01=qout, 0x02=dio, 0x03=dout, 0x04=opi). Confirm it is 0x02 for every chip except esp32p4, which should be 0x04.
  3. Deploy: run fbuild deploy <board> --port <COMx>. Watch for the watchdog-reset symptom (Saved PC in ROM, rst:0x10/rst:0x7, repeating boot banner). A healthy boot ends in the application's first serial output.
  4. Run try_verify_deployment against the just-flashed image: it must return VerifyOutcome::Match in under 15 s. Add a #[test] #[ignore] mirror of try_verify_deployment_real_esp32s3 per chip if the board is dedicated CI hardware.
  5. Tampered-image negative test — flip a byte in the middle of firmware.bin, re-run verify, must return VerifyOutcome::Mismatch (not Err).
  6. QIO opt-in regression — set board_build.flash_mode = qio in the env section, rebuild, confirm the application firmware.bin header byte 0x02 is 0x00 (qio) but bootloader.bin header byte 0x02 is still 0x02 (dio). This is the contract the fix establishes: app mode is user-controllable, bootloader mode is not (except for OPI).
  7. Daemon port recovery — pull and reinsert USB during a deploy; the daemon should recover and complete on retry (this exercises the port_recovery.rs test path on real hardware).
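Steps 2, 5 and 6 can be scripted rather than checked by hand in a hex editor. The following is a minimal sketch, not code from the fbuild crates: all function names are hypothetical. The byte-to-mode mapping is copied from step 2. ESP-IDF's esp_image_spi_mode_t names the value 0x04 FAST_READ rather than OPI, so the opi entry should be treated as this issue's assumption until it is confirmed against a real esp32p4 image. The magic byte 0xE9 at offset 0 is the standard ESP image header marker.

```rust
use std::fs;
use std::path::Path;

/// First byte of every ESP application/bootloader image header.
const ESP_IMAGE_MAGIC: u8 = 0xE9;
/// Offset of the flash-mode byte inside the image header.
const FLASH_MODE_OFFSET: usize = 2;

/// Map the header byte to a mode name, using the table from step 2.
fn flash_mode_name(byte: u8) -> Option<&'static str> {
    match byte {
        0x00 => Some("qio"),
        0x01 => Some("qout"),
        0x02 => Some("dio"),
        0x03 => Some("dout"),
        0x04 => Some("opi"), // as listed in step 2; an assumption, see lead-in
        _ => None,
    }
}

/// Decode the flash mode from an in-memory image.
fn flash_mode_from_image(image: &[u8]) -> Result<&'static str, String> {
    if image.len() <= FLASH_MODE_OFFSET {
        return Err(format!("image too short: {} bytes", image.len()));
    }
    if image[0] != ESP_IMAGE_MAGIC {
        return Err(format!("bad magic byte {:#04x}", image[0]));
    }
    let byte = image[FLASH_MODE_OFFSET];
    flash_mode_name(byte).ok_or_else(|| format!("unknown flash mode byte {:#04x}", byte))
}

/// Read a .bin from disk and decode its flash mode (steps 2 and 6).
fn read_flash_mode(path: &Path) -> Result<&'static str, String> {
    let image = fs::read(path).map_err(|e| format!("{}: {}", path.display(), e))?;
    flash_mode_from_image(&image)
}

/// Return a copy of the image with the middle byte inverted (step 5).
fn tampered_copy(image: &[u8]) -> Vec<u8> {
    let mut out = image.to_vec();
    if let Some(byte) = out.get_mut(image.len() / 2) {
        *byte ^= 0xFF;
    }
    out
}
```

For step 6, read_flash_mode would be called once on firmware.bin (expecting "qio") and once on bootloader.bin (expecting "dio"). For step 5, the output of tampered_copy is written to a scratch file and passed to verify instead of the original image.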

Acceptance criteria

  • Every checked MCU above either passes the seven steps or has a tracking sub-issue explaining why it's blocked (no hardware, broken board JSON, missing toolchain, etc.).
  • A new tests/esp32_family_verify.rs (or per-chip #[ignore] tests in crates/fbuild-deploy/src/esp32.rs) documents the COM port + reference firmware path for each chip we have on hand, so the regression suite can be re-run on demand.
  • If any MCU regresses, file a focused fix issue and reference it back here.
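One way to record which chips have hardware attached is a small table keyed by MCU, with the serial port supplied per machine. This is only a sketch: the FBUILD_TEST_PORT_<MCU> environment variable names and every function below are hypothetical, and nothing here exists in the repository today. The expected bootloader bytes are the ones stated in step 2 of the checklist.

```rust
/// The nine MCUs listed under "MCUs to cover".
const ESP32_FAMILY: [&str; 9] = [
    "esp32", "esp32s2", "esp32s3", "esp32c2", "esp32c3",
    "esp32c5", "esp32c6", "esp32h2", "esp32p4",
];

/// Hypothetical convention: one env var per chip names the serial port.
fn port_env_var(mcu: &str) -> String {
    format!("FBUILD_TEST_PORT_{}", mcu.to_uppercase())
}

/// Bootloader header byte expected by step 2 of the checklist.
fn expected_bootloader_byte(mcu: &str) -> u8 {
    if mcu == "esp32p4" {
        0x04
    } else {
        0x02
    }
}

/// Split the family into chips with a configured port and chips that are blocked.
/// `lookup` is injected so the function can be exercised without touching the
/// real process environment.
fn split_by_hardware(
    lookup: impl Fn(&str) -> Option<String>,
) -> (Vec<(&'static str, String)>, Vec<&'static str>) {
    let mut ready = Vec::new();
    let mut blocked = Vec::new();
    for mcu in ESP32_FAMILY.iter().copied() {
        match lookup(&port_env_var(mcu)) {
            Some(port) if !port.is_empty() => ready.push((mcu, port)),
            _ => blocked.push(mcu),
        }
    }
    (ready, blocked)
}
```

In an #[ignore] test this would be called as split_by_hardware(|var| std::env::var(var).ok()). Each entry in the blocked list then corresponds to an MCU that needs a tracking sub-issue under the first acceptance criterion.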

Why this matters

The commit message for 809fca0 calls out the symptom ("watchdog reset loop with Saved PC pointing into ROM 0x400454d5 on ESP32-S3") but the root cause — ROM bootloader can only fetch the second-stage bootloader in DIO/OPI — applies to every ESP32 chip. There is a real risk that:

  • An MCU we don't physically test was working only because its board JSON happened to ship flash_mode: dio and is now silently picking up the new code path, OR
  • An MCU was working with QIO because its specific flash chip honoured the QIE bit and we've now downgraded it to DIO without performance justification.

We need physical confirmation, not unit tests, to close this out.

References

  • 809fca0 — the fix being verified
  • crates/fbuild-build/src/esp32/orchestrator.rs: the BuildOrchestrator for Esp32Orchestrator block, bootloader ELF selection (bootloader_<mode>_<freq>.elf)
  • crates/fbuild-config/src/board.rs: the is_esp32_family flash_mode override logic in both from_board_id and the boards.txt parser path
  • crates/fbuild-deploy/src/esp32.rs::try_verify_deployment — verify-flash command builder + outcome enum
  • crates/fbuild-daemon/tests/port_recovery.rs — daemon-side recovery test that needs hardware coverage
