From 4dcfb95670d9e44734ebc18a4f4d9ad598aacf5d Mon Sep 17 00:00:00 2001 From: Ralf Anton Beier Date: Sat, 27 Jun 2026 10:24:49 +0200 Subject: [PATCH] =?UTF-8?q?test(vcr-oracle):=20#509=20oracle-first=20?= =?UTF-8?q?=E2=80=94=20characterize=20value-returning-branch=20drop=20on?= =?UTF-8?q?=20the=20shipped=20path=20(#242,=20#509)?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit No new external signal this issue-hunt pass (gates #418/#472/#396-emitter still await external triggers; no new issues/comments since watermark; deps unchanged). Bounded behavior-frozen #242 increment: land the oracle for the confirmed, still-open, SHIPPED-path #509 miscompile (distinct defect class from #518 — block-arity value-carry, not i64 params). #509: the DIRECT selector (so the --relocatable/shipped path) drops the CARRIED VALUE of a value-returning branch. Measured vs wasmtime under unicorn: - br_table value-carry (pick): dispatched arms drop the carry — pick(1) -> 2 vs wasmtime 12 (returns 0+2, the carried 10 lost). Broken on BOTH paths. - br_if value-carry (pick_brif): taken arm drops the carry — pick_brif(1) -> 0 vs 10 on the direct/shipped path. - ctrl (branchless value-returning block): correct on both paths — the discriminator proving the defect is the branch value-carry, not result-typed blocks in general. Richer than the issue stated: the DEFAULT (optimized) path handles br_if value-carry but still drops br_table value-carry, so the bug is non-uniform. Frozen-safe (no codegen touched; frozen anchors 3/3 green): - scripts/repro/br_value_509.wat + br_table_value_509_differential.py: stable characterization (EXPECT_MISCOMPILE; exit 0 = bug-as-documented). The gated block-arity fix flips it to the correctness gate. - Corrected a STALE over-claim: br_table_507_differential.py's header advertised a "value-returning" shape its FIXTURES never contained (only the two void-dispatch shapes) — now points at the #509 oracle for that case. - roadmap VCR-ORACLE-001 #509 entry: record the oracle + measurements. The byte-changing block-arity fix stays a separate gated step (decoder -> WasmOp::Block/Loop/If arity + result-register convention), not this idle tick. Co-Authored-By: Claude Opus 4.8 --- artifacts/verified-codegen-roadmap.yaml | 17 ++ scripts/repro/br_table_507_differential.py | 8 +- .../repro/br_table_value_509_differential.py | 176 ++++++++++++++++++ scripts/repro/br_value_509.wat | 38 ++++ 4 files changed, 237 insertions(+), 2 deletions(-) create mode 100644 scripts/repro/br_table_value_509_differential.py create mode 100644 scripts/repro/br_value_509.wat diff --git a/artifacts/verified-codegen-roadmap.yaml b/artifacts/verified-codegen-roadmap.yaml index 3380b79..c64cfa0 100644 --- a/artifacts/verified-codegen-roadmap.yaml +++ b/artifacts/verified-codegen-roadmap.yaml @@ -728,6 +728,23 @@ artifacts: sharpest evidence yet that the hand-written selectors lack the structured block-arity model a verified DSL would carry by construction. Cross-cutting, byte-changing, silicon-gated — never an idle-tick increment. + #509 ORACLE-FIRST (2026-06-27, frozen-safe, NOT yet fixed): the + characterization oracle now exists in-tree — + `scripts/repro/br_table_value_509_differential.py` + + `br_value_509.wat` (value-returning `br_table` `pick`, value-returning + `br_if` `pick_brif`, and a branchless value-returning `ctrl` control). Measured + vs wasmtime under unicorn on BOTH paths: the SHIPPED direct path drops the + carried value on every dispatched/taken arm (`pick(1)`→2 vs 12, `pick_brif(1)` + →0 vs 10) while the branchless `ctrl` stays correct — isolating the defect to + the branch value-carry, not result-typed blocks in general. A richer fact than + the issue stated: the DEFAULT (optimized) path handles `br_if` value-carry but + still drops `br_table` value-carry, so the bug is not uniform across paths. The + script is a stable characterization (exit 0 = bug-as-documented); the gated + block-arity fix flips `EXPECT_MISCOMPILE`→False and it becomes the correctness + gate. Also corrected a STALE over-claim: `br_table_507_differential.py`'s + header advertised a "value-returning" shape its `FIXTURES` never contained + (only the two void-dispatch shapes) — now points at the #509 oracle for the + carried-value case. Still byte-changing / background — its own gated step. #518 ORACLE-FIRST (2026-06-27, third arc member, NOT yet fixed): an i64 binop reading an i64 PARAM silently miscompiles on BOTH selectors — the same `func_58`-class i64-pair-register wall the #503 falcon correction named, now diff --git a/scripts/repro/br_table_507_differential.py b/scripts/repro/br_table_507_differential.py index 6bae663..2b37a4f 100644 --- a/scripts/repro/br_table_507_differential.py +++ b/scripts/repro/br_table_507_differential.py @@ -44,8 +44,12 @@ CODE, STK, RET = 0x100000, 0x900000, 0x300000 LIN_SIZE = 0x10000 # 1 page -# Three br_table shapes (sequential targets, shared-tail, value-returning) — the -# three gale reproduced. Each stores a selector-dependent marker to mem[0]. +# Two VOID-dispatch br_table shapes (sequential targets, shared-tail) — each +# stores a selector-dependent marker to mem[0]. The VALUE-RETURNING br_table +# shape (a carried operand to a result-typed block) is a DISTINCT, still-open +# defect (#509) covered by `br_table_value_509_differential.py`, not here: #507 +# is the void-dispatch decline-to-direct fix, and the carried-value case needs +# block-arity threading the #507 shortcut cannot provide. FIXTURES = { # sequential distinct targets "seq": """(module (memory (export "memory") 1) diff --git a/scripts/repro/br_table_value_509_differential.py b/scripts/repro/br_table_value_509_differential.py new file mode 100644 index 0000000..49c9b96 --- /dev/null +++ b/scripts/repro/br_table_value_509_differential.py @@ -0,0 +1,176 @@ +#!/usr/bin/env python3 +"""#509 (epic #242) — CHARACTERIZE the value-returning-branch miscompile. + +The oracle-first artifact for #509: the DIRECT selector (so the SHIPPED +`--relocatable` path) drops the CARRIED VALUE of a value-returning branch. A +`br`/`br_if`/`br_table` that targets a result-typed block and carries an operand +returns 0 (value lost) for the dispatched/taken arm; wasmtime keeps it. This does +NOT change codegen — it compiles the in-tree `br_value_509.wat`, runs each export +under unicorn (UC_ARCH_ARM / Thumb), and compares to wasmtime, on BOTH the direct +(`--relocatable`) and default paths. + +ROOT CAUSE (per #509 / the roadmap): the IR carries no block-result arity +(`WasmOp::Block` is a unit variant; the decoder tracks none), so the selector +cannot thread a carried value to the block join. The #507 decline-to-direct +shortcut does NOT apply (it cannot distinguish a CARRIED result from an UNWOUND +void-target value, and over-refuses valid void code). The real fix threads +block-result arity (decoder -> WasmOp::Block/Loop/If -> selector) plus a +result-register convention generalizing the #313 if/else reconciliation — +byte-changing, background-priority, its own gated step (NOT an idle tick). + +DISCRIMINATOR / non-vacuity: `ctrl` is a value-returning block with NO branch — +its value flows straight to the block end and is lowered correctly. So a failure +isolates to the branch value-carry, not result-typed blocks in general. While the +bug is live, `pick`/`pick_brif` diverge on the dispatched/taken arm and `ctrl` +stays correct; this script EXITS 0 when that characterization holds and 1 if +reality diverges from it (e.g. `ctrl` broke, or a value-carry case silently +started matching — which, without the real fix, would mean the harness went +blind). When the fix lands, flip EXPECT_MISCOMPILE->False and the same script +becomes the correctness gate (every path must then match wasmtime). + +Run (needs wasmtime + unicorn + pyelftools): + SYNTH=./target/debug/synth python scripts/repro/br_table_value_509_differential.py +""" +import os +import subprocess +import sys +from pathlib import Path + +import wasmtime +from elftools.elf.elffile import ELFFile +from unicorn import UC_ARCH_ARM, UC_MODE_THUMB, Uc, UcError +from unicorn.arm_const import ( + UC_ARM_REG_LR, + UC_ARM_REG_R0, + UC_ARM_REG_R11, + UC_ARM_REG_SP, +) + +WAT = Path(__file__).with_name("br_value_509.wat") +SYNTH = os.environ.get("SYNTH", "./target/debug/synth") +CODE, STK, RET = 0x100000, 0x900000, 0x300000 + +# Flip to False in the gated fix PR; the script then enforces correctness. +EXPECT_MISCOMPILE = True + +# (export, args) — the dispatched/taken arms are the value-carry cases. +CASES = [ + ("pick", [0, 1, 2, 5]), # br_table; sel>=1 dispatched arms drop the carry + ("pick_brif", [0, 1, 7]), # br_if; taken (arg!=0) arms drop the carry + ("ctrl", [0, 1, 7]), # branchless value block — control, must stay OK +] +VALUE_CARRY_FNS = {"pick", "pick_brif"} + + +def wasmtime_run(fn, arg): + engine = wasmtime.Engine() + module = wasmtime.Module.from_file(engine, str(WAT)) + store = wasmtime.Store(engine) + inst = wasmtime.Instance(store, module, []) + return inst.exports(store)[fn](store, arg) + + +def compile_synth(out, relocatable): + cmd = [SYNTH, "compile", str(WAT), "-o", out, "-b", "arm", + "--target", "cortex-m4", "--all-exports"] + if relocatable: + cmd.append("--relocatable") + r = subprocess.run(cmd, capture_output=True, text=True, + env={"PATH": "/usr/bin:/bin"}) + if r.returncode != 0 or "skipping" in r.stderr: + sys.exit(f"compile failed/skipped (reloc={relocatable}): {r.stderr}") + + +def load(elf): + f = ELFFile(open(elf, "rb")) + text = f.get_section_by_name(".text") + code, base = text.data(), text["sh_addr"] + syms = {} + for sec in f.iter_sections(): + if sec.header.sh_type == "SHT_SYMTAB": + for sy in sec.iter_symbols(): + if sy.name and sy["st_info"]["type"] == "STT_FUNC": + syms[sy.name] = sy["st_value"] + return code, base, syms + + +def unicorn_run(code, base, faddr, arg): + foff = (faddr & ~1) - base + mu = Uc(UC_ARCH_ARM, UC_MODE_THUMB) + mu.mem_map(CODE, 0x20000) + mu.mem_map(STK - 0x10000, 0x20000) + mu.mem_map(RET & ~0xFFF, 0x1000) + mu.mem_write(CODE, code) + mu.reg_write(UC_ARM_REG_SP, STK) + mu.reg_write(UC_ARM_REG_R11, 0x20000000) + mu.reg_write(UC_ARM_REG_R0, arg & 0xFFFFFFFF) + mu.reg_write(UC_ARM_REG_LR, RET | 1) + try: + mu.emu_start((CODE + foff) | 1, RET, count=100000) + except UcError as e: + return f"ERR:{e}" + return mu.reg_read(UC_ARM_REG_R0) & 0xFFFFFFFF + + +def run_path(label, relocatable): + out = f"/tmp/br509_{'rel' if relocatable else 'opt'}.o" + compile_synth(out, relocatable) + code, base, syms = load(out) + print(f"\n=== {label} path ===") + carry_div = 0 + ctrl_ok = True + for fn, args in CASES: + faddr = syms.get(fn) + if faddr is None: + print(f" {fn}: SYMBOL MISSING") + if fn in VALUE_CARRY_FNS: + carry_div += 1 + else: + ctrl_ok = False + continue + for arg in args: + exp = wasmtime_run(fn, arg) + got = unicorn_run(code, base, faddr, arg) + match = isinstance(got, int) and got == exp + if not match: + if fn in VALUE_CARRY_FNS: + carry_div += 1 + else: + ctrl_ok = False + print(f" [{'ok ' if match else 'BUG'}] {fn}({arg}) -> {got} " + f"(wasmtime {exp})" + + ("" if match else " [carried value dropped]")) + return carry_div, ctrl_ok + + +def main(): + rel_div, rel_ctrl = run_path("DIRECT (--relocatable / shipped)", relocatable=True) + opt_div, opt_ctrl = run_path("DEFAULT (optimized)", relocatable=False) + + print("\n--- characterization verdict ---") + if EXPECT_MISCOMPILE: + ok = rel_div > 0 and rel_ctrl and opt_ctrl + print(f"direct value-carry divergences={rel_div} (ctrl_ok={rel_ctrl}) " + f"default ctrl_ok={opt_ctrl} (value-carry div={opt_div})") + if ok: + print("RESULT: PASS — #509 reproduces: a value carried over a branch to " + "a result-typed block is dropped on the SHIPPED direct path, while " + "the branchless control stays correct. Oracle ready for the gated " + "block-arity fix.") + sys.exit(0) + print("RESULT: FAIL — reality diverged from the #509 root cause (the " + "branchless control broke, or value-carry stopped miscompiling on the " + "direct path). Re-examine before trusting the fix spec.") + sys.exit(1) + else: + total = rel_div + opt_div + (0 if rel_ctrl else 1) + (0 if opt_ctrl else 1) + if total == 0: + print("RESULT: PASS — every value-returning branch now matches wasmtime " + "on both paths. #509 fixed.") + sys.exit(0) + print(f"RESULT: FAIL — {total} residual divergence(s); #509 not fully fixed.") + sys.exit(1) + + +if __name__ == "__main__": + main() diff --git a/scripts/repro/br_value_509.wat b/scripts/repro/br_value_509.wat new file mode 100644 index 0000000..02a8fdb --- /dev/null +++ b/scripts/repro/br_value_509.wat @@ -0,0 +1,38 @@ +;; #509 — the DIRECT selector (so the SHIPPED `--relocatable` path) drops the +;; CARRIED VALUE of a value-returning branch: a `br`/`br_if`/`br_table` that +;; targets a result-typed block and carries an operand returns 0 (the value is +;; lost) for the dispatched/taken arm. wasmtime keeps it. The IR carries no +;; block-result arity (`WasmOp::Block` is a unit variant, the decoder tracks +;; none), so the selector cannot thread the carried value to the join — the real +;; fix needs block-arity threading + a result-register convention generalizing +;; the #313 if/else reconciliation (NOT the #507 decline-to-direct shortcut, +;; which cannot tell a CARRIED result from an UNWOUND void-target value). +;; +;; This is the discriminator: a value carried OVER A BRANCH is dropped, but a +;; branchless value-returning block (`ctrl`) is correct — so the defect is the +;; branch value-carry, not result-typed blocks in general. +(module + ;; value-returning br_table: the carried `i32.const 10` should reach whichever + ;; result-typed block the selector dispatches to. sel 0 -> inner end (+1 then + ;; +2 = carried+3); sel>=1 -> outer end (carried+2). + (func (export "pick") (param i32) (result i32) + (block (result i32) + (block (result i32) + (i32.const 10) + (br_table 0 1 (local.get 0))) + (i32.const 1)(i32.add)) + (i32.const 2)(i32.add)) + + ;; value-returning br_if: taken (p != 0) -> block result is the carried 10; + ;; not-taken (p == 0) -> 20. + (func (export "pick_brif") (param i32) (result i32) + (block (result i32) + (i32.const 10) + (br_if 0 (local.get 0)) + (drop)(i32.const 20))) + + ;; CONTROL: a value-returning block with NO branch — the value flows straight + ;; to the block end, so it is lowered correctly. Proves the defect is the + ;; branch value-carry specifically (non-vacuity). + (func (export "ctrl") (param i32) (result i32) + (block (result i32) (local.get 0)(i32.const 5)(i32.add))))