fix(#26): unify ProvenanceRecord/ProvenanceEntry #103

Merged
hyperpolymath merged 1 commit into main from fix/26-dedup-provenance-type on May 16, 2026

@hyperpolymath
Owner

Canonical type: `abi::ProvenanceEntry`.

Rationale — `abi::ProvenanceEntry` is the richer, persistence-boundary
type: it carries the full chain API (`genesis` / `chain` / `verify` /
domain-tagged length-prefixed `compute_hash`), is the type the SQLite
write path (`append_provenance` / `verify_chain`) and the Idris2 ABI /
Zig FFI bridge are written against, and is the type the threat-model
doc names as the implementation. `tier1::provenance::ProvenanceRecord`
was a byte-for-byte duplicate struct (identical 8 fields) whose
`compute_hash`/`verify` had already been reduced to thin shims that
just delegated to `ProvenanceEntry`. It was orphaned — nothing in the
tree constructed it.

No field divergence: the two structs had identical fields and field
order, so this is a pure dedup, not a unification of diverged shapes.
No `From`/alias impls were needed.
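
As a rough illustration of the chain API named above (`genesis` / `chain` / `verify` / domain-tagged length-prefixed `compute_hash`), here is a minimal sketch. It is not the project's implementation: the field set, the domain tag, and the method signatures are assumptions, and FNV-1a stands in for the real digest only so the example needs no external crates.

```rust
// Illustrative sketch only. Field names, DOMAIN_TAG and signatures are
// assumptions; FNV-1a (64-bit) is a stand-in and is NOT cryptographically secure.

const DOMAIN_TAG: &[u8] = b"example-provenance-v1"; // hypothetical tag

#[derive(Debug, Clone, PartialEq)]
pub struct ProvenanceEntry {
    pub entity_id: String,
    pub actor: String,
    pub previous_hash: Option<String>,
    pub hash: String,
}

/// 64-bit FNV-1a, used here purely as a dependency-free placeholder digest.
fn fnv1a64(bytes: &[u8]) -> u64 {
    let mut h: u64 = 0xcbf29ce484222325;
    for b in bytes {
        h ^= *b as u64;
        h = h.wrapping_mul(0x100000001b3);
    }
    h
}

/// Append `field` with a fixed-width length prefix, so that the field pairs
/// ("ab", "c") and ("a", "bc") yield different preimages.
fn push_field(buf: &mut Vec<u8>, field: &[u8]) {
    buf.extend_from_slice(&(field.len() as u64).to_le_bytes());
    buf.extend_from_slice(field);
}

impl ProvenanceEntry {
    /// Hash over the domain tag plus every field, each length-prefixed.
    pub fn compute_hash(entity_id: &str, actor: &str, previous_hash: Option<&str>) -> String {
        let mut pre = Vec::new();
        push_field(&mut pre, DOMAIN_TAG);
        push_field(&mut pre, entity_id.as_bytes());
        push_field(&mut pre, actor.as_bytes());
        // Presence byte keeps `None` distinct from `Some("")`.
        pre.push(previous_hash.is_some() as u8);
        push_field(&mut pre, previous_hash.unwrap_or("").as_bytes());
        format!("{:016x}", fnv1a64(&pre))
    }

    /// First entry of a chain: no predecessor.
    pub fn genesis(entity_id: &str, actor: &str) -> Self {
        let hash = Self::compute_hash(entity_id, actor, None);
        Self { entity_id: entity_id.into(), actor: actor.into(), previous_hash: None, hash }
    }

    /// Next entry, linked to `self` through `previous_hash`.
    pub fn chain(&self, actor: &str) -> Self {
        let hash = Self::compute_hash(&self.entity_id, actor, Some(&self.hash));
        Self {
            entity_id: self.entity_id.clone(),
            actor: actor.into(),
            previous_hash: Some(self.hash.clone()),
            hash,
        }
    }

    /// Recompute the hash from the fields and compare with the stored one.
    pub fn verify(&self) -> bool {
        self.hash == Self::compute_hash(&self.entity_id, &self.actor, self.previous_hash.as_deref())
    }
}
```

With a single type carrying this logic, a duplicate struct can only ever delegate to it, which is why removing the duplicate leaves the hash preimage untouched.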

Changes:
- Deleted `tier1::provenance::ProvenanceRecord` struct + impl block
  (the duplicate `compute_hash`/`verify` shims).
- Replaced with `pub use crate::abi::ProvenanceEntry;` so any external
  caller of `tier1::provenance::*` resolves to the single canonical
  definition. `tier1/provenance.rs` now holds only write-path logic
  (`append_provenance`, `verify_chain`, `SIDECAR_DDL`,
  `init_sidecar_schema`) — no type definitions.
- Updated stale references in `docs/architecture/TOPOLOGY.md` and
  `ROADMAP.adoc` from `ProvenanceRecord` to `ProvenanceEntry`.
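
The re-export in the second bullet can be sketched in a single file. The module layout mirrors the paths named in this PR, but the struct body and the two helper functions are placeholders invented for the example.

```rust
use std::any::TypeId;

mod abi {
    /// Stand-in for the canonical type; the real struct has more fields.
    #[derive(Debug, PartialEq)]
    pub struct ProvenanceEntry {
        pub entity_id: String, // hypothetical field
    }
}

mod tier1 {
    pub mod provenance {
        // No local struct any more: the old path re-exports the canonical one.
        pub use crate::abi::ProvenanceEntry;
    }
}

/// True when both module paths name the same concrete type.
fn paths_resolve_to_same_type() -> bool {
    TypeId::of::<abi::ProvenanceEntry>() == TypeId::of::<tier1::provenance::ProvenanceEntry>()
}

/// A caller that still imports through the tier1 path keeps compiling and
/// produces a value of the canonical type.
fn legacy_caller() -> abi::ProvenanceEntry {
    use tier1::provenance::ProvenanceEntry;
    ProvenanceEntry { entity_id: "e1".to_string() }
}
```

Because `pub use` introduces a second name for the same item rather than a new type, no `From` impl or conversion is involved at the call site.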

Refs migrated: 0 code call sites. The tier1 type was orphaned: the only
references were its own self-references plus 2 documentation mentions.
`grep -r ProvenanceRecord src/` now returns zero hits.

On-disk / JSON stability: unchanged. `ProvenanceEntry` and the deleted
`ProvenanceRecord` had identical fields; the deleted shims already
computed the canonical hash via `ProvenanceEntry::compute_hash`. No
serde field names, SQL column names, hash preimage, or DB schema are
touched — provenance integrity is preserved bit-for-bit.

Build/test: `cargo build` clean (only the pre-existing unrelated
`RetentionConfig` unused-import warning in `gc.rs`). `cargo test` green:
87 lib + 9 integration + 2 sqlite-e2e tests, 0 failed. No
offline-cache failures.

Acceptance:
- [x] `grep -r ProvenanceRecord src/` returns zero hits
- [x] `cargo build` clean, `cargo test` green
- [x] `tier1/provenance.rs` contains only write-path logic, no type defs

Unblocks #31/#32 (they touch the same provenance types — there is now
exactly one type and one `compute_hash` to evolve).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
hyperpolymath merged commit bd84283 into main on May 16, 2026
18 of 21 checks passed
hyperpolymath deleted the fix/26-dedup-provenance-type branch on May 16, 2026 15:17
hyperpolymath mentioned this pull request on May 16, 2026
16 tasks
hyperpolymath pushed a commit that referenced this pull request May 16, 2026
Resolve 12 conflicted files. main is canonical for the tested provenance
core (abi/mod.rs hash impl via #88/#46/#103, tier1/provenance.rs,
codegen/query.rs), CLI plumbing (main.rs: LONG_VERSION/build.rs,
doctor/gc/tier1, honest Start refusal #46, json status), and the
render_manifest_template design (manifest/mod.rs — superset with
retention/sidecar/version-from-Default + no-drift test). PR (HEAD) side
kept for the internally-consistent V-L2-tagged Tier-2 overlay DDL
(codegen/overlay.rs — superset retaining #41/#42 guarantees) and the
on_off-uniform status print. Doc/test comments aligned to main.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
hyperpolymath added a commit that referenced this pull request May 16, 2026
#31 (per-entity write lock) and #32 (UNIQUE INDEX(entity_id,
previous_hash)) together design the provenance chain so that a forked
history *cannot be represented*. Issues #31/#32 frame forks as purely
adversarial; that is incomplete. Network-partitioned/replicated honest
writers and simulation branches (ADR-0006) are legitimate divergence.
A UNIQUE INDEX on (entity_id, previous_hash) does not *detect* a fork —
it *rejects the second row at insert time*, discarding honest history.
A fork that cannot be written cannot be detected or audited. That is
the integrity defect.
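
The difference between rejecting and detecting a fork can be shown with an in-memory stand-in for the two index kinds. This is only a sketch: `UniqueIndex` and `PredecessorIndex` are hypothetical names, and a `HashMap` approximates what SQLite would enforce or index.

```rust
use std::collections::HashMap;

/// (entity_id, previous_hash)
type Key = (String, String);

/// Models UNIQUE INDEX(entity_id, previous_hash): one child per predecessor.
#[derive(Default)]
struct UniqueIndex {
    rows: HashMap<Key, String>,
}

impl UniqueIndex {
    /// Returns Err for a second child, mirroring a constraint violation.
    fn insert(&mut self, entity: &str, prev: &str, hash: &str) -> Result<(), String> {
        let key = (entity.to_string(), prev.to_string());
        if self.rows.contains_key(&key) {
            return Err(format!("UNIQUE violation on ({}, {})", entity, prev));
        }
        self.rows.insert(key, hash.to_string());
        Ok(())
    }
}

/// Models a non-unique index: every child is kept, grouped by predecessor.
#[derive(Default)]
struct PredecessorIndex {
    rows: HashMap<Key, Vec<String>>,
}

impl PredecessorIndex {
    fn insert(&mut self, entity: &str, prev: &str, hash: &str) {
        self.rows
            .entry((entity.to_string(), prev.to_string()))
            .or_default()
            .push(hash.to_string());
    }

    /// Predecessors with more than one child are fork points.
    fn fork_points(&self) -> Vec<&Key> {
        self.rows
            .iter()
            .filter(|(_, children)| children.len() > 1)
            .map(|(key, _)| key)
            .collect()
    }
}
```

Inserting two children of the same predecessor into `UniqueIndex` loses the second row, whereas `PredecessorIndex` keeps both and reports the predecessor as a fork point.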

This is a planning skeleton — design doc + a failing-by-design test.
No implementation; the parent session controls merges.

Contents:

* docs/decisions/0010-provenance-forks-are-first-class.adoc — ADR
  (estate .adoc default). Decision: forks are first-class. Concretely:
  - #32: do NOT add UNIQUE INDEX(entity_id, previous_hash). The `hash`
    PRIMARY KEY (preimage is domain-tagged + covers every field, per
    ADR-0002) already rejects exact-duplicate rows — that is the
    correct duplicate guard. Add a NON-unique
    idx_provenance_predecessor(entity_id, previous_hash) for O(log n)
    fork detection.
  - #31: verisimdb_provenance_chain_head (entity_id PK, one head)
    becomes verisimdb_provenance_chain_heads (PK(entity_id,
    head_hash), a SET of tips). `append_provenance` keeps BEGIN
    IMMEDIATE (still serialises racing *duplicate* appends from one
    node) but removes parent-tip + adds new-tip on linear append; a
    new `append_provenance_fork(... from_hash ...)` adds a head
    without removing one.
  - Detection surface: `fork_points(conn, entity)`; `verify_chain`
    becomes per-branch (each head -> genesis walk hash-consistent),
    so divergence is never conflated with tampering.
  - Data migration: idempotent CREATE-IF-NOT-EXISTS + INSERT..SELECT
    copy of the head table guarded by a sqlite_master check; old
    table left for one release (no destructive step ships). The log
    table is unchanged. Because the unique index is never created,
    an existing sidecar that already contains a legitimate fork
    cannot fail to open — a hazard that WOULD exist had #32 shipped
    first (flagged in the #32 thread).

* tests/provenance_fork_test.rs — failing-by-design test. Writes
  genesis + branch A (supported linear path) + branch B (a second
  legitimate child of genesis). Asserts both children persist AND the
  entity records two heads. Compiles against the current public
  surface; the assertions, not the compile, fail. Verified red on
  this branch: child_count==2 passes (log keeps both rows) but
  head_count is 1 not 2 — the single-head table collapses branch B.
  Exactly the #31 defect, in executable form. (With #32's unique
  index applied, the branch-B insert would additionally fail with a
  constraint violation — also encoded in the ADR test plan.)
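
The head-set behaviour the ADR describes, and the `child_count` / `head_count` assertions of the test, can be modelled without a database. The sketch below is an assumption-laden simplification: `Sidecar`, its fields, and its method names are hypothetical, hashes are opaque strings, and no SQL or locking is involved.

```rust
use std::collections::{BTreeSet, HashMap};

/// Hypothetical in-memory model of the sidecar tables.
#[derive(Default)]
struct Sidecar {
    /// hash -> previous_hash (None for genesis); stands in for the log table.
    log: HashMap<String, Option<String>>,
    /// entity_id -> set of tips; stands in for a PK(entity_id, head_hash) table.
    heads: HashMap<String, BTreeSet<String>>,
}

impl Sidecar {
    /// Linear append: the parent tip is replaced by the new tip.
    fn append(&mut self, entity: &str, prev: Option<&str>, hash: &str) {
        self.log.insert(hash.to_string(), prev.map(str::to_string));
        let tips = self.heads.entry(entity.to_string()).or_default();
        if let Some(p) = prev {
            tips.remove(p);
        }
        tips.insert(hash.to_string());
    }

    /// Fork append: adds a tip without removing any existing one.
    fn append_fork(&mut self, entity: &str, from_hash: &str, hash: &str) {
        self.log.insert(hash.to_string(), Some(from_hash.to_string()));
        self.heads.entry(entity.to_string()).or_default().insert(hash.to_string());
    }

    fn head_count(&self, entity: &str) -> usize {
        self.heads.get(entity).map_or(0, |tips| tips.len())
    }

    fn child_count(&self, parent: &str) -> usize {
        self.log.values().filter(|p| p.as_deref() == Some(parent)).count()
    }

    /// Per-branch check: every head must walk back to a genesis row.
    fn every_head_reaches_genesis(&self, entity: &str) -> bool {
        let tips = match self.heads.get(entity) {
            Some(t) => t,
            None => return true,
        };
        tips.iter().all(|tip| {
            let mut cur = tip.as_str();
            // Bounded walk: more steps than rows would indicate a cycle.
            for _ in 0..=self.log.len() {
                match self.log.get(cur) {
                    Some(Some(prev)) => cur = prev.as_str(),
                    Some(None) => return true, // reached genesis
                    None => return false,      // dangling predecessor
                }
            }
            false
        })
    }
}
```

Writing genesis, branch A via `append`, then branch B via `append_fork` from genesis leaves two children of genesis and two heads, which is the state the failing-by-design test asserts.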

Build/test: lib builds clean (only the pre-existing unrelated
gc.rs RetentionConfig warning). New test compiles and FAILS as
intended (1 failed, by design).

Unblocked by #26 / PR #103 — there is now one ProvenanceEntry and one
compute_hash for the fork-aware append/verify to evolve.

Co-authored-by: hyperpolymath <hyperpolymath@users.noreply.github.com>
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>