fix: confirm durable writes before advancing indexer state by damilolaedwards · Pull Request #784 · ethpandaops/dora

damilolaedwards · 2026-07-01T15:35:07Z

Summary

Three indexer paths mutated durable-facing state (a cursor, or the in-memory copy of data) before the corresponding database or blockdb write was confirmed, and only logged the write error. On a transient failure the data was then lost with no retry. Each is fixed by confirming the write first and propagating the error.

Transaction matcher drops a match on a persist error

runTransactionMatcher advanced MatchHeight unconditionally and discarded the persistMatches return value, returning only the state-persist result. On sqlite a failed statement does not abort the transaction, so the tx committed with the match missing but the height advanced; on postgres the tx rolled back but the in-memory height did not. Either way the failed range was never revisited and its request rows kept an empty tx hash. The matches and the advanced height are now persisted together in one transaction, and the in-memory height is rolled back if that transaction fails, so the range is retried.

Bid cache loses the flushed window on a db error

checkAndFlush and flushAll removed bids from the cache (and advanced minSlot) before calling InsertBids; on error they logged and returned with the bids already gone, and they cannot be re-added. The write now happens first, and the cached copies are dropped only after it succeeds.

Finalization can leave a block body unrecoverable on a blockdb write error

finalizeEpoch deletes the unfinalized_blocks rows in the committed finalization transaction, then writes the block bodies to the blockdb with only-logged errors, then advances lastFinalizedEpoch and frees the bodies from the cache regardless of the write result. If the blockdb write failed, the body then existed nowhere. On a blockdb write failure the epoch now returns without advancing, which routes it through the same resync path already used for a failed finalization transaction, leaving the cached bodies in place for the resync to rewrite.

Tests

Matcher: a failed persist rolls the height back and stores nothing; a successful one advances and stores it.
Bid cache: bids are kept when the db write fails and dropped only after it succeeds.

Both fail without their fix. The finalization change has no unit test (that path needs a full indexer) and reuses the existing resync behavior. go build, go vet and gofmt are clean.

Three indexer paths advanced a cursor or dropped their in-memory copy of data before confirming the durable write, and only logged the write error, so a transient failure lost the data with no retry. - The system-contract transaction matcher discarded the persistMatches error and advanced MatchHeight regardless, so a failed range was skipped and its request rows kept an empty tx hash forever. Persist the matches and the height together and roll the height back on error. - The bid cache removed bids from the cache before writing them, losing the flushed window on a transient db error. Write first, drop the cached copies only after the write succeeds. - Epoch finalization advanced lastFinalizedEpoch and freed the block bodies even when the blockdb write failed, after the unfinalized_blocks rows had already been deleted, leaving the bodies unrecoverable. Trigger a resync of the epoch on a blockdb write failure instead of advancing. Add tests for the matcher height rollback and the bid-cache write ordering.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix: confirm durable writes before advancing indexer state#784

fix: confirm durable writes before advancing indexer state#784
damilolaedwards wants to merge 1 commit into
ethpandaops:masterfrom
damilolaedwards:fix/durable-write-before-state-advance

damilolaedwards commented Jul 1, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Conversation

damilolaedwards commented Jul 1, 2026

Summary

Transaction matcher drops a match on a persist error

Bid cache loses the flushed window on a db error

Finalization can leave a block body unrecoverable on a blockdb write error

Tests

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant