Fix fill skipping old rows when dest has mirrored data#22
Merged
Conversation
When enable_mirroring runs before fill, the mirroring trigger copies recent rows into the intermediate table. Fill without --start called destTable.maxId() which returned a high ID from the mirrored row, causing WHERE id > <high_id> to find nothing and skip all old rows. - Start from sourceTable.minId() instead of destTable.maxId() - Restructure batch CTE to track source progress independently, preventing early termination on all-duplicate batches - Fix minId() to order by PK only (not time+PK) for non-monotonic PKs - Add endId to CLI batch log output for resume support Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
This avoids 30s timeouts on large table startup scans and documents resume guidance. Co-authored-by: Cursor <cursoragent@cursor.com>
8049136 to
7ec7ab8
Compare
Let fill and synchronize minId lookups use the session timeout and drop the README note about temporarily raising statement_timeout. Co-authored-by: Cursor <cursoragent@cursor.com>
mthadley
requested changes
Feb 20, 2026
Collaborator
mthadley
left a comment
There was a problem hiding this comment.
Looks good! Just a few small changes before approving.
mthadley
approved these changes
Feb 20, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
When
enable_mirroringruns beforefill, the mirroring trigger copies recent rows into the intermediate table. Whenfillthen runs without--start, it callsdestTable.maxId()which returns a high ID (from the mirrored row). Fill queriesWHERE id > <high_id>, finds nothing, and returns "nothing to fill". All old rows below that ID are silently skipped.Root cause
The non-swapped, no-
--startbranch assumed dest was empty and useddestTable.maxId()as a resume cursor. With mirroring active, dest already has scattered high-ID rows, so this cursor skips everything below.Fix
1. Start from source min ID instead of dest max ID (
src/filler.ts)The non-swapped, no-
--startbranch now usessourceTable.minId(tx, timeFilter)instead ofdestTable.maxId(tx), matching the pattern already used bySynchronizer.init().2. Track source progress independently in batch loop (
src/filler.ts,src/types.ts)Restructured the batch CTE into two stages (
source_batchSELECT, theninsertedINSERT from it). The loop now terminates onsourceCount === 0(no more source rows) instead ofrowsInserted === 0(no rows inserted). Without this, re-running fill after a partial completion would hit an all-duplicate first batch,RETURNINGwould produce 0 rows, and the loop would stop before reaching uncopied rows.3. Fix
minId()ordering for non-monotonic PKs (src/table.ts)Backs out the
ORDER BY time LIMIT 1optimization from #21. That optimization was fast when a time index exists, but assumes PK/time monotonicity and can skip rows with smaller PKs and later timestamps (ULID skew, backfills, manual timestamps). We now order by PK to preserve correctness. A new test ("fills rows with non-monotonic PK and created_at") demonstrates the #21 failure mode where ordering by time picks a higher PK and skips an in-range row. This also fixesSynchronizer.init()which calls the same method.Resume behavior change
Previously, re-running
fillwithout--startwould auto-resume fromdestTable.maxId(). This was already broken when mirroring was active (the common case), since mirrored rows inflated the max ID and caused fill to skip everything below it.Now,
fillwithout--startalways starts from the beginning of the source table.ON CONFLICT DO NOTHINGmakes re-scanning already-copied rows cheap (no writes). For very large tables, resume explicitly using theendIdfrom the batch log:Resume with
--start 30000.Files changed
src/filler.tssourceCountterminationsrc/types.tssourceCounttoFillBatchResultsrc/table.tsminId()orders by PK only; docstring updatedsrc/pgslice.tssrc/commands/fill.tsrowsInsertedandendIdsrc/commands/fill.test.tssrc/fill.test.tsREADME.mdTest plan