Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughAdds pub_commit_lsn to CDC metadata, captures publication commit LSN during node init, enforces a publication-commit ordering guard at replication stream startup, isolates metadata writes from Spock repsets via session-scoped repair mode and connection GUCs, and adds integration tests validating ordering and guards. ChangesCDC Publication-Commit Ordering and Isolation
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Up to standards ✅🟢 Issues
|
| Category | Results |
|---|---|
| Complexity | 2 medium |
🟢 Metrics 8 complexity · 4 duplication
Metric Results Complexity 8 Duplication 4
NEW Get contextual insights on your PRs based on Codacy's metrics, along with PR and Jira context, without leaving GitHub. Enable AI reviewer
TIP This summary will be updated as you push new changes.
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
db/queries/templates.go (1)
376-387:⚠️ Potential issue | 🔴 Critical | ⚡ Quick winSplit this multi-statement template into separate
Execcalls.The template emits
CREATE TABLE ...; ALTER TABLE ...as a single SQL string. pgx v5 uses the extended protocol by default, which does not support multiple statements in a singleExec()call. This will fail at runtime. Either split into two separate templates with twoExeccalls, or explicitly configure the connection to useQueryExecModeSimpleProtocol(not found in the codebase).🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@db/queries/templates.go` around lines 376 - 387, The template CreateCDCMetadataTable currently emits two SQL statements ("CREATE TABLE ...;" and "ALTER TABLE ...;") in one string which fails under pgx v5 extended protocol; split this into two separate templates and Exec calls: keep the CREATE TABLE statement in CreateCDCMetadataTable (or a renamed createAceCDCMetadataTable) and move the ALTER TABLE ... ADD COLUMN IF NOT EXISTS pub_commit_lsn into a new template (e.g., AddPubCommitLsnToAceCDCMetadata) and call Exec for each template separately, referencing {{aceSchema}} and ace_cdc_metadata in both so they run sequentially at migration time.
🧹 Nitpick comments (1)
internal/infra/cdc/setup.go (1)
90-99: 🏗️ Heavy liftConsider constraining the API signature to enforce session pinning.
All current call sites correctly use pinned transactions (
txfrompool.Begin()), but the function signature accepts a genericqueries.DBQuerier. While the wrapper atdb/queries/queries.go:3007documents thatSet(true)andSet(false)must pair on the same session, the generic API allows future callers to pass a pool handle and inadvertently separate the toggles across sessions, poisoning one connection.Consider adding compile-time enforcement (e.g., a type wrapper or interface constraint) to make session-pinning explicit and prevent misuse.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@internal/infra/cdc/setup.go` around lines 90 - 99, SetSpockRepairMode currently accepts a generic queries.DBQuerier which lets callers pass a pool handle and accidentally toggle repair mode on different sessions; change the API to require a session-bound type (e.g., accept the transaction/session interface used by pool.Begin() such as a queries.DBTx or a new interface like queries.SessionQuerier) so SetSpockRepairMode(ctx context.Context, session queries.SessionQuerier, on bool) enforces compile-time session pinning; update all callers to pass the tx returned from Begin() and call queries.CheckSpockInstalled and queries.SetSpockRepairMode on that session so Set(true)/Set(false) always occur on the same connection.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@internal/consistency/mtree/merkle.go`:
- Around line 866-870: The call to queries.CurrentWalInsertLSN that sets
pubCommitLSN is happening while Phase A's transaction (tx) is still open; move
the WAL LSN capture so it occurs after Phase A commits and before Phase B starts
so the read reflects the commit LSN. Concretely: finish and commit the Phase A
transaction first, then call queries.CurrentWalInsertLSN (using an appropriate
context/connection outside the committed tx or a new read-only tx) to set
pubCommitLSN, and only then proceed to Phase B logic (the code that relies on
pubCommitLSN and uses tx/slot-guard checks).
In `@internal/infra/cdc/listen.go`:
- Around line 467-489: After calling SetSpockRepairMode(processingCtx, tx, true)
you must immediately schedule cleanup so session-scoped repair_mode is always
disabled; add defer SetSpockRepairMode(processingCtx, tx, false) right after the
successful true call (before calling queries.UpdateCDCMetadata or committing tx)
so any early return or error path will run the disable. Update or remove the
later explicit SetSpockRepairMode(..., false) calls as needed to avoid
double-disabling or handle their errors gracefully; ensure the deferred call
runs regardless of conn.Close or tx.Commit failures and continues to return the
connection to the pool in a neutral state.
---
Outside diff comments:
In `@db/queries/templates.go`:
- Around line 376-387: The template CreateCDCMetadataTable currently emits two
SQL statements ("CREATE TABLE ...;" and "ALTER TABLE ...;") in one string which
fails under pgx v5 extended protocol; split this into two separate templates and
Exec calls: keep the CREATE TABLE statement in CreateCDCMetadataTable (or a
renamed createAceCDCMetadataTable) and move the ALTER TABLE ... ADD COLUMN IF
NOT EXISTS pub_commit_lsn into a new template (e.g.,
AddPubCommitLsnToAceCDCMetadata) and call Exec for each template separately,
referencing {{aceSchema}} and ace_cdc_metadata in both so they run sequentially
at migration time.
---
Nitpick comments:
In `@internal/infra/cdc/setup.go`:
- Around line 90-99: SetSpockRepairMode currently accepts a generic
queries.DBQuerier which lets callers pass a pool handle and accidentally toggle
repair mode on different sessions; change the API to require a session-bound
type (e.g., accept the transaction/session interface used by pool.Begin() such
as a queries.DBTx or a new interface like queries.SessionQuerier) so
SetSpockRepairMode(ctx context.Context, session queries.SessionQuerier, on bool)
enforces compile-time session pinning; update all callers to pass the tx
returned from Begin() and call queries.CheckSpockInstalled and
queries.SetSpockRepairMode on that session so Set(true)/Set(false) always occur
on the same connection.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 5811b76d-c669-4f27-abcf-7fa388daaab6
📒 Files selected for processing (7)
db/queries/queries.godb/queries/templates.gointernal/consistency/mtree/merkle.gointernal/infra/cdc/listen.gointernal/infra/cdc/setup.gotests/integration/cdc_busy_table_test.gotests/integration/cdc_init_ordering_test.go
There was a problem hiding this comment.
Diagnosis and ordering invariant look sound; tests cover the right invariants. Notes inline. PR-level nit: the body says "two commits (fix-first, then test)" but the branch has four; either update the description or rebase back to two for a bisect-clean history.
…ming) Round of fixes for the human review comments on PR #119: - flushMetadata (listen.go:542): same pool-poisoning hole as Site 1. Port to the borrowed-*pgxpool.Conn pattern so Set(false) is deferred on the live session before Release. This site fires every 10 s on the periodic-flush path AND on shutdown, so it was the highest-risk remaining leak. - MtreeInit Phase C (merkle.go:884): same hole. Borrow the conn, defer Set(false) on it, run the InitCDCMetadata write through a tx opened on the same conn. - Guard scope comment (listen.go:103): clarify that the pub_commit_lsn guard is a tripwire for partial-state-on-upgrade and for cross-node leaks whose LSN lands below local pub_commit_lsn — NOT a general leak detector. The actual defenses against cross-node leak are repset exclusion + repair_mode wrapping every metadata write. Reader-of-the-future shouldn't mistake this for an end-to-end check. - Test timing (cdc_init_ordering_test.go:186): bump the sentinel-propagation window from 3 s to 10 s. 3 s was tight under load on slow CI; 10 s gives Spock apply latency real headroom without making the test materially slower in the common case. Two non-blocking comments deliberately deferred to follow-ups: - CheckSpockInstalled cache on the pool (~12 RTTs/min savings) - Hoist 'ace_cdc_metadata' relname into a shared constant CodeRabbit's pubCommitLSN-timing finding remains skipped: the invariant chain captured_mid_tx < commit_lsn(Phase A) <= slot.consistent_point holds for freshly-created slots, which is the only state Phase B produces. The hypothetical "broken slot whose consistent_point sits between captured and commit" cannot arise from this code path. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@internal/infra/cdc/listen.go`:
- Around line 568-575: The deferred reset of spock.repair_mode in flushMetadata
currently uses the caller's cancelable ctx which can be canceled before defer
runs; change the defer to use a non-cancelable context (like the pattern in
processReplicationStream) so the reset always runs: keep the initial call
SetSpockRepairMode(ctx, c, true) as-is, then create a non-cancelable context
(e.g. ctxNoCancel := context.WithoutCancel(ctx)) and call
SetSpockRepairMode(ctxNoCancel, c, false) inside the defer; reference
SetSpockRepairMode and flushMetadata to locate and update the defer.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 582e9095-53ab-40a8-a71b-ea48d3c87de7
📒 Files selected for processing (3)
internal/consistency/mtree/merkle.gointernal/infra/cdc/listen.gotests/integration/cdc_init_ordering_test.go
🚧 Files skipped from review as they are similar to previous changes (2)
- tests/integration/cdc_init_ordering_test.go
- internal/consistency/mtree/merkle.go
dbd3724 to
dcc6d9c
Compare
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
tests/integration/cdc_init_ordering_test.go (1)
184-186:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winStale comment: mentions "3 s window" but code uses 10 seconds.
The comment references the original 3-second timeout, but the deadline was bumped to 10 seconds per prior review feedback.
📝 Suggested fix
// assertSentinelDoesNotPropagate writes a sentinel start_lsn on n1 and -// verifies it never appears on n2 within a 3 s window. Under bug 2 the +// verifies it never appears on n2 within a 10 s window. Under bug 2 the // sentinel would propagate via Spock apply within seconds.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/integration/cdc_init_ordering_test.go` around lines 184 - 186, Update the stale comment above assertSentinelDoesNotPropagate to reflect the current 10 second timeout (instead of "3 s window"); locate the comment that mentions "3 s window" in the assertSentinelDoesNotPropagate test and change it to state "10 s window" or "10-second window" so it matches the test's actual deadline.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Outside diff comments:
In `@tests/integration/cdc_init_ordering_test.go`:
- Around line 184-186: Update the stale comment above
assertSentinelDoesNotPropagate to reflect the current 10 second timeout (instead
of "3 s window"); locate the comment that mentions "3 s window" in the
assertSentinelDoesNotPropagate test and change it to state "10 s window" or
"10-second window" so it matches the test's actual deadline.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: c2d4b8b7-a44a-434f-9397-23eb7b69ae3e
📒 Files selected for processing (6)
db/queries/queries.godb/queries/templates.gointernal/consistency/mtree/merkle.gointernal/infra/cdc/listen.gointernal/infra/cdc/setup.gotests/integration/cdc_init_ordering_test.go
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@db/queries/queries.go`:
- Around line 2919-2928: GetCDCMetadata currently fails when the
ace_cdc_metadata table lacks pub_commit_lsn; modify the function
(GetCDCMetadata) to detect Postgres "undefined_column" (SQLSTATE 42703) from the
db.QueryRow(...).Scan(...) error and treat it as a non-fatal condition: if the
error is a Postgres error with Code == "42703" (use errors.As to check for the
driver-specific error type such as *pgconn.PgError or *pq.Error depending on the
codebase), return slotName, startLSN, tables and an empty pubCommitLSN with nil
error so callers can run the legacy fallback; otherwise return the original
error as before. Ensure the detection targets the error code "42703"
specifically.
In `@internal/infra/cdc/listen.go`:
- Around line 113-126: Update the recovery guidance text in the ace_cdc_metadata
LSN checks so it includes "ace mtree build" alongside "ace mtree teardown" and
"ace mtree init"; specifically change the message passed to logger.Warn (when
pubCommitLSNStr == "") and the error returned in the startLSN < pubCommitLSN
branch where the fmt.Errorf is created (these are the strings used in
logger.Warn, logger.Error and the fmt.Errorf in listen.go) to list "ace mtree
teardown", "ace mtree init", and "ace mtree build" in that order so users are
directed to run build after init.
In `@internal/infra/cdc/setup.go`:
- Around line 68-71: The schema-qualified table name constructed in the variable
qualified (currently fmt.Sprintf("%s.ace_cdc_metadata", aceSchema)) must use
proper SQL identifier quoting so Spock sees the exact relation; update the code
that builds qualified before calling queries.RepsetRemoveTable(ctx, db, set,
qualified) to quote aceSchema (and the ace_cdc_metadata identifier) with the
project’s identifier-quoting helper (or a safe QuoteIdentifier function) and
then pass the quoted qualified name (e.g., quotedSchema + "." + quotedTable) to
RepsetRemoveTable so identifiers with capitals or special chars resolve
correctly.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 81b5d968-23d1-46f0-bf8b-f18b6c962f24
📒 Files selected for processing (7)
db/queries/queries.godb/queries/templates.gointernal/consistency/mtree/merkle.gointernal/infra/cdc/listen.gointernal/infra/cdc/setup.gotests/integration/cdc_busy_table_test.gotests/integration/cdc_init_ordering_test.go
Fixes 'publication "ace_mtree_pub" does not exist' (SQLSTATE 42704) during 'ace mtree table-diff'. Init order: the replication slot was being created before the publication was committed to WAL. The slot's consistent point therefore preceded the publication, and pgoutput's get_publication_oid failed during change-callback replay. MtreeInit is now split into three phases per node: Phase A commits schema + metadata table + publication and captures the publication commit LSN; Phase B creates the slot (its consistent point is now strictly after the publication); Phase C persists slot/start_lsn/ pub_commit_lsn. A new pub_commit_lsn column is added to ace_cdc_metadata (additive, ALTER TABLE IF NOT EXISTS). processReplicationStream refuses to open a stream whose start_lsn is older than pub_commit_lsn, returning an actionable error instead of silently rewinding to a pre-publication LSN. Empty pub_commit_lsn (legacy metadata rows) skips the check with a warning. Existing broken installs are not auto-repaired; users must run 'ace mtree teardown' + 'ace mtree init' + 'ace mtree build' after upgrade. '--skip-cdc' remains a valid interim workaround for pre-upgrade clusters.
Two integration tests against the spock cluster fixture: - TestMtreeInitSlotIsAfterPublication: after MtreeInit, every node's slot consistent_point must be >= the publication's commit LSN recorded in ace_cdc_metadata.pub_commit_lsn. This is the ordering invariant the 3-phase init split restores. - TestProcessReplicationStreamRejectsStaleStartLSN: corrupts n1's start_lsn to a value below pub_commit_lsn and asserts UpdateFromCDC returns the new guard's actionable error rather than the cryptic SQLSTATE 42704 that pgoutput would otherwise emit.
BuildMtree's per-node body opens a tx with defer tx.Rollback in the outer function scope, then on any error explicitly calls pool.Close() and returns the error. pool.Close() waits on puddle's WaitGroup for every acquired conn to be released — but the conn the tx holds will not be released until the deferred tx.Rollback runs, and that defer cannot run until the function returns, which is being blocked by pool.Close(). Classic Go defer-vs-explicit-close deadlock: the process hangs and the underlying error never surfaces. The shape repeats at every error site inside the per-iteration body (AlterPublicationAddTable, GetCDCMetadata, UpdateCDCMetadata, createMtreeObjects, insertBlockRanges, computeLeafHashes, buildParentNodes, tx.Commit). All eight are reachable; any one being hit hangs the build. Wrap the per-iteration body in a closure so defers are scoped to one iteration. defer pool.Close lands above defer tx.Rollback in registration order, so LIFO fires tx.Rollback first (releases the conn) then pool.Close (no acquired conns to wait for). The explicit pool.Close calls inside each error branch go away — defers handle it uniformly on success and failure. Confirmed via SIGQUIT goroutine dump showing main blocked in puddle.Pool.Close.deferwrap1 → WaitGroup.Wait, called from BuildMtree's createMtreeObjects error branch. The same defer-stacking pattern exists in the loop further down BuildMtree (around the UpdateMtree block) and in a few other *.Close sites elsewhere in this file. They don't deadlock today because either no tx is held or the order happens to be safe, but the closure pattern would make them uniformly correct. Left for a follow-up so this commit stays focused on the reproduced bug.
createMtreeObjects has called CreateSchema and CreateXORFunction at the start of every node's build iteration since the original BuildMtree commit (a05445e, July 2025) — back when build was the entry point and had to set up the schema and XOR function itself. When CDC-based MtreeInit landed (fd02c30, August 2025), those calls became redundant: init's Phase A creates both, and BuildMtree implicitly requires init to have run (it consumes the CDC publication, slot, and metadata that init sets up). The build-side calls were never cleaned up. The redundancy was harmless on vanilla PG. On Spock-replicated clusters it surfaces as SQLSTATE XX000 "tuple concurrently updated": CREATE OR REPLACE FUNCTION unconditionally writes pg_proc, and Spock's DDL apply of n1's BuildMtree replicating to n2 races with n2's BuildMtree running the same statement. The race was masked before the previous commit by the BuildMtree pool.Close deadlock that swallowed the error and hung the process. CREATE TABLE IF NOT EXISTS for ace_mtree_metadata stays — IF NOT EXISTS short-circuits without touching pg_class when the relation already exists, so the same race doesn't apply.
ACE issues DDL exclusively against its own pgedge_ace schema and the ace_mtree_pub publication — never against user objects — and every piece of that DDL is intentionally per-node. Letting Spock replicate it produces a chain of cross-node interactions: - ace_cdc_metadata gets auto-added to peer nodes' default repsets by Spock's CREATE TABLE event trigger, leaking per-node LSNs cross-node. - CREATE OR REPLACE FUNCTION on pgedge_ace.bytea_xor races on pg_proc when n1's build replicates to n2 while n2's build runs the same statement (SQLSTATE XX000). - n2's SetupPublication DROP+CREATE replicates back to n1, landing in n1's pgoutput stream mid-replay and producing the original SQLSTATE 42704 "publication does not exist" the PR was meant to fix. Set spock.enable_ddl_replication = off and spock.include_ddl_repset = off as connection startup options on every ACE pool. Both are PGC_USERSET in Spock, so connection-level options override cluster defaults. Both are dotted custom GUCs, so PostgreSQL accepts them on vanilla-PG instances without Spock loaded — the parameters become placeholder custom variables with no effect, keeping the dual-mode native-PG path intact. Once this is verified in production, the per-DDL workarounds in earlier commits (post-init repset cleanup sweep, SetSpockRepairMode wrapping at every metadata-write site, ExcludeMetadataFromSpockRepsets itself) all become removable. Leaving them in place for now as defense-in-depth.
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (3)
internal/infra/cdc/listen.go (1)
113-126:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winInclude
ace mtree buildin the recovery guidance.The supplied reviewer notes say broken installs need teardown, init, and build. Stopping the warning/error text at init still leaves users with an unbuilt tree after they follow the message.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@internal/infra/cdc/listen.go` around lines 113 - 126, The user-facing recovery guidance strings in the error and warning paths (logger.Warn call and the fmt.Errorf returned when startLSN < pubCommitLSN in listen.go) currently instruct to run 'ace mtree teardown' and 'ace mtree init' but omit the required 'ace mtree build' step; update both the logger.Warn message (the call to logger.Warn referencing ace_cdc_metadata and publication) and the returned fmt.Errorf message (the error constructed when startLSN < pubCommitLSN) to append "and 'ace mtree build'" so the guidance reads "run 'ace mtree teardown', 'ace mtree init' and 'ace mtree build' to recover".internal/consistency/mtree/merkle.go (1)
861-869:⚠️ Potential issue | 🟠 Major | ⚡ Quick winDon't persist a pre-commit WAL position as
pub_commit_lsn.
CurrentWalInsertLSNis still called while Phase A is open. The commit record lands later, so a stalestart_lsnthat falls between this captured value and the real commit LSN will bypass the new guard and can still reproduce SQLSTATE 42704. Capture the WAL position only after Phase A commits.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@internal/consistency/mtree/merkle.go` around lines 861 - 869, The code captures pubCommitLSN via CurrentWalInsertLSN while Phase A is still open, risking a pre-commit WAL position; move the capture so it runs after the Phase A transaction is committed: call tx.Commit(m.Ctx) first (or ensure the transaction has committed) and only then call queries.CurrentWalInsertLSN(m.Ctx, tx) to set pubCommitLSN, updating any error handling and the surrounding logic in merkle.go (references: pubCommitLSN, CurrentWalInsertLSN, tx.Commit) so the persisted pub_commit_lsn reflects the post-commit WAL position.db/queries/queries.go (1)
2919-2928:⚠️ Potential issue | 🟠 Major | ⚡ Quick winHandle pre-migration
ace_cdc_metadatalayouts before readingpub_commit_lsn.This path still hard-fails on upgraded clusters whose metadata table predates the new column: the
SELECTfails before callers can treatpubCommitLSN == ""as the legacy case. Catchundefined_columnand retry the legacy 3-column read, or run the additive migration before every read.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@db/queries/queries.go` around lines 2919 - 2928, GetCDCMetadata currently fails when the pub_commit_lsn column is missing on older schemas; modify GetCDCMetadata to detect the undefined_column error (SQLSTATE 42703 or error text containing "undefined_column") returned from db.QueryRow when using SQLTemplates.GetCDCMetadata, and on that specific error re-run a legacy 3-column query (e.g. render and QueryRow using a legacy template like SQLTemplates.GetCDCMetadataLegacy or a SQL string selecting only slot_name, start_lsn, tables), Scan into slotName, startLSN, tables, set pubCommitLSN = "" and return nil; for any other error continue to return the original error. Ensure you reference GetCDCMetadata, RenderSQL, SQLTemplates.GetCDCMetadata and the DBQuerier.QueryRow/Scan calls when implementing the change.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@internal/consistency/mtree/merkle.go`:
- Around line 1453-1535: During the per-node loop you only defer pool.Close()
for the node handled in the closure, leaving other entries in the pools map open
on early return; fix by removing each pool from the pools map once you take
ownership and also add an outer cleanup after the loop to close any remaining
pools. Specifically, inside the closure after obtaining pool :=
pools[nodeInfo["Name"].(string)] and deferring pool.Close(), call delete(pools,
nodeInfo["Name"].(string)) so the map no longer retains that pool; then after
the for _, nodeInfo := range m.ClusterNodes loop finishes (or on error return
path), iterate remaining entries in pools and call Close() on each to ensure
none are leaked. Refer to symbols: pools, nodeInfo, pool.Close(), the
per-iteration closure around the loop and m.ClusterNodes.
---
Duplicate comments:
In `@db/queries/queries.go`:
- Around line 2919-2928: GetCDCMetadata currently fails when the pub_commit_lsn
column is missing on older schemas; modify GetCDCMetadata to detect the
undefined_column error (SQLSTATE 42703 or error text containing
"undefined_column") returned from db.QueryRow when using
SQLTemplates.GetCDCMetadata, and on that specific error re-run a legacy 3-column
query (e.g. render and QueryRow using a legacy template like
SQLTemplates.GetCDCMetadataLegacy or a SQL string selecting only slot_name,
start_lsn, tables), Scan into slotName, startLSN, tables, set pubCommitLSN = ""
and return nil; for any other error continue to return the original error.
Ensure you reference GetCDCMetadata, RenderSQL, SQLTemplates.GetCDCMetadata and
the DBQuerier.QueryRow/Scan calls when implementing the change.
In `@internal/consistency/mtree/merkle.go`:
- Around line 861-869: The code captures pubCommitLSN via CurrentWalInsertLSN
while Phase A is still open, risking a pre-commit WAL position; move the capture
so it runs after the Phase A transaction is committed: call tx.Commit(m.Ctx)
first (or ensure the transaction has committed) and only then call
queries.CurrentWalInsertLSN(m.Ctx, tx) to set pubCommitLSN, updating any error
handling and the surrounding logic in merkle.go (references: pubCommitLSN,
CurrentWalInsertLSN, tx.Commit) so the persisted pub_commit_lsn reflects the
post-commit WAL position.
In `@internal/infra/cdc/listen.go`:
- Around line 113-126: The user-facing recovery guidance strings in the error
and warning paths (logger.Warn call and the fmt.Errorf returned when startLSN <
pubCommitLSN in listen.go) currently instruct to run 'ace mtree teardown' and
'ace mtree init' but omit the required 'ace mtree build' step; update both the
logger.Warn message (the call to logger.Warn referencing ace_cdc_metadata and
publication) and the returned fmt.Errorf message (the error constructed when
startLSN < pubCommitLSN) to append "and 'ace mtree build'" so the guidance reads
"run 'ace mtree teardown', 'ace mtree init' and 'ace mtree build' to recover".
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 1d519589-d288-4641-bff9-3afbd4610262
📒 Files selected for processing (8)
db/queries/queries.godb/queries/templates.gointernal/consistency/mtree/merkle.gointernal/infra/cdc/listen.gointernal/infra/cdc/setup.gointernal/infra/db/auth.gotests/integration/cdc_busy_table_test.gotests/integration/cdc_init_ordering_test.go
✅ Files skipped from review due to trivial changes (1)
- internal/infra/db/auth.go
BuildMtree pre-builds a pool per node up front in the row-estimate loop, then hands the populated map to the per-node build loop which closes each pool only as its own iteration completes. If a later iteration errors out and exits, pools for nodes that loop never reached stay open for the rest of the process lifetime — a real leak in HTTP API server invocations, mostly invisible under CLI invocation where the process exits anyway. Add a defer at the top of the function that closes every entry in the pools map on exit. pgxpool.Pool.Close uses sync.Once internally, so the per-iteration closes still take effect promptly and this defer is a no-op for pools already closed by them. Pre-existing leak; the closure refactor for the pool.Close deadlock didn't introduce or remove it.
The pub_commit_lsn column is added by MtreeInit's Phase A via the
additive ALTER TABLE in CreateCDCMetadataTable. On clusters upgraded
to this binary without re-running init — say, going straight to
mtree build or mtree table-diff — the table still has the old
3-column layout. The previous SELECT referenced pub_commit_lsn as a
column, so PostgreSQL raised SQLSTATE 42703 ("column does not
exist") before the listen.go fallback for an empty pub_commit_lsn
could fire. CDC startup / build aborted instead of warning and
continuing.
Extract pub_commit_lsn via to_jsonb(row) ->> 'pub_commit_lsn'
instead of a direct column reference. The SQL parses against any
schema version; on legacy rows the JSON object has no key for the
column, ->> returns NULL, COALESCE produces the empty string the
listen.go guard already treats as "invariant uncheckable — warn and
skip". The next MtreeInit still backfills the real column via the
additive ALTER TABLE in CreateCDCMetadataTable.
The two recovery messages in processReplicationStream told users to run 'ace mtree teardown' and 'ace mtree init' but stopped there. After init, the merkle tree itself is empty; a subsequent table-diff would either fail or produce incorrect results. 'ace mtree build' is the final step that repopulates the tree and makes table-diff usable again. Updated both messages to spell out the full three-step sequence.
Summary
Fixes the SQLSTATE 42704 - publication "ace_mtree_pub" does not exist error during ace mtree table-diff, plus two BuildMtree bugs surfaced while testing.
Root causes
Init ordering. MtreeInit was creating the replication slot before the publication was committed to WAL, so the slot's consistent_point preceded the publication. pgoutput's get_publication_oid then failed during change-callback replay. This is a pure pgoutput correctness issue that applies to any logical-replication setup, spock or vanilla.
Spock DDL replication. ACE's CREATE TABLE / CREATE PUBLICATION / CREATE OR REPLACE FUNCTION statements on its private pgedge_ace schema were being DDL-replicated to peer nodes. That produced a cascade of cross-node interactions:
leaking per-node LSNs cross-node and clobbering start_lsn.
n2's build ran the same statement (SQLSTATE XX000 "tuple concurrently updated").
producing the original 42704.
Fix shape
Init-ordering split (commit 1): Split MtreeInit into three phases per node — Phase A commits schema + metadata table + publication and captures the publication's commit LSN; Phase B creates the slot (consistent point is now strictly after the publication); Phase C persists slot/start_lsn/pub_commit_lsn. A new pub_commit_lsn column (additive, ALTER TABLE ... ADD IF NOT EXISTS) lets processReplicationStream refuse to start a stream whose start_lsn predates it, surfacing an actionable error rather than the cryptic 42704. Empty pub_commit_lsn (legacy metadata rows) skips the check with a warning.
Spock DDL suppression (commit 5): Set spock.enable_ddl_replication = off and spock.include_ddl_repset = off as connection startup options on every ACE pool in applyRuntimeParams. ACE issues DDL exclusively against pgedge_ace and the ace_mtree_pub publication — never against user objects — so suppressing replication of its own DDL is correct unconditionally. Both GUCs are PGC_USERSET in spock so connection-level options override cluster defaults; both have dotted names so vanilla-PG instances without spock accept them as placeholder custom vars with no effect (dual-mode safe).
acquired conn to be released — but the tx's conn won't be released until the deferred tx.Rollback runs, and that defer cannot run until the function returns, which is blocked by pool.Close(). Classic defer-vs-explicit-close deadlock. Any failure inside BuildMtree's per-node body hung the process and swallowed the error. Wrapped the per-iteration body in a closure so defers are scoped per-iteration and fire in LIFO order (tx.Rollback first, then pool.Close).
BuildMtree redundant DDL (commit 4): createMtreeObjects was calling CreateSchema and CreateXORFunction at the start of every node's build, leftover from the original BuildMtree (July 2025, predating MtreeInit). MtreeInit's Phase A creates both, and BuildMtree implicitly requires init to have run. The redundancy was harmless on vanilla PG but racy on spock — the redundant CREATE OR REPLACE FUNCTION was the specific call hitting the XX000 race above. Removed both redundant calls. CREATE TABLE IF NOT EXISTS for ace_mtree_metadata stays — IF NOT EXISTS short-circuits without touching pg_class.
Regression coverage
tests/integration/cdc_init_ordering_test.go:
every node after init.
and asserts UpdateFromCDC returns the new guard's actionable error.
Both run against the existing spock cluster fixture; no fixture changes required.
Upgrade notes
Existing broken installs are not auto-repaired. Users hitting the 42704 must run:
ace mtree teardown -d '<schema.table>'
ace mtree init -d
ace mtree build -d '<schema.table>'
If teardown itself fails, drop manually on every node:
DROP TABLE IF EXISTS pgedge_ace.ace_mtree__
CASCADE;DROP TABLE IF EXISTS pgedge_ace.ace_cdc_metadata CASCADE;
DROP TABLE IF EXISTS pgedge_ace.ace_mtree_metadata CASCADE;
DROP PUBLICATION IF EXISTS ace_mtree_pub;
SELECT pg_drop_replication_slot('ace_mtree_slot') WHERE EXISTS (SELECT 1 FROM pg_replication_slots WHERE
slot_name = 'ace_mtree_slot');
--skip-cdc remains a valid interim workaround for pre-upgrade clusters.