fix(postgres): exclude error-closed runs from getCompletedRun check#195
fix(postgres): exclude error-closed runs from getCompletedRun check#195matingathani wants to merge 1 commit intostripe:mainfrom
Conversation
There was a problem hiding this comment.
Pull request overview
Adjusts the sync-engine Postgres “recent successful run” detection to avoid treating errored runs as completed, preventing the worker from skipping resync after failures (e.g., Stripe throttling).
Changes:
- Add an
error_message IS NULLfilter to thegetCompletedRun()SQL query.
Comments suppressed due to low confidence (1)
packages/sync-engine/src/database/postgres.ts:786
- This change affects whether reconciliation runs are skipped, but there’s no automated coverage for
getCompletedRun()behavior (success vs error) in the existing integration suite. Add an integration test (e.g. alongsidepostgres-sync-observability.test.ts) that creates a run with at least one errored object run, closes the run, and assertsgetCompletedRun()returnsnullfor that window while still returning a run for an all-complete case.
async getCompletedRun(
accountId: string,
intervalSeconds: number
): Promise<{ accountId: string; runStartedAt: Date } | null> {
const result = await this.query(
`SELECT r."_account_id", r.started_at
FROM "${this.syncSchema}"."_sync_runs" r
WHERE r."_account_id" = $1
AND r.closed_at IS NOT NULL
AND r.closed_at >= now() - make_interval(secs => $2)
AND r.error_message IS NULL
LIMIT 1`,
[accountId, intervalSeconds]
)
if (result.rows.length === 0) return null
const row = result.rows[0]
return { accountId: row._account_id, runStartedAt: row.started_at }
}
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| `SELECT r."_account_id", r.started_at | ||
| FROM "${this.syncSchema}"."_sync_runs" r | ||
| WHERE r."_account_id" = $1 | ||
| AND r.closed_at IS NOT NULL | ||
| AND r.closed_at >= now() - make_interval(secs => $2) | ||
| AND r.error_message IS NULL | ||
| LIMIT 1`, |
There was a problem hiding this comment.
getCompletedRun() now filters on r.error_message IS NULL, but _sync_runs.error_message is not set when object runs fail (errors are stored on _sync_obj_runs.error_message and surfaced via the sync_runs view). As a result, errored runs can still be treated as “successful”. Consider instead checking the derived run status (e.g. query "${this.syncSchema}"."sync_runs" with status = 'complete') or adding a NOT EXISTS subquery against _sync_obj_runs for status = 'error' for the run.
| WHERE r."_account_id" = $1 | ||
| AND r.closed_at IS NOT NULL | ||
| AND r.closed_at >= now() - make_interval(secs => $2) | ||
| AND r.error_message IS NULL |
There was a problem hiding this comment.
The query uses LIMIT 1 without an ORDER BY, so Postgres may return an arbitrary matching run. If the intent is to check the most recent successful completion (and log it), add an explicit ORDER BY r.closed_at DESC (or started_at DESC) before LIMIT 1 for deterministic behavior.
| AND r.error_message IS NULL | |
| AND r.error_message IS NULL | |
| ORDER BY r.closed_at DESC |
Summary
getCompletedRun()was returning runs that closed with an error (e.g., API throttling). This caused the sync worker to treat a failed run as a successful one, skipping re-sync for up to 7 days.Root cause: The SQL query only filtered on
closed_at IS NOT NULLbut did not checkerror_message. A run that hit a rate limit and closed with an error would satisfy the existing filter.Fix: Add
AND r.error_message IS NULLto the WHERE clause so only truly successful runs count.Test plan
error_messageset; confirmgetCompletedRun()returnsnullinstead of that runCloses #189