Skip to content

fix(postgres): exclude error-closed runs from getCompletedRun check#195

Open
matingathani wants to merge 1 commit intostripe:mainfrom
matingathani:issue-189-completed-run
Open

fix(postgres): exclude error-closed runs from getCompletedRun check#195
matingathani wants to merge 1 commit intostripe:mainfrom
matingathani:issue-189-completed-run

Conversation

@matingathani
Copy link
Copy Markdown

Summary

getCompletedRun() was returning runs that closed with an error (e.g., API throttling). This caused the sync worker to treat a failed run as a successful one, skipping re-sync for up to 7 days.

Root cause: The SQL query only filtered on closed_at IS NOT NULL but did not check error_message. A run that hit a rate limit and closed with an error would satisfy the existing filter.

Fix: Add AND r.error_message IS NULL to the WHERE clause so only truly successful runs count.

Test plan

  • Integration test: create a run that closes with error_message set; confirm getCompletedRun() returns null instead of that run
  • Verify a clean successful run (null error_message) still satisfies the check

Closes #189

Copilot AI review requested due to automatic review settings March 30, 2026 00:52
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adjusts the sync-engine Postgres “recent successful run” detection to avoid treating errored runs as completed, preventing the worker from skipping resync after failures (e.g., Stripe throttling).

Changes:

  • Add an error_message IS NULL filter to the getCompletedRun() SQL query.
Comments suppressed due to low confidence (1)

packages/sync-engine/src/database/postgres.ts:786

  • This change affects whether reconciliation runs are skipped, but there’s no automated coverage for getCompletedRun() behavior (success vs error) in the existing integration suite. Add an integration test (e.g. alongside postgres-sync-observability.test.ts) that creates a run with at least one errored object run, closes the run, and asserts getCompletedRun() returns null for that window while still returning a run for an all-complete case.
  async getCompletedRun(
    accountId: string,
    intervalSeconds: number
  ): Promise<{ accountId: string; runStartedAt: Date } | null> {
    const result = await this.query(
      `SELECT r."_account_id", r.started_at
       FROM "${this.syncSchema}"."_sync_runs" r
       WHERE r."_account_id" = $1
         AND r.closed_at IS NOT NULL
         AND r.closed_at >= now() - make_interval(secs => $2)
         AND r.error_message IS NULL
       LIMIT 1`,
      [accountId, intervalSeconds]
    )

    if (result.rows.length === 0) return null
    const row = result.rows[0]
    return { accountId: row._account_id, runStartedAt: row.started_at }
  }

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 773 to 779
`SELECT r."_account_id", r.started_at
FROM "${this.syncSchema}"."_sync_runs" r
WHERE r."_account_id" = $1
AND r.closed_at IS NOT NULL
AND r.closed_at >= now() - make_interval(secs => $2)
AND r.error_message IS NULL
LIMIT 1`,
Copy link

Copilot AI Mar 30, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

getCompletedRun() now filters on r.error_message IS NULL, but _sync_runs.error_message is not set when object runs fail (errors are stored on _sync_obj_runs.error_message and surfaced via the sync_runs view). As a result, errored runs can still be treated as “successful”. Consider instead checking the derived run status (e.g. query "${this.syncSchema}"."sync_runs" with status = 'complete') or adding a NOT EXISTS subquery against _sync_obj_runs for status = 'error' for the run.

Copilot uses AI. Check for mistakes.
WHERE r."_account_id" = $1
AND r.closed_at IS NOT NULL
AND r.closed_at >= now() - make_interval(secs => $2)
AND r.error_message IS NULL
Copy link

Copilot AI Mar 30, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The query uses LIMIT 1 without an ORDER BY, so Postgres may return an arbitrary matching run. If the intent is to check the most recent successful completion (and log it), add an explicit ORDER BY r.closed_at DESC (or started_at DESC) before LIMIT 1 for deterministic behavior.

Suggested change
AND r.error_message IS NULL
AND r.error_message IS NULL
ORDER BY r.closed_at DESC

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

stripe_worker not checking sync_job for error status, not syncing for 7 days after an error occurs.

2 participants