
Add exponential backoff to onError retry path #4337

Open
KyleAMathews wants to merge 3 commits into main from on-error


@KyleAMathews
Contributor

Summary

Adds full-jitter exponential backoff to ShapeStream's onError retry path, preventing tight retry loops when persistent errors (e.g., 400s, exhausted network retries) trigger repeated onError → retry cycles. Uses the stream's existing backoffOptions timings for consistency with the fetch layer.

Root Cause

Non-429 4xx errors intentionally bypass the fetch-layer backoff so that onError handlers can repair auth tokens or params before retrying. Network failures that exhaust the fetch-layer backoff also bubble up through onError. Previously, when the handler returned retry options, the retry happened immediately with no delay, creating a tight loop that could hammer a server returning persistent errors.

Approach

A new #backoffOnErrorRetry method applies the same full-jitter exponential backoff strategy used by the fetch layer (L5 in SPEC.md). Key design decisions:

  • Reuses backoffOptions: No new configuration surface — inherits initialDelay, multiplier, maxDelay from the same options that configure fetch-level backoff
  • Abort-aware: The delay promise resolves immediately on abort signal, with a post-backoff check that tears down the stream instead of retrying
  • Consolidated state: Three separate private fields consolidated into a single #onErrorBackoff object
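
The approach above can be sketched roughly as follows. This is a minimal illustration, not the code in src/client.ts: the names `fullJitterDelay`, `abortableDelay`, and `backoffThenShouldRetry` are hypothetical, and only the option names initialDelay, multiplier, and maxDelay come from the description.

```typescript
// Hypothetical sketch of full-jitter backoff with an abort-aware delay.
// Only the option names (initialDelay, multiplier, maxDelay) come from the PR text.
interface BackoffOptions {
  initialDelay: number // milliseconds
  maxDelay: number // milliseconds
  multiplier: number
}

// Full jitter: a uniform delay in [0, cap), where the cap grows
// exponentially with the attempt number and is clamped to maxDelay.
function fullJitterDelay(
  attempt: number, // 0 for the first retry
  opts: BackoffOptions,
  random: () => number = Math.random
): number {
  const cap = Math.min(
    opts.maxDelay,
    opts.initialDelay * Math.pow(opts.multiplier, attempt)
  )
  return random() * cap
}

// Resolves after `ms`, or immediately once the signal aborts.
// Both paths release the timer and the abort listener.
function abortableDelay(ms: number, signal?: AbortSignal): Promise<void> {
  return new Promise<void>((resolve) => {
    if (signal?.aborted) {
      resolve()
      return
    }
    const onAbort = () => {
      clearTimeout(timer)
      resolve()
    }
    const timer = setTimeout(() => {
      signal?.removeEventListener("abort", onAbort)
      resolve()
    }, ms)
    signal?.addEventListener("abort", onAbort, { once: true })
  })
}

// Waits out the backoff, then reports whether the caller should retry.
// A `false` result corresponds to the post-backoff abort check (tear down).
async function backoffThenShouldRetry(
  attempt: number,
  opts: BackoffOptions,
  signal?: AbortSignal,
  random: () => number = Math.random
): Promise<boolean> {
  await abortableDelay(fullJitterDelay(attempt, opts, random), signal)
  return !signal?.aborted
}
```

Passing `random` as a parameter is a choice made for this sketch so that the delay can be made deterministic; the description only says that the tests mock Math.random.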

Key Invariants

  1. The consecutive error retry guard (50 max) is still the hard limit: backoff adds pacing and does not replace the guard
  2. The counter resets on successful message batch OR accepted 204
  3. Abort during backoff → immediate teardown, no leaked timers or listeners
  4. First retry gets delay in [0, initialDelay), subsequent retries escalate exponentially up to maxDelay
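
As a rough illustration of invariants 1, 2, and 4, the sketch below models the retry state as a small class. The class name `OnErrorRetryPacer`, its methods, and the exact comparison used for the guard are assumptions made for this example; only the limit of 50 and the option names come from the description.

```typescript
// Hypothetical model of the consolidated onError backoff state.
const MAX_CONSECUTIVE_ERROR_RETRIES = 50 // the "50 max" guard from the description

interface PacingOptions {
  initialDelay: number // milliseconds
  maxDelay: number // milliseconds
  multiplier: number
}

class OnErrorRetryPacer {
  consecutiveRetries = 0
  readonly options: PacingOptions

  constructor(options: PacingOptions) {
    this.options = options
  }

  // Exclusive upper bound of the next jittered delay: initialDelay for the
  // first retry, then growing by `multiplier` per retry, capped at maxDelay.
  nextDelayCap(): number {
    const { initialDelay, maxDelay, multiplier } = this.options
    return Math.min(
      maxDelay,
      initialDelay * Math.pow(multiplier, this.consecutiveRetries)
    )
  }

  // Returns false once the hard limit is reached. Backoff only paces the
  // retries that the guard still allows.
  tryBeginRetry(): boolean {
    if (this.consecutiveRetries >= MAX_CONSECUTIVE_ERROR_RETRIES) return false
    this.consecutiveRetries += 1
    return true
  }

  // Called on a successful message batch or an accepted 204.
  reset(): void {
    this.consecutiveRetries = 0
  }
}
```

With initialDelay 100, multiplier 2, and maxDelay 1000 (values chosen only for this example), the caps run 100, 200, 400, 800, 1000, 1000, and so on until `reset()` returns the cap to 100.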

Non-goals

  • No separate onErrorBackoffOptions config — reusing backoffOptions avoids API surface growth
  • No debug logging during backoff — decided against to avoid log noise

Verification

cd packages/typescript-client
pnpm test --run  # requires Electric server on :3000

New tests:

  • backs off before retrying when onError returns retry options — verifies timing with mocked Math.random
  • tears down promptly when aborted during onError backoff — verifies abort-aware cleanup

Existing onError tests updated with fast backoffOptions (initialDelay: 2ms) to stay snappy.
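
To show why mocking Math.random makes the timing assertable, here is a small self-contained sketch. The helpers `jitteredDelay` and `withFixedRandom` are hypothetical, and the maxDelay and multiplier values are assumptions; only initialDelay: 2ms is taken from the description. In a Vitest suite the same effect is usually achieved with `vi.spyOn(Math, 'random').mockReturnValue(...)` rather than a hand-rolled helper.

```typescript
// Hypothetical helpers; not the test code from test/stream.test.ts.
interface FastBackoffOptions {
  initialDelay: number // milliseconds
  maxDelay: number // milliseconds
  multiplier: number
}

// Full-jitter delay that reads Math.random directly, as production code might.
function jitteredDelay(attempt: number, opts: FastBackoffOptions): number {
  const cap = Math.min(
    opts.maxDelay,
    opts.initialDelay * Math.pow(opts.multiplier, attempt)
  )
  return Math.random() * cap
}

// Temporarily pins Math.random to a fixed value and always restores it.
function withFixedRandom<T>(value: number, fn: () => T): T {
  const original = Math.random
  Math.random = () => value
  try {
    return fn()
  } finally {
    Math.random = original
  }
}

// initialDelay of 2 ms follows the description; the other values are assumed.
const fastOptions: FastBackoffOptions = { initialDelay: 2, maxDelay: 8, multiplier: 2 }

const pinnedDelays = withFixedRandom(0.5, () =>
  [0, 1, 2, 3].map((attempt) => jitteredDelay(attempt, fastOptions))
)
console.log(pinnedDelays) // [ 1, 2, 4, 4 ]
```

With the random source pinned at 0.5, each delay is exactly half of its cap, so a test can assert on precise wait times instead of ranges.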

Files Changed

| File | Change |
| --- | --- |
| src/client.ts | New #backoffOnErrorRetry method, #onErrorBackoff config object, abort-aware delay at L5 call site |
| SPEC.md | Updated L5 description, guard mechanism docs, and all loop-back line numbers |
| test/stream.test.ts | 2 new tests (backoff timing, abort-during-backoff) + fast backoff options on 4 existing tests |

🤖 Generated with Claude Code

KyleAMathews and others added 2 commits May 14, 2026 16:46
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@pkg-pr-new

pkg-pr-new Bot commented May 14, 2026


npm i https://pkg.pr.new/@electric-sql/react@4337
npm i https://pkg.pr.new/@electric-sql/client@4337
npm i https://pkg.pr.new/@electric-sql/y-electric@4337

commit: b8b52f8

@codecov

codecov Bot commented May 14, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 73.58%. Comparing base (6aa0186) to head (b8b52f8).
✅ All tests successful. No failed tests found.

Additional details and impacted files
@@             Coverage Diff             @@
##             main    #4337       +/-   ##
===========================================
+ Coverage   36.09%   73.58%   +37.48%     
===========================================
  Files         172       81       -91     
  Lines       12150     8768     -3382     
  Branches     3977     2532     -1445     
===========================================
+ Hits         4386     6452     +2066     
+ Misses       7754     2305     -5449     
- Partials       10       11        +1     
| Flag | Coverage Δ |
| --- | --- |
| packages/agents-server | 73.42% <ø> (-0.04%) ⬇️ |
| packages/agents-server-ui | ? |
| packages/electric-ax | 37.59% <ø> (?) |
| packages/experimental | 87.73% <ø> (?) |
| packages/react-hooks | 86.48% <ø> (?) |
| packages/typescript-client | 94.39% <100.00%> (?) |
| packages/y-electric | 56.05% <ø> (?) |
| typescript | 73.58% <100.00%> (+37.48%) ⬆️ |
| unit-tests | 73.58% <100.00%> (+37.48%) ⬆️ |


@alco
Member

alco commented May 14, 2026

Fixes #3895 ?

The test uses fake timers with advanceTimersByTimeAsync(0), which
can't advance the new onError backoff timer. Adding zero-delay
backoffOptions ensures the retry fires within the fake timer ticks.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
