
feat(spec): resilientFetch — timeout + backoff for outbound HTTP (P1-1)#1593

Merged
xuyushun441-sys merged 1 commit into main from fix/external-call-resilience on Jun 5, 2026


@xuyushun441-sys

Contributor

Closes launch-readiness P1-1. All 4 call sites were verified against main, and each is a real gap: a naked fetch or SDK call with no timeout or retry.

Fix

New shared resilientFetch (@objectstack/spec/shared) — dependency-free wrapper:

  • per-attempt timeout via AbortController (default 30s);
  • exponential backoff + jitter, up to 3 attempts, on network errors / 429 / 5xx;
  • honours Retry-After on a 429;
  • never retries a caller-initiated abort (intentional cancellation ≠ transient failure).
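The behaviour listed above can be sketched roughly as follows. This is a minimal illustration, not the helper that ships in @objectstack/spec/shared: the option names (`timeoutMs`, `maxAttempts`, `baseDelayMs`, `fetchImpl`), the 250 ms backoff base, the seconds-only Retry-After parsing, and returning the last response once attempts run out are all assumptions made for the example.

```typescript
// Hypothetical option names; the real helper's signature is not shown in this PR.
interface ResilientFetchOptions {
  timeoutMs?: number; // per-attempt timeout (PR default: 30s)
  maxAttempts?: number; // total tries (PR default: 3)
  baseDelayMs?: number; // assumed backoff base
  fetchImpl?: typeof fetch; // injectable for tests (assumption)
}

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

const isRetryableStatus = (status: number) => status === 429 || status >= 500;

// Handles only the delta-seconds form of Retry-After; the HTTP-date form is omitted.
function retryAfterMs(res: Response): number | undefined {
  const raw = res.headers.get("retry-after")?.trim();
  return raw !== undefined && /^\d+$/.test(raw) ? Number(raw) * 1000 : undefined;
}

async function resilientFetch(
  input: string | URL,
  init: RequestInit = {},
  opts: ResilientFetchOptions = {},
): Promise<Response> {
  const { timeoutMs = 30_000, maxAttempts = 3, baseDelayMs = 250, fetchImpl = fetch } = opts;
  const callerSignal = init.signal ?? undefined;

  for (let attempt = 1; ; attempt++) {
    // A caller-initiated abort is intentional, so it is surfaced and never retried.
    if (callerSignal?.aborted) throw callerSignal.reason;

    // Fresh controller per attempt: aborted by the timer or by the caller's signal.
    const controller = new AbortController();
    const onCallerAbort = () => controller.abort(callerSignal?.reason);
    callerSignal?.addEventListener("abort", onCallerAbort, { once: true });
    const timer = setTimeout(
      () => controller.abort(new Error(`request timed out after ${timeoutMs}ms`)),
      timeoutMs,
    );

    // Exponential backoff with up to 100% random jitter on top.
    let delayMs = baseDelayMs * 2 ** (attempt - 1);
    delayMs += Math.random() * delayMs;

    try {
      const res = await fetchImpl(input, { ...init, signal: controller.signal });
      if (!isRetryableStatus(res.status) || attempt >= maxAttempts) return res;
      if (res.status === 429) delayMs = retryAfterMs(res) ?? delayMs;
      await res.body?.cancel().catch(() => undefined); // discard the unused body
    } catch (err) {
      // Network errors and timeouts are retried; caller aborts and the last attempt are not.
      if (callerSignal?.aborted || attempt >= maxAttempts) throw err;
    } finally {
      clearTimeout(timer);
      callerSignal?.removeEventListener("abort", onCallerAbort);
    }
    await sleep(delayMs);
  }
}
```

In this sketch the backoff sleep itself is not interrupted by a caller abort; the abort is noticed at the top of the next loop iteration. A call site would look like `await resilientFetch(url, { method: "POST", body })`, with the defaults applying when no options are passed.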

Wired into the three raw-fetch packages: connector-rest, connector-slack, embedder-openai. connector-mcp talks through the MCP SDK transport, so it instead gets a 30s per-request timeout on callTool / listTools.
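For an SDK call that only exposes a promise, a per-request timeout can be approximated by racing the call against a timer. The sketch below is a generic illustration under that assumption; the PR does not show whether connector-mcp passes a timeout option to the SDK or wraps the call, and `withTimeout` is a hypothetical helper name.

```typescript
// Hypothetical helper: rejects if `work` has not settled within `ms` milliseconds.
// Note that it only stops waiting; it does not cancel the underlying request.
function withTimeout<T>(work: Promise<T>, ms: number, label: string): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error(`${label} timed out after ${ms}ms`)), ms);
  });
  // Whichever settles first wins; the timer is always cleared afterwards.
  return Promise.race([work, timeout]).finally(() => clearTimeout(timer));
}

// Usage sketch (`client` stands in for an MCP SDK client and is not defined here):
// const result = await withTimeout(client.callTool({ name, arguments: args }), 30_000, "callTool");
```

Where an SDK accepts its own per-request timeout option, that is usually preferable, since the SDK can also clean up the pending request.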

Scope decision

A stateful per-host circuit breaker (part of the original action) is deliberately deferred to a follow-up. Timeout + backoff already removes the stated "slow/rate-limited API hangs the agent turn with no recovery" risk; a breaker is separate hardening that needs per-host state design. Per-call configurability of timeout/retry is also a small follow-up. Both are noted in docs/launch-readiness.md (P1-1).

Tests (+13)

resilient-fetch.test.ts (9): success-no-retry, 429-retry, 5xx-retry, give-up-after-N, non-retryable-4xx, Retry-After honoured, timeout, network-error-retry, caller-abort-no-retry. Plus a connector-rest end-to-end retry test. Existing connector/embedder suites green (rest 11, slack 7, mcp 8, embedder 14).
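Cases such as 429-retry and network-error-retry are typically driven by a fake fetch that returns a scripted sequence of outcomes. The sketch below shows one way to build such a stub; `scriptedFetch` is a hypothetical helper, and the PR does not show how resilient-fetch.test.ts actually stubs fetch.

```typescript
// Hypothetical test helper: each call consumes the next scripted step, where a
// number is an HTTP status to return and an Error is thrown as a network failure.
// Once the script is exhausted, the last step repeats.
function scriptedFetch(steps: Array<number | Error>) {
  if (steps.length === 0) throw new Error("scriptedFetch needs at least one step");
  const calls: string[] = []; // requested URLs, in call order
  const fetchImpl = async (input: string | URL): Promise<Response> => {
    calls.push(String(input));
    const step = steps[Math.min(calls.length - 1, steps.length - 1)];
    if (step instanceof Error) throw step;
    return new Response(null, { status: step });
  };
  return { fetchImpl, calls };
}

// Usage sketch inside a vitest test:
// const { fetchImpl, calls } = scriptedFetch([429, 200]);
// vi.stubGlobal("fetch", fetchImpl);
// ...call the code under test, then: expect(calls).toHaveLength(2);
```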

Also added the missing @objectstack/spec/shared alias to embedder's vitest config. It had /contracts but not /shared, so the new value-import fell through to the coarse @objectstack/spec → src/index.ts alias and didn't resolve.
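A config along these lines illustrates the alias fix. It is a sketch only: the relative paths and the `specSrc` helper are hypothetical, and the embedder's actual vitest config is not shown in this PR.

```typescript
// vitest.config.ts (sketch; every path below is hypothetical)
import { fileURLToPath } from "node:url";
import { defineConfig } from "vitest/config";

// Resolves a file inside the spec package's source tree (assumed location).
const specSrc = (p: string) => fileURLToPath(new URL(`../spec/src/${p}`, import.meta.url));

export default defineConfig({
  resolve: {
    alias: [
      // Subpath aliases come first: entries are tried in order, and the coarse
      // "@objectstack/spec" entry would otherwise also match "@objectstack/spec/shared".
      { find: "@objectstack/spec/contracts", replacement: specSrc("contracts/index.ts") },
      { find: "@objectstack/spec/shared", replacement: specSrc("shared/index.ts") },
      { find: "@objectstack/spec", replacement: specSrc("index.ts") },
    ],
  },
});
```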

🤖 Generated with Claude Code

Connectors/embedder used naked fetch with no timeout or retry, so a slow or
rate-limited external API could hang an agent turn with no recovery.

New shared resilientFetch (@objectstack/spec/shared): 30s per-attempt timeout
(AbortController) + exponential backoff w/ jitter (3 tries) on network errors /
429 / 5xx, honouring Retry-After; never retries a caller-initiated abort.

Wired into connector-rest, connector-slack, embedder-openai. connector-mcp uses
the MCP SDK transport, so it gets a 30s per-request timeout on callTool/listTools
instead. (Also added the missing @objectstack/spec/shared alias to embedder's
vitest config so the subpath value-import resolves.)

Circuit breaker deliberately deferred (stateful/per-host); timeout+backoff
already removes the hang/no-recovery risk. +13 tests (helper 9, connector retry 1,
existing suites green). docs/launch-readiness.md P1-1 updated.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@vercel

vercel Bot commented Jun 5, 2026


The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| spec | Ready | Preview, Comment | Jun 5, 2026 2:06am |


@github-actions github-actions Bot added the documentation, tests, tooling, and size/m labels Jun 5, 2026
@xuyushun441-sys xuyushun441-sys merged commit d5a8161 into main Jun 5, 2026
12 checks passed
@xuyushun441-sys xuyushun441-sys deleted the fix/external-call-resilience branch June 5, 2026 02:10


2 participants