AI-written, but based on an error we observed in production logs; lightly edited by me.
The WorkOS Python SDK has no retry logic for transient failures. When a WorkOS API call encounters a timeout or transient server error, it fails immediately on the first attempt. This is especially problematic for operations like authenticate_with_refresh_token, which are inherently idempotent and safe to retry.
We hit this in production today. Our call to authenticate_with_refresh_token connected to api.workos.com successfully, but WorkOS never sent response headers. After 25 seconds (the SDK's DEFAULT_REQUEST_TIMEOUT), httpx.ReadTimeout was raised and our user's auth flow was broken.
A single automatic retry would have resolved this transparently.
Current behavior
AsyncHTTPClient.request() in workos/utils/http_client.py (line 237 at commit 588850d) makes a single request with no retry:

    response = await self._client.request(**prepared_request_parameters)
Transient failures — httpx.TimeoutException, httpx.ConnectError, HTTP 429, HTTP 5xx — all fail immediately.
Additionally, these transport-level exceptions are not caught or wrapped in WorkOS exception types, so they bubble up as raw httpx errors. Consumers catching workos.exceptions.BaseRequestException won't catch timeouts.
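To illustrate the wrapping we have in mind, here is a minimal sketch. It runs without httpx or workos installed, so every class in it is a stand-in: `BaseRequestException` only mimics the name of the SDK's base exception, `TransportException` is a hypothetical new subclass, and `FakeReadTimeout` plays the role of httpx.ReadTimeout.

```python
import asyncio


class BaseRequestException(Exception):
    """Stand-in for workos.exceptions.BaseRequestException (not the real class)."""


class TransportException(BaseRequestException):
    """Hypothetical SDK exception for timeouts and connection failures."""


class FakeReadTimeout(Exception):
    """Stand-in for httpx.ReadTimeout so the sketch runs without httpx."""


async def send_raw():
    # Simulates the server accepting the connection but never sending headers.
    raise FakeReadTimeout("timed out waiting for response headers")


async def request():
    try:
        return await send_raw()
    except FakeReadTimeout as exc:
        # Translate the transport error into the SDK hierarchy. `from exc`
        # keeps the original exception available as __cause__ for debugging.
        raise TransportException(str(exc)) from exc


async def main():
    try:
        await request()
    except BaseRequestException as exc:
        # Consumers only need to know about the SDK's own base exception.
        return type(exc).__name__, type(exc.__cause__).__name__


result = asyncio.run(main())
print(result)
```

With something like this in place, a consumer catching the SDK's base exception would also see timeouts, and the underlying httpx error would still be reachable through `__cause__`.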
Expected behavior
Comparison to other auth/identity SDKs:
| SDK | Default retries | Retryable conditions |
| --- | --- | --- |
| Auth0 Python | 2 | 408, 429, 5xx, connection errors |
| AWS SDKs | 3 | 429, 5xx, connection errors |
| WorkOS Python | 0 | None |
A reasonable default would be 2-3 retries with exponential backoff for:
- httpx.TimeoutException and httpx.ConnectError
- HTTP 429 (respecting the Retry-After header)
- HTTP 500, 502, 503, 504
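This could be implemented with a simple retry loop in request(). httpx's transport-level retries (the retries argument on its transports) are another option, but they only retry failed connection attempts, so on their own they would not cover read timeouts or retryable HTTP status codes.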
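A rough sketch of what such a retry loop could look like, under stated assumptions: it is not the SDK's code, and it avoids importing httpx so it runs standalone. `TransientTransportError` stands in for httpx.TimeoutException / httpx.ConnectError, `Response` is a minimal stand-in for an HTTP response, and the names `request_with_retries` and `backoff_delay` plus the default values (2 retries, 0.5 s base, 8 s cap) are illustrative only.

```python
import asyncio
import random

RETRYABLE_STATUS = {429, 500, 502, 503, 504}


class TransientTransportError(Exception):
    """Stand-in for httpx.TimeoutException / httpx.ConnectError."""


class Response:
    """Minimal stand-in for an HTTP response (status code and headers only)."""

    def __init__(self, status_code, headers=None):
        self.status_code = status_code
        self.headers = headers or {}


def backoff_delay(attempt, response=None, base=0.5, cap=8.0):
    """Seconds to wait before retry number `attempt` (1-based)."""
    if response is not None:
        retry_after = response.headers.get("Retry-After", "")
        # Only the delta-seconds form is handled; an HTTP-date value
        # falls through to the exponential backoff below.
        if retry_after.isdigit():
            return min(float(retry_after), cap)
    # Exponential backoff with full jitter: 0.5 s, 1 s, 2 s, ... capped.
    return random.uniform(0, min(cap, base * 2 ** (attempt - 1)))


async def request_with_retries(send, max_retries=2, sleep=asyncio.sleep):
    """Await `send()` up to 1 + max_retries times.

    Retries on TransientTransportError and on retryable status codes.
    After the last attempt, the exception is re-raised or the last
    response is returned so the caller can map it to an SDK error.
    """
    attempt = 0
    while True:
        try:
            response = await send()
        except TransientTransportError:
            if attempt >= max_retries:
                raise
            response = None
        else:
            if response.status_code not in RETRYABLE_STATUS or attempt >= max_retries:
                return response
        attempt += 1
        await sleep(backoff_delay(attempt, response))
```

In the SDK, `send` would presumably be a closure around the existing `self._client.request(...)` call, and the loop would only be enabled for calls that are safe to repeat, such as authenticate_with_refresh_token. The `sleep` parameter exists only so tests can skip the real waiting.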
Workaround
We currently wrap SDK calls in our own try/except httpx.ReadTimeout, but this requires knowing about httpx internals; the SDK's exception hierarchy should abstract transport errors away.
Environment
- workos v5.45.0
- httpx v0.28.1
- Python 3.14
- Async client (AsyncHTTPClient)