Leads 233 retry api call incase of error #173

Merged
asamal4 merged 3 commits into lightspeed-core:main from xmican10:LEADS-233-retry-api-call-incase-of-error
Feb 28, 2026

Conversation


@xmican10 xmican10 commented Feb 26, 2026

Description

  • After reaching the lightspeed-stack internal request quota, a 429 error is raised
  • The 429 error is caught for both /streaming and /query endpoints and retried, using the tenacity library and its exponential wait
  • A retry_attempts parameter was added to system.yaml to control the number of retries, with the default set to 10
  • If retry_attempts is exhausted and the 429 persists, an APIError is raised with a message specifying that the number of retries was reached
  • Verified for the OpenAI and VertexAI providers
  • Unit tests added, and existing unit tests refactored to use the common conftest.py APIConfigs
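The retry behavior described above can be sketched without tenacity; this is a plain-Python illustration of the semantics only (RateLimitError, the wrapper, and its parameters are stand-ins, not the PR's actual code, which uses tenacity's decorator and exponential wait):

```python
import time

class RateLimitError(Exception):
    """Stand-in for an HTTP 429 error raised by the client library."""

class APIError(Exception):
    """Raised once the configured retries are exhausted."""

def retry_on_429(func, retry_attempts=10, base_wait=1.0, max_wait=100.0, sleep=time.sleep):
    """Call func, retrying on 429 with exponential backoff.

    retry_attempts counts retries *after* the initial call, matching the
    semantics of the retry_attempts setting described above.
    """
    def wrapper(*args, **kwargs):
        for attempt in range(retry_attempts + 1):  # initial call + retries
            try:
                return func(*args, **kwargs)
            except RateLimitError:
                if attempt == retry_attempts:
                    raise APIError(
                        f"429 still returned after {retry_attempts} retries"
                    )
                # exponential backoff, capped at max_wait
                sleep(min(base_wait * 2 ** attempt, max_wait))
    return wrapper
```

In the real client, tenacity's `retry(...)` decorator with a 429 predicate plays the role of this wrapper.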

Type of change

  • Refactor
  • New feature
  • Bug fix
  • CVE fix
  • Optimization
  • Documentation Update
  • Configuration Update
  • Bump-up service version
  • Bump-up dependent library
  • Bump-up library or tool used for development (does not change the final image)
  • CI configuration change
  • Unit tests improvement

Tools used to create PR

Identify any AI code assistants used in this PR (for transparency and review context)

  • Assisted-by: (e.g., Claude, CodeRabbit, Ollama, etc., N/A if not used)
  • Generated by: (e.g., tool name and version; N/A if not used)

Related Tickets & Documents

  • Related Issue #
  • Closes #

Checklist before requesting a review

  • I have performed a self-review of my code.
  • PR has passed all pre-merge test jobs.
  • If it is a core feature, I have added thorough tests.

Testing

  • Please provide detailed steps to perform tests related to this code change.
  • How were the fix/results from this change verified? Please provide relevant screenshots or results.

Summary by CodeRabbit

  • New Features
    • Configurable retry for API rate-limit (429) responses (default 3 attempts) for both standard and streaming requests; caching consulted before queries.
  • Bug Fixes
    • Clearer error shown when retry attempts are exhausted; improved timeout and retry behavior for rate limiting.
  • Tests
    • Expanded tests covering retry/rate-limit scenarios and updated fixtures.


coderabbitai bot commented Feb 26, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.


No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between be31217 and a1dfbbc.

📒 Files selected for processing (5)
  • config/system.yaml
  • src/lightspeed_evaluation/core/api/client.py
  • src/lightspeed_evaluation/core/constants.py
  • src/lightspeed_evaluation/core/models/system.py
  • tests/unit/core/api/test_client.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/lightspeed_evaluation/core/constants.py

Walkthrough

Adds configurable retry-on-429 behavior (default 3) to the API client, centralizes client configuration via APIConfig, integrates disk-cache checks/updates into query flow, updates constructor signature to accept APIConfig, and updates tests/fixtures to validate retry detection, retry flows, and exhaustion handling.
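The retry-detection piece mentioned in the walkthrough can be illustrated with a minimal standalone sketch (the exception type and field names here are illustrative; the real `_is_too_many_requests_error` inspects the client library's actual exception types):

```python
from http import HTTPStatus

class HTTPError(Exception):
    """Simplified stand-in for the HTTP error type the client raises."""
    def __init__(self, status_code: int):
        super().__init__(f"HTTP {status_code}")
        self.status_code = status_code

def is_too_many_requests_error(exc: BaseException) -> bool:
    """Return True only for HTTP 429, so only rate limiting triggers a retry."""
    return (
        isinstance(exc, HTTPError)
        and exc.status_code == HTTPStatus.TOO_MANY_REQUESTS
    )
```

A predicate of this shape is what tenacity's `retry_if_exception` accepts, so non-429 errors fail fast instead of being retried.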

Changes

Cohort / File(s) — Summary

  • Configuration — config/system.yaml: Adds num_retries: 3 under the Lightspeed-stack API section to configure retries for 429 responses.
  • Constants — src/lightspeed_evaluation/core/constants.py: Adds DEFAULT_API_NUM_RETRIES = 3.
  • API Config Model — src/lightspeed_evaluation/core/models/system.py: Adds num_retries: int to APIConfig (default DEFAULT_API_NUM_RETRIES, ge=0) with metadata describing retry behavior for 429 responses.
  • Core API Client — src/lightspeed_evaluation/core/api/client.py: Adds _is_too_many_requests_error and _create_retry_decorator (tenacity) and applies retry-wrapped methods for standard and streaming queries; centralizes use of self.config (api_base, version, endpoint_type, timeout, num_retries); integrates disk-cache consult/update; re-raises 429 for retries and maps retry exhaustion to APIError; constructor now accepts APIConfig and reads the API key from the environment.
  • Tests: fixtures & client tests — tests/unit/core/api/conftest.py, tests/unit/core/api/test_client.py: Renames fixtures to basic_api_config_query_endpoint / basic_api_config_streaming_endpoint; updates tests to import and exercise _is_too_many_requests_error; adds tests for retry detection, standard and streaming retry flows, and retry exhaustion; updates tests to use client.config fields.

Sequence Diagram(s)

sequenceDiagram
    participant Client as APIClient
    participant Config as APIConfig
    participant Cache as Disk Cache
    participant Decorator as Retry Decorator
    participant API as HTTP API

    Client->>Config: read api_base, version, timeout, num_retries
    Client->>Cache: init/check (if cache_enabled)
    Client->>Decorator: call query via retry wrapper
    Decorator->>Cache: check cache for request (if applicable)
    alt cached
        Cache-->>Decorator: return cached response
        Decorator-->>Client: return APIResponse
    else not cached
        Decorator->>API: POST request
        alt 429 Too Many Requests
            API-->>Decorator: 429 response
            Decorator->>Decorator: backoff & retry (up to num_retries)
            Decorator->>API: retry POST request
            API-->>Decorator: success (200) or final 429
        else Other HTTP Error
            API-->>Decorator: non-429 error
            Decorator-->>Client: raise APIError
        else Success
            API-->>Decorator: 200 response
        end
        Decorator->>Cache: update cache (if enabled)
        Decorator-->>Client: return APIResponse
    end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 inconclusive)

  • Title check — ❓ Inconclusive. The title references LEADS-233 (the issue/feature number) but is vague about the main change: "retry api call incase of error" does not name the specific error type (429) or say whether this adds retry logic or fixes existing retry behavior. Resolution: revise the title to something more specific, such as "Add 429 retry logic with exponential backoff to API client" or "Implement retry mechanism for rate-limiting 429 errors", and drop the issue-number prefix.
✅ Passed checks (2 passed)
  • Description Check — ✅ Passed. Check skipped because CodeRabbit's high-level summary is enabled.
  • Docstring Coverage — ✅ Passed. Docstring coverage is 97.44%, which meets the required threshold of 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment


Comment @coderabbitai help to get the list of available commands and usage tips.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/lightspeed_evaluation/core/api/client.py`:
- Around line 64-67: The configured retry_attempts is being passed directly to
tenacity.stop_after_attempt which treats the value as total attempts (initial
call + retries), causing an off-by-one; update the retry policy in the
retry(...) call to stop after self.config.retry_attempts + 1 attempts (i.e.,
convert configured retries to total attempts) so the semantics match the
retry_attempts name, and adjust any related tests/assertions accordingly;
reference the retry(...) invocation and the _is_too_many_requests_error
predicate and use self.config.retry_attempts when computing the +1.
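The off-by-one described in this comment can be seen with a small stdlib model of tenacity's stop_after_attempt semantics (tenacity itself is not imported here; run_with_stop_after_attempt is a hypothetical stand-in that mirrors how the real stop condition counts):

```python
def run_with_stop_after_attempt(func, max_attempts: int):
    """Minimal model of tenacity.stop_after_attempt: it bounds the *total*
    number of attempts, including the initial call, not just the retries."""
    last_exc = None
    for _ in range(max_attempts):
        try:
            return func()
        except Exception as exc:  # in the real client, only 429s are retried
            last_exc = exc
    raise last_exc
```

With `retry_attempts = 3`, passing `3` to this stop condition allows only two retries after the initial call, while `retry_attempts + 1` gives the intended initial call plus three retries.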

ℹ️ Review info

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e395459 and d0a74b5.

📒 Files selected for processing (6)
  • config/system.yaml
  • src/lightspeed_evaluation/core/api/client.py
  • src/lightspeed_evaluation/core/constants.py
  • src/lightspeed_evaluation/core/models/system.py
  • tests/unit/core/api/conftest.py
  • tests/unit/core/api/test_client.py

@xmican10 xmican10 force-pushed the LEADS-233-retry-api-call-incase-of-error branch from ca74a80 to 5d322d4 on February 26, 2026 at 10:02

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (1)
tests/unit/core/api/test_client.py (1)

517-608: Neutralize tenacity backoff delays in retry tests for faster, more deterministic execution.

The retry decorator applies wait_exponential(multiplier=1, min=4, max=100) to decorated methods (lines 60–61 in client.py), causing real sleep delays (minimum 4 seconds per retry) when tests trigger 429 error paths. Patch tenacity's wait strategy to wait_none() in the three affected tests to eliminate these delays while preserving retry count assertions:

  • test_standard_query_retries_on_429_then_succeeds
  • test_streaming_query_retries_on_429_then_succeeds
  • test_query_raises_api_error_after_max_retries

Example: mocker.patch("lightspeed_evaluation.core.api.client.wait_exponential", return_value=tenacity.wait_none())
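The same idea can be sketched with the standard library instead of pytest-mock: patch out the sleep so the retry path still runs, but without the real backoff delay (the toy retry loop and exception below are illustrative, not the client's actual code):

```python
import time
from unittest import mock

class RateLimited(Exception):
    """Stand-in for a 429 error."""

def call_with_retries(func, retries: int):
    """Toy retry loop that sleeps between attempts, like the real client."""
    for attempt in range(retries + 1):
        try:
            return func()
        except RateLimited:
            if attempt == retries:
                raise
            time.sleep(4)  # stand-in for wait_exponential(min=4, ...)

def test_retries_without_real_sleeping():
    calls = []
    def flaky():
        calls.append(1)
        if len(calls) < 3:
            raise RateLimited()
        return "ok"
    # Patch the sleep so the test does not actually wait ~8 seconds.
    with mock.patch("time.sleep") as fake_sleep:
        assert call_with_retries(flaky, retries=5) == "ok"
    assert len(calls) == 3           # retry count assertion still holds
    assert fake_sleep.call_count == 2  # one sleep per failed attempt
```

In the suggested fix, `mocker.patch(...)` replacing `wait_exponential` with `tenacity.wait_none()` achieves the same effect for the tenacity-decorated methods.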

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/unit/core/api/test_client.py` around lines 517 - 608, The retry backoff
in APIClient (decorated with wait_exponential) causes real delays in the three
tests; patch the wait strategy to tenacity.wait_none() at the start of each test
to neutralize sleeping: in test_standard_query_retries_on_429_then_succeeds,
test_streaming_query_retries_on_429_then_succeeds, and
test_query_raises_api_error_after_max_retries use
mocker.patch("lightspeed_evaluation.core.api.client.wait_exponential",
return_value=tenacity.wait_none()) so the decorated methods (the retry behavior
on 429) still run but without real backoff delays, keeping retry count
assertions intact.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/lightspeed_evaluation/core/api/client.py`:
- Line 66: The inline trailing comment on the call to stop_after_attempt in the
client retry configuration (the line containing
stop=stop_after_attempt(self.config.retry_attempts + 1)) exceeds
line-length/Black rules; move the explanatory comment to its own preceding line
or shorten it so the call line fits the formatter (for example place a
standalone comment like "# +1 to account for the initial attempt" above the
stop=... expression) and ensure you keep the exact expression
stop=stop_after_attempt(self.config.retry_attempts + 1) unchanged.

---

Nitpick comments:
In `@tests/unit/core/api/test_client.py`:
- Around line 517-608: The retry backoff in APIClient (decorated with
wait_exponential) causes real delays in the three tests; patch the wait strategy
to tenacity.wait_none() at the start of each test to neutralize sleeping: in
test_standard_query_retries_on_429_then_succeeds,
test_streaming_query_retries_on_429_then_succeeds, and
test_query_raises_api_error_after_max_retries use
mocker.patch("lightspeed_evaluation.core.api.client.wait_exponential",
return_value=tenacity.wait_none()) so the decorated methods (the retry behavior
on 429) still run but without real backoff delays, keeping retry count
assertions intact.

ℹ️ Review info

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d0a74b5 and ca74a80.

📒 Files selected for processing (2)
  • src/lightspeed_evaluation/core/api/client.py
  • tests/unit/core/api/test_client.py


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tests/unit/core/api/test_client.py`:
- Around line 517-520: The docstring for the test function
test_standard_query_retries_on_429_then_succeeds contains a typo: it uses "than"
where it should be "then"; update the docstring in
test_standard_query_retries_on_429_then_succeeds (and the other similar
docstring around line 550) to replace "than" with "then" so the sentence reads
"...retries on 429 error and then succeeds on retry."

ℹ️ Review info

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ca74a80 and 5d322d4.

📒 Files selected for processing (2)
  • src/lightspeed_evaluation/core/api/client.py
  • tests/unit/core/api/test_client.py

@xmican10 xmican10 force-pushed the LEADS-233-retry-api-call-incase-of-error branch from e7bf583 to 69b86c3 on February 26, 2026 at 10:19

@VladimirKadlec VladimirKadlec left a comment


LGTM, nice work 💪


@asamal4 asamal4 left a comment


Thank you !!
some minor comments..


@coderabbitai coderabbitai bot left a comment


♻️ Duplicate comments (2)
tests/unit/core/api/test_client.py (1)

547-550: ⚠️ Potential issue | 🟡 Minor

Fix the remaining docstring typo.

Line 550 still uses “than” where “then” is intended.

✏️ Proposed fix
-        """Test streaming query retries on 429 error and than succeeds on retry."""
+        """Test streaming query retries on 429 error and then succeeds on retry."""
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/unit/core/api/test_client.py` around lines 547 - 550, Fix the docstring
typo in the test_streaming_query_retries_on_429_then_succeeds function: change
the word "than" to "then" so the docstring reads "retries on 429 error and then
succeeds on retry." Ensure the updated text is in the function's docstring line.
src/lightspeed_evaluation/core/api/client.py (1)

123-136: ⚠️ Potential issue | 🟠 Major

Cache lookups can return incorrect responses across conversations/endpoints.

Line 123 and Line 134 now enable cache read/write for all queries, but the cache key does not include conversation_id, endpoint_type, or API version. This can serve stale/wrong responses for follow-up turns or different endpoint modes sharing the same cache directory.

💡 Proposed fix
 def _get_cache_key(self, request: APIRequest) -> str:
     """Get cache key for the query."""
     # Note, python hash is initialized randomly so can't be used here
     request_dict = request.model_dump()
-    keys_to_hash = [
-        "query",
-        "provider",
-        "model",
-        "no_tools",
-        "system_prompt",
-        "attachments",
-    ]
-    str_request = ",".join([str(request_dict[k]) for k in keys_to_hash])
+    key_payload = {
+        "query": request_dict.get("query"),
+        "provider": request_dict.get("provider"),
+        "model": request_dict.get("model"),
+        "no_tools": request_dict.get("no_tools"),
+        "system_prompt": request_dict.get("system_prompt"),
+        "attachments": request_dict.get("attachments"),
+        "conversation_id": request_dict.get("conversation_id"),
+        "endpoint_type": self.config.endpoint_type,
+        "api_version": self.config.version,
+    }
+    str_request = json.dumps(key_payload, sort_keys=True, default=str)
 
     return hashlib.sha256(str_request.encode()).hexdigest()
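A standalone sketch of the namespaced key in the diff above (make_cache_key and its parameters are illustrative; in the real client this logic lives on the client class and reads self.config):

```python
import hashlib
import json

def make_cache_key(request: dict, endpoint_type: str, api_version: str) -> str:
    """Build a deterministic cache key namespaced by conversation, endpoint
    mode, and API version, so entries cannot collide across them."""
    key_payload = {
        "query": request.get("query"),
        "provider": request.get("provider"),
        "model": request.get("model"),
        "no_tools": request.get("no_tools"),
        "system_prompt": request.get("system_prompt"),
        "attachments": request.get("attachments"),
        "conversation_id": request.get("conversation_id"),
        "endpoint_type": endpoint_type,
        "api_version": api_version,
    }
    # json.dumps with sort_keys gives a stable serialization; Python's
    # built-in hash() is randomized per process and cannot be used here.
    str_request = json.dumps(key_payload, sort_keys=True, default=str)
    return hashlib.sha256(str_request.encode()).hexdigest()
```

Two requests that differ only in conversation_id, endpoint type, or API version now hash to different keys, which is exactly the collision the comment flags.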
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/lightspeed_evaluation/core/api/client.py` around lines 123 - 136, The
cache lookup/write in methods _get_cached_response and _add_response_to_cache
currently uses a key that can collide across conversations and endpoint modes;
update the cache key generation to incorporate unique request context such as
api_request.conversation_id (or conversation_id if separate),
self.config.endpoint_type, and the API version (e.g., self.config.api_version or
api_request.api_version) so that cached entries are namespaced by conversation
and endpoint/version, and use that enhanced key both when reading (in
_get_cached_response) and writing (in _add_response_to_cache) to prevent
returning stale or cross-conversation responses.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@src/lightspeed_evaluation/core/api/client.py`:
- Around line 123-136: The cache lookup/write in methods _get_cached_response
and _add_response_to_cache currently uses a key that can collide across
conversations and endpoint modes; update the cache key generation to incorporate
unique request context such as api_request.conversation_id (or conversation_id
if separate), self.config.endpoint_type, and the API version (e.g.,
self.config.api_version or api_request.api_version) so that cached entries are
namespaced by conversation and endpoint/version, and use that enhanced key both
when reading (in _get_cached_response) and writing (in _add_response_to_cache)
to prevent returning stale or cross-conversation responses.

In `@tests/unit/core/api/test_client.py`:
- Around line 547-550: Fix the docstring typo in the
test_streaming_query_retries_on_429_then_succeeds function: change the word
"than" to "then" so the docstring reads "retries on 429 error and then succeeds
on retry." Ensure the updated text is in the function's docstring line.

ℹ️ Review info

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5d322d4 and be31217.

📒 Files selected for processing (6)
  • config/system.yaml
  • src/lightspeed_evaluation/core/api/client.py
  • src/lightspeed_evaluation/core/constants.py
  • src/lightspeed_evaluation/core/models/system.py
  • tests/unit/core/api/conftest.py
  • tests/unit/core/api/test_client.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/lightspeed_evaluation/core/models/system.py

@xmican10 xmican10 force-pushed the LEADS-233-retry-api-call-incase-of-error branch from be31217 to a1dfbbc on February 27, 2026 at 11:31
@asamal4 asamal4 merged commit 07629b6 into lightspeed-core:main Feb 28, 2026
15 checks passed

3 participants