Leads 233 retry api call incase of error #173

Merged
asamal4 merged 3 commits into lightspeed-core:main from xmican10:LEADS-233-retry-api-call-incase-of-error
Feb 28, 2026

Conversation


@xmican10 xmican10 commented Feb 26, 2026

Description

  • After reaching the lightspeed-stack internal request quota, a 429 error is raised
  • The 429 error is caught for both /streaming and /query endpoints and retried, using the tenacity library and its exponential wait
  • A retry_attempts parameter was added to system.yaml to control the number of retries, with the default set to 10
  • If retry_attempts is exhausted and the 429 persists, an APIError is raised with a message specifying that the number of retries was reached
  • Verified for the OpenAI and VertexAI providers
  • Unit tests added, and existing unit tests refactored to use the common conftest.py APIConfigs
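The retry behavior described above can be sketched without tenacity; this is a plain-Python illustration of the semantics only (RateLimitError, the wrapper, and its parameters are stand-ins, not the PR's actual code, which uses tenacity's decorator and exponential wait):

```python
import time

class RateLimitError(Exception):
    """Stand-in for an HTTP 429 error raised by the client library."""

class APIError(Exception):
    """Raised once the configured retries are exhausted."""

def retry_on_429(func, retry_attempts=10, base_wait=1.0, max_wait=100.0, sleep=time.sleep):
    """Call func, retrying on 429 with exponential backoff.

    retry_attempts counts retries *after* the initial call, matching the
    semantics of the retry_attempts setting described above.
    """
    def wrapper(*args, **kwargs):
        for attempt in range(retry_attempts + 1):  # initial call + retries
            try:
                return func(*args, **kwargs)
            except RateLimitError:
                if attempt == retry_attempts:
                    raise APIError(
                        f"429 still returned after {retry_attempts} retries"
                    )
                # exponential backoff, capped at max_wait
                sleep(min(base_wait * 2 ** attempt, max_wait))
    return wrapper
```

In the real client, tenacity's `retry(...)` decorator with a 429 predicate plays the role of this wrapper.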

Type of change

  • Refactor
  • New feature
  • Bug fix
  • CVE fix
  • Optimization
  • Documentation Update
  • Configuration Update
  • Bump-up service version
  • Bump-up dependent library
  • Bump-up library or tool used for development (does not change the final image)
  • CI configuration change
  • Unit tests improvement

Tools used to create PR

Identify any AI code assistants used in this PR (for transparency and review context)

  • Assisted-by: (e.g., Claude, CodeRabbit, Ollama, etc., N/A if not used)
  • Generated by: (e.g., tool name and version; N/A if not used)

Related Tickets & Documents

  • Related Issue #
  • Closes #

Checklist before requesting a review

  • I have performed a self-review of my code.
  • PR has passed all pre-merge test jobs.
  • If it is a core feature, I have added thorough tests.

Testing

  • Please provide detailed steps to perform tests related to this code change.
  • How were the fix/results from this change verified? Please provide relevant screenshots or results.

Summary by CodeRabbit

  • New Features
    • Configurable retry for API rate-limit (429) responses (default 3 attempts) for both standard and streaming requests; caching consulted before queries.
  • Bug Fixes
    • Clearer error shown when retry attempts are exhausted; improved timeout and retry behavior for rate limiting.
  • Tests
    • Expanded tests covering retry/rate-limit scenarios and updated fixtures.


coderabbitai bot commented Feb 26, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.


No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between be31217 and a1dfbbc.

📒 Files selected for processing (5)
  • config/system.yaml
  • src/lightspeed_evaluation/core/api/client.py
  • src/lightspeed_evaluation/core/constants.py
  • src/lightspeed_evaluation/core/models/system.py
  • tests/unit/core/api/test_client.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/lightspeed_evaluation/core/constants.py

Walkthrough

Adds configurable retry-on-429 behavior (default 3) to the API client, centralizes client configuration via APIConfig, integrates disk-cache checks/updates into query flow, updates constructor signature to accept APIConfig, and updates tests/fixtures to validate retry detection, retry flows, and exhaustion handling.
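The retry-detection piece mentioned in the walkthrough can be illustrated with a minimal standalone sketch (the exception type and field names here are illustrative; the real `_is_too_many_requests_error` inspects the client library's actual exception types):

```python
from http import HTTPStatus

class HTTPError(Exception):
    """Simplified stand-in for the HTTP error type the client raises."""
    def __init__(self, status_code: int):
        super().__init__(f"HTTP {status_code}")
        self.status_code = status_code

def is_too_many_requests_error(exc: BaseException) -> bool:
    """Return True only for HTTP 429, so only rate limiting triggers a retry."""
    return (
        isinstance(exc, HTTPError)
        and exc.status_code == HTTPStatus.TOO_MANY_REQUESTS
    )
```

A predicate of this shape is what tenacity's `retry_if_exception` accepts, so non-429 errors fail fast instead of being retried.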

Changes

Cohort / File(s) — Summary

  • Configuration — config/system.yaml: Adds num_retries: 3 under the Lightspeed-stack API section to configure retries for 429 responses.
  • Constants — src/lightspeed_evaluation/core/constants.py: Adds DEFAULT_API_NUM_RETRIES = 3.
  • API Config Model — src/lightspeed_evaluation/core/models/system.py: Adds num_retries: int to APIConfig (default DEFAULT_API_NUM_RETRIES, ge=0) with metadata describing retry behavior for 429 responses.
  • Core API Client — src/lightspeed_evaluation/core/api/client.py: Adds _is_too_many_requests_error and _create_retry_decorator (tenacity) and applies retry-wrapped methods for standard and streaming queries; centralizes use of self.config (api_base, version, endpoint_type, timeout, num_retries); integrates disk-cache consult/update; re-raises 429 for retries and maps retry exhaustion to APIError; constructor now accepts APIConfig and reads the API key from the environment.
  • Tests: fixtures & client tests — tests/unit/core/api/conftest.py, tests/unit/core/api/test_client.py: Renames fixtures to basic_api_config_query_endpoint / basic_api_config_streaming_endpoint; updates tests to import and exercise _is_too_many_requests_error; adds tests for retry detection, standard and streaming retry flows, and retry exhaustion; updates tests to use client.config fields.

Sequence Diagram(s)

sequenceDiagram
    participant Client as APIClient
    participant Config as APIConfig
    participant Cache as Disk Cache
    participant Decorator as Retry Decorator
    participant API as HTTP API

    Client->>Config: read api_base, version, timeout, num_retries
    Client->>Cache: init/check (if cache_enabled)
    Client->>Decorator: call query via retry wrapper
    Decorator->>Cache: check cache for request (if applicable)
    alt cached
        Cache-->>Decorator: return cached response
        Decorator-->>Client: return APIResponse
    else not cached
        Decorator->>API: POST request
        alt 429 Too Many Requests
            API-->>Decorator: 429 response
            Decorator->>Decorator: backoff & retry (up to num_retries)
            Decorator->>API: retry POST request
            API-->>Decorator: success (200) or final 429
        else Other HTTP Error
            API-->>Decorator: non-429 error
            Decorator-->>Client: raise APIError
        else Success
            API-->>Decorator: 200 response
        end
        Decorator->>Cache: update cache (if enabled)
        Decorator-->>Client: return APIResponse
    end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 inconclusive)

  • Title check — ❓ Inconclusive. The title references LEADS-233 (the issue/feature number) but is vague about the main change: "retry api call incase of error" does not name the specific error type (429) or say whether this adds retry logic or fixes existing retry behavior. Resolution: revise the title to something more specific, such as "Add 429 retry logic with exponential backoff to API client" or "Implement retry mechanism for rate-limiting 429 errors", and drop the issue-number prefix.
✅ Passed checks (2 passed)
  • Description Check — ✅ Passed. Check skipped because CodeRabbit's high-level summary is enabled.
  • Docstring Coverage — ✅ Passed. Docstring coverage is 97.44%, which meets the required threshold of 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment


Comment @coderabbitai help to get the list of available commands and usage tips.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/lightspeed_evaluation/core/api/client.py`:
- Around line 64-67: The configured retry_attempts is being passed directly to
tenacity.stop_after_attempt which treats the value as total attempts (initial
call + retries), causing an off-by-one; update the retry policy in the
retry(...) call to stop after self.config.retry_attempts + 1 attempts (i.e.,
convert configured retries to total attempts) so the semantics match the
retry_attempts name, and adjust any related tests/assertions accordingly;
reference the retry(...) invocation and the _is_too_many_requests_error
predicate and use self.config.retry_attempts when computing the +1.
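The off-by-one described in this comment can be seen with a small stdlib model of tenacity's stop_after_attempt semantics (tenacity itself is not imported here; run_with_stop_after_attempt is a hypothetical stand-in that mirrors how the real stop condition counts):

```python
def run_with_stop_after_attempt(func, max_attempts: int):
    """Minimal model of tenacity.stop_after_attempt: it bounds the *total*
    number of attempts, including the initial call, not just the retries."""
    last_exc = None
    for _ in range(max_attempts):
        try:
            return func()
        except Exception as exc:  # in the real client, only 429s are retried
            last_exc = exc
    raise last_exc
```

With `retry_attempts = 3`, passing `3` to this stop condition allows only two retries after the initial call, while `retry_attempts + 1` gives the intended initial call plus three retries.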

ℹ️ Review info

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e395459 and d0a74b5.

📒 Files selected for processing (6)
  • config/system.yaml
  • src/lightspeed_evaluation/core/api/client.py
  • src/lightspeed_evaluation/core/constants.py
  • src/lightspeed_evaluation/core/models/system.py
  • tests/unit/core/api/conftest.py
  • tests/unit/core/api/test_client.py

@xmican10 xmican10 force-pushed the LEADS-233-retry-api-call-incase-of-error branch from ca74a80 to 5d322d4 on February 26, 2026 at 10:02

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (1)
tests/unit/core/api/test_client.py (1)

517-608: Neutralize tenacity backoff delays in retry tests for faster, more deterministic execution.

The retry decorator applies wait_exponential(multiplier=1, min=4, max=100) to decorated methods (lines 60–61 in client.py), causing real sleep delays (minimum 4 seconds per retry) when tests trigger 429 error paths. Patch tenacity's wait strategy to wait_none() in the three affected tests to eliminate these delays while preserving retry count assertions:

  • test_standard_query_retries_on_429_then_succeeds
  • test_streaming_query_retries_on_429_then_succeeds
  • test_query_raises_api_error_after_max_retries

Example: mocker.patch("lightspeed_evaluation.core.api.client.wait_exponential", return_value=tenacity.wait_none())
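The same idea can be sketched with the standard library instead of pytest-mock: patch out the sleep so the retry path still runs, but without the real backoff delay (the toy retry loop and exception below are illustrative, not the client's actual code):

```python
import time
from unittest import mock

class RateLimited(Exception):
    """Stand-in for a 429 error."""

def call_with_retries(func, retries: int):
    """Toy retry loop that sleeps between attempts, like the real client."""
    for attempt in range(retries + 1):
        try:
            return func()
        except RateLimited:
            if attempt == retries:
                raise
            time.sleep(4)  # stand-in for wait_exponential(min=4, ...)

def test_retries_without_real_sleeping():
    calls = []
    def flaky():
        calls.append(1)
        if len(calls) < 3:
            raise RateLimited()
        return "ok"
    # Patch the sleep so the test does not actually wait ~8 seconds.
    with mock.patch("time.sleep") as fake_sleep:
        assert call_with_retries(flaky, retries=5) == "ok"
    assert len(calls) == 3           # retry count assertion still holds
    assert fake_sleep.call_count == 2  # one sleep per failed attempt
```

In the suggested fix, `mocker.patch(...)` replacing `wait_exponential` with `tenacity.wait_none()` achieves the same effect for the tenacity-decorated methods.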

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/unit/core/api/test_client.py` around lines 517 - 608, The retry backoff
in APIClient (decorated with wait_exponential) causes real delays in the three
tests; patch the wait strategy to tenacity.wait_none() at the start of each test
to neutralize sleeping: in test_standard_query_retries_on_429_then_succeeds,
test_streaming_query_retries_on_429_then_succeeds, and
test_query_raises_api_error_after_max_retries use
mocker.patch("lightspeed_evaluation.core.api.client.wait_exponential",
return_value=tenacity.wait_none()) so the decorated methods (the retry behavior
on 429) still run but without real backoff delays, keeping retry count
assertions intact.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/lightspeed_evaluation/core/api/client.py`:
- Line 66: The inline trailing comment on the call to stop_after_attempt in the
client retry configuration (the line containing
stop=stop_after_attempt(self.config.retry_attempts + 1)) exceeds
line-length/Black rules; move the explanatory comment to its own preceding line
or shorten it so the call line fits the formatter (for example place a
standalone comment like "# +1 to account for the initial attempt" above the
stop=... expression) and ensure you keep the exact expression
stop=stop_after_attempt(self.config.retry_attempts + 1) unchanged.

---

Nitpick comments:
In `@tests/unit/core/api/test_client.py`:
- Around line 517-608: The retry backoff in APIClient (decorated with
wait_exponential) causes real delays in the three tests; patch the wait strategy
to tenacity.wait_none() at the start of each test to neutralize sleeping: in
test_standard_query_retries_on_429_then_succeeds,
test_streaming_query_retries_on_429_then_succeeds, and
test_query_raises_api_error_after_max_retries use
mocker.patch("lightspeed_evaluation.core.api.client.wait_exponential",
return_value=tenacity.wait_none()) so the decorated methods (the retry behavior
on 429) still run but without real backoff delays, keeping retry count
assertions intact.

ℹ️ Review info

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d0a74b5 and ca74a80.

📒 Files selected for processing (2)
  • src/lightspeed_evaluation/core/api/client.py
  • tests/unit/core/api/test_client.py


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tests/unit/core/api/test_client.py`:
- Around line 517-520: The docstring for the test function
test_standard_query_retries_on_429_then_succeeds contains a typo: it uses "than"
where it should be "then"; update the docstring in
test_standard_query_retries_on_429_then_succeeds (and the other similar
docstring around line 550) to replace "than" with "then" so the sentence reads
"...retries on 429 error and then succeeds on retry."

ℹ️ Review info

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ca74a80 and 5d322d4.

📒 Files selected for processing (2)
  • src/lightspeed_evaluation/core/api/client.py
  • tests/unit/core/api/test_client.py

@xmican10 xmican10 force-pushed the LEADS-233-retry-api-call-incase-of-error branch from e7bf583 to 69b86c3 on February 26, 2026 at 10:19

@VladimirKadlec VladimirKadlec left a comment


LGTM, nice work 💪


@asamal4 asamal4 left a comment


Thank you !!
some minor comments..


@coderabbitai coderabbitai bot left a comment


♻️ Duplicate comments (2)
tests/unit/core/api/test_client.py (1)

547-550: ⚠️ Potential issue | 🟡 Minor

Fix the remaining docstring typo.

Line 550 still uses “than” where “then” is intended.

✏️ Proposed fix
-        """Test streaming query retries on 429 error and than succeeds on retry."""
+        """Test streaming query retries on 429 error and then succeeds on retry."""
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/unit/core/api/test_client.py` around lines 547 - 550, Fix the docstring
typo in the test_streaming_query_retries_on_429_then_succeeds function: change
the word "than" to "then" so the docstring reads "retries on 429 error and then
succeeds on retry." Ensure the updated text is in the function's docstring line.
src/lightspeed_evaluation/core/api/client.py (1)

123-136: ⚠️ Potential issue | 🟠 Major

Cache lookups can return incorrect responses across conversations/endpoints.

Line 123 and Line 134 now enable cache read/write for all queries, but the cache key does not include conversation_id, endpoint_type, or API version. This can serve stale/wrong responses for follow-up turns or different endpoint modes sharing the same cache directory.

💡 Proposed fix
 def _get_cache_key(self, request: APIRequest) -> str:
     """Get cache key for the query."""
     # Note, python hash is initialized randomly so can't be used here
     request_dict = request.model_dump()
-    keys_to_hash = [
-        "query",
-        "provider",
-        "model",
-        "no_tools",
-        "system_prompt",
-        "attachments",
-    ]
-    str_request = ",".join([str(request_dict[k]) for k in keys_to_hash])
+    key_payload = {
+        "query": request_dict.get("query"),
+        "provider": request_dict.get("provider"),
+        "model": request_dict.get("model"),
+        "no_tools": request_dict.get("no_tools"),
+        "system_prompt": request_dict.get("system_prompt"),
+        "attachments": request_dict.get("attachments"),
+        "conversation_id": request_dict.get("conversation_id"),
+        "endpoint_type": self.config.endpoint_type,
+        "api_version": self.config.version,
+    }
+    str_request = json.dumps(key_payload, sort_keys=True, default=str)
 
     return hashlib.sha256(str_request.encode()).hexdigest()
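A standalone sketch of the namespaced key in the diff above (make_cache_key and its parameters are illustrative; in the real client this logic lives on the client class and reads self.config):

```python
import hashlib
import json

def make_cache_key(request: dict, endpoint_type: str, api_version: str) -> str:
    """Build a deterministic cache key namespaced by conversation, endpoint
    mode, and API version, so entries cannot collide across them."""
    key_payload = {
        "query": request.get("query"),
        "provider": request.get("provider"),
        "model": request.get("model"),
        "no_tools": request.get("no_tools"),
        "system_prompt": request.get("system_prompt"),
        "attachments": request.get("attachments"),
        "conversation_id": request.get("conversation_id"),
        "endpoint_type": endpoint_type,
        "api_version": api_version,
    }
    # json.dumps with sort_keys gives a stable serialization; Python's
    # built-in hash() is randomized per process and cannot be used here.
    str_request = json.dumps(key_payload, sort_keys=True, default=str)
    return hashlib.sha256(str_request.encode()).hexdigest()
```

Two requests that differ only in conversation_id, endpoint type, or API version now hash to different keys, which is exactly the collision the comment flags.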
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/lightspeed_evaluation/core/api/client.py` around lines 123 - 136, The
cache lookup/write in methods _get_cached_response and _add_response_to_cache
currently uses a key that can collide across conversations and endpoint modes;
update the cache key generation to incorporate unique request context such as
api_request.conversation_id (or conversation_id if separate),
self.config.endpoint_type, and the API version (e.g., self.config.api_version or
api_request.api_version) so that cached entries are namespaced by conversation
and endpoint/version, and use that enhanced key both when reading (in
_get_cached_response) and writing (in _add_response_to_cache) to prevent
returning stale or cross-conversation responses.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@src/lightspeed_evaluation/core/api/client.py`:
- Around line 123-136: The cache lookup/write in methods _get_cached_response
and _add_response_to_cache currently uses a key that can collide across
conversations and endpoint modes; update the cache key generation to incorporate
unique request context such as api_request.conversation_id (or conversation_id
if separate), self.config.endpoint_type, and the API version (e.g.,
self.config.api_version or api_request.api_version) so that cached entries are
namespaced by conversation and endpoint/version, and use that enhanced key both
when reading (in _get_cached_response) and writing (in _add_response_to_cache)
to prevent returning stale or cross-conversation responses.

In `@tests/unit/core/api/test_client.py`:
- Around line 547-550: Fix the docstring typo in the
test_streaming_query_retries_on_429_then_succeeds function: change the word
"than" to "then" so the docstring reads "retries on 429 error and then succeeds
on retry." Ensure the updated text is in the function's docstring line.

ℹ️ Review info

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5d322d4 and be31217.

📒 Files selected for processing (6)
  • config/system.yaml
  • src/lightspeed_evaluation/core/api/client.py
  • src/lightspeed_evaluation/core/constants.py
  • src/lightspeed_evaluation/core/models/system.py
  • tests/unit/core/api/conftest.py
  • tests/unit/core/api/test_client.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/lightspeed_evaluation/core/models/system.py

@xmican10 xmican10 force-pushed the LEADS-233-retry-api-call-incase-of-error branch from be31217 to a1dfbbc on February 27, 2026 at 11:31
@asamal4 asamal4 merged commit 07629b6 into lightspeed-core:main Feb 28, 2026
15 checks passed

3 participants