fix: honor batch rate limit under async indexing (#1542) by shaikn6 · Pull Request #2081 · weaviate/weaviate-python-client

shaikn6 · 2026-06-25T14:44:22Z

What & why

collection.batch.rate_limit(...) was not honored when the Weaviate server runs in async-indexing mode. Batches were sent with 1000 objects regardless of the configured requests-per-minute, defeating the rate limiter exactly as reported in issue #1542.

Root cause

Dynamic batching sizes its batches from the server-reported indexing queue. Under async indexing the server does not return batchStats, so the background loop in _BatchBase.__dynamic_batching takes its "no queue feedback" fallback:

# async indexing - just send a lot
self.__batching_mode = _FixedSizeBatching(1000, 10)
self.__recommended_num_objects = 1000
self.__concurrent_requests = 10

This fallback overwrote the batching mode unconditionally, so a rate limit that the user had configured was silently dropped and replaced with fixed-size 1000/10 batching — the "1000 objects no matter the rate" behavior in the report.
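To make the failure mode concrete, here is a minimal toy model of the pre-fix fallback. It is only a sketch: `ToyBatch`, `on_missing_batch_stats`, and the two dataclasses are hypothetical stand-ins, not the real `_BatchBase`, `_FixedSizeBatching`, or `_RateLimitedBatching` from the client.

```python
from dataclasses import dataclass


@dataclass
class FixedSizeBatching:
    """Hypothetical stand-in for the client's _FixedSizeBatching."""

    batch_size: int
    concurrent_requests: int


@dataclass
class RateLimitedBatching:
    """Hypothetical stand-in for the client's _RateLimitedBatching."""

    requests_per_minute: int


class ToyBatch:
    """Toy model of the pre-fix fallback; not the real _BatchBase."""

    def __init__(self, batching_mode):
        self.batching_mode = batching_mode
        self.recommended_num_objects = 0
        self.concurrent_requests = 0

    def on_missing_batch_stats(self):
        # Pre-fix behavior: the mode is overwritten without checking
        # whether the user had asked for rate limiting.
        self.batching_mode = FixedSizeBatching(1000, 10)
        self.recommended_num_objects = 1000
        self.concurrent_requests = 10


# The user asks for 100 requests per minute...
batch = ToyBatch(RateLimitedBatching(requests_per_minute=100))
# ...but the server reports no batchStats, so the fallback fires.
batch.on_missing_batch_stats()
print(batch.batching_mode)
```

After the fallback runs, nothing in the object's state records that a rate limit was ever requested, which is why the limit could not be recovered later.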

The fix

The fallback decision is extracted into a small, side-effect-free helper, _async_indexing_batch_params, which:

- preserves a configured _RateLimitedBatching mode (reusing the same requests-per-minute math as the constructor), and
- keeps the existing large-batch behavior (_FixedSizeBatching(1000, 10)) for dynamic batching when no rate limit was set.

The originally requested batch mode is tracked separately (__requested_batch_mode) so the fallback can tell whether a rate limit was configured, since __batching_mode is reassigned at runtime. The rate-limited timing path itself in __batch_send is unchanged. No public API signatures change.
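The decision described above could look roughly like the following sketch. Several things here are assumptions rather than the PR's actual code: the dataclasses are hypothetical stand-ins for the client's batching-mode types, the return shape `(mode, objects_per_batch, concurrent_requests)` is invented for illustration, the maximum batch size is assumed to be 1000, and the split formula is inferred from the "3000/min → 4 × 750" example in the Testing section.

```python
from dataclasses import dataclass
from typing import Tuple, Union

MAX_BATCH_SIZE = 1000  # assumed cap on objects per batch


@dataclass(frozen=True)
class DynamicBatching:
    """Hypothetical stand-in for the dynamic batching mode."""


@dataclass(frozen=True)
class FixedSizeBatching:
    """Hypothetical stand-in for _FixedSizeBatching."""

    batch_size: int
    concurrent_requests: int


@dataclass(frozen=True)
class RateLimitedBatching:
    """Hypothetical stand-in for _RateLimitedBatching."""

    requests_per_minute: int


BatchMode = Union[DynamicBatching, FixedSizeBatching, RateLimitedBatching]


def async_indexing_batch_params(
    requested_mode: BatchMode, max_batch_size: int = MAX_BATCH_SIZE
) -> Tuple[BatchMode, int, int]:
    """Return (mode, objects_per_batch, concurrent_requests).

    Pure function: it reads the originally requested mode and mutates nothing.
    """
    if isinstance(requested_mode, RateLimitedBatching):
        rpm = requested_mode.requests_per_minute
        # Split the per-minute budget into equally sized batches that
        # each stay within max_batch_size (assumed formula).
        concurrent = (rpm + max_batch_size) // max_batch_size
        per_batch = rpm // concurrent
        return requested_mode, per_batch, concurrent
    # No rate limit requested: keep the large fixed-size fallback.
    return FixedSizeBatching(1000, 10), 1000, 10


print(async_indexing_batch_params(RateLimitedBatching(100)))   # 100 objects, 1 request
print(async_indexing_batch_params(RateLimitedBatching(3000)))  # 750 objects, 4 requests
print(async_indexing_batch_params(DynamicBatching()))          # 1000 objects, 10 requests
```

Keeping the helper free of side effects is what lets the unit tests call it directly without a live server; the caller remains responsible for assigning the returned values to its own state.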

Testing

Added unit tests in test/collection/test_batch.py that exercise the decision helper directly, so they need no live Weaviate server:

- test_async_indexing_preserves_rate_limit: a configured rate_limit(100) is preserved (not replaced by 1000/10)
- test_async_indexing_rate_limit_spans_multiple_batches: a limit above the max batch size is split across batches (3000/min → 4 × 750)
- test_async_indexing_dynamic_falls_back_to_fixed_size: regression guard; dynamic batching without a rate limit keeps the _FixedSizeBatching(1000, 10) fallback

The "preserve rate limit" tests fail on main (the helper does not exist / the old branch hardcodes 1000/10) and pass with this change.

Ran locally (Python 3.12, fresh venv from requirements-devel.txt + requirements-test.txt):

- pytest test/ → 379 passed, 1 skipped
- ruff format --check → clean
- flake8 (incl. flake8-docstrings + pydoclint plugin) → clean
- pyright → no errors in base.py (the 4 pre-existing connect/v4.py authlib errors are present on unmodified main)
- pydoclint → no findings for the added functions

I could not run the integration/mock suites, which require a live Weaviate instance via Docker; this change is covered by the server-free unit tests above.

CLA

I understand contributions require a signed Contributor License Agreement per CONTRIBUTING.md, and I'll complete the DocuSign step from the link the bot posts on this PR.

When the server runs in async-indexing mode it does not report batchStats, so dynamic batching cannot tune itself against the indexing queue and falls back to large fixed-size batches. That fallback unconditionally overwrote the batching mode with _FixedSizeBatching(1000, 10), silently discarding a rate limit configured via collection.batch.rate_limit(...). As a result, batches contained 1000 objects regardless of the configured requests-per-minute. Extract the fallback decision into _async_indexing_batch_params, which keeps the large-batch behavior for dynamic batching but preserves a configured rate limit. Track the originally requested batch mode so the fallback can tell whether a rate limit was set. Add unit tests covering both paths.

orca-security-eu

Orca Security Scan Summary

Status	Check	Issues by priority
Passed	Infrastructure as Code	0 0 0 0
Passed	SAST	0 0 0 0
Passed	Secrets	0 0 0 0
Passed	Vulnerabilities	0 0 0 0

weaviate-git-bot · 2026-06-25T15:00:44Z

To avoid any confusion in the future about your contribution to Weaviate, we work with a Contributor License Agreement. If you agree, you can simply add a comment to this PR that you agree with the CLA so that we can merge.

beep boop - the Weaviate bot 👋🤖

PS:
Are you already a member of the Weaviate Forum?

shaikn6 · 2026-06-25T16:24:48Z

I have read and agree to the Weaviate Contributor License Agreement.

orca-security-eu Bot reviewed Jun 25, 2026

shaikn6 wants to merge 1 commit into weaviate:main from shaikn6:fix/issue-1542-rate-limit-async-indexing
