Feat/ai rate limiting standard headers #13049

Open
iakuf wants to merge 4 commits into apache:master from iakuf:feat/ai-rate-limiting-standard-headers

iakuf commented Feb 28, 2026

PR: feat(ai-rate-limiting): add standard_headers option for OpenAI/OpenRouter-compatible rate-limit headers

Summary

Add a standard_headers boolean option to the ai-rate-limiting plugin.
When enabled, the plugin emits rate-limit response headers that follow the
OpenAI / OpenRouter convention, allowing IDE extensions (Cursor, Continue, etc.)
to detect quota exhaustion and apply automatic back-off without any custom
client-side configuration.


Issue / Motivation

The current ai-rate-limiting plugin outputs headers in the format:

X-AI-RateLimit-Limit-{instance_name}
X-AI-RateLimit-Remaining-{instance_name}
X-AI-RateLimit-Reset-{instance_name}

This format is APISIX-specific and not recognized by popular AI IDE extensions
such as Cursor and Continue. These tools look for the OpenAI/OpenRouter
standard headers:

X-RateLimit-Limit-Tokens
X-RateLimit-Remaining-Tokens
X-RateLimit-Reset-Tokens

Without these headers, IDE extensions cannot detect that they are being
rate-limited and will keep retrying immediately, causing a poor developer
experience and wasting quota.
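
To illustrate the client-side behaviour these headers make possible, here is a minimal shell sketch of deriving a back-off delay from a response. The function name backoff_seconds is hypothetical, and the sketch assumes the Reset header carries a plain number of seconds; Cursor, Continue, and other clients may parse these headers differently.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of APISIX: reads raw HTTP response headers on
# stdin and prints how many seconds to wait before retrying (0 = retry now).
# Assumption: X-RateLimit-Reset-Tokens carries a plain number of seconds.
backoff_seconds() {
  awk -F': *' '
    { sub(/\r$/, "") }   # strip the CR from CRLF line endings
    tolower($1) == "x-ratelimit-remaining-tokens" { remaining = $2; seen = 1 }
    tolower($1) == "x-ratelimit-reset-tokens"     { reset = $2 }
    END {
      # Back off only when the quota is reported as exhausted.
      if (seen && remaining + 0 <= 0 && reset != "") print reset + 0
      else print 0
    }
  '
}
```

A client could feed it the header dump of any response, for example the output of curl -sS -D - -o /dev/null against a route with the plugin enabled, and sleep for the printed number of seconds before retrying.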


Changes

apisix/plugins/ai-rate-limiting.lua

  • Added standard_headers field to the JSON Schema (boolean, default false).

  • In transform_limit_conf(): when standard_headers is true, the
    limit_header, remaining_header, and reset_header fields passed to
    limit-count are set to the standard names with a suffix derived from
    limit_strategy:

    limit_strategy       Suffix
    total_tokens         Tokens
    prompt_tokens        PromptTokens
    completion_tokens    CompletionTokens
  • When standard_headers is false (default), the original
    X-AI-RateLimit-*-{instance_name} headers are used, so the change is
    fully backward compatible.
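
The naming rule described in this list can be sketched in shell as follows. This is only an illustration of the mapping, with hypothetical function names; the actual logic is written in Lua inside transform_limit_conf() and may be structured differently.

```shell
#!/usr/bin/env bash
# Hypothetical illustration of the header naming rule; not the plugin's code.
suffix_for_strategy() {
  case "$1" in
    total_tokens)      echo "Tokens" ;;
    prompt_tokens)     echo "PromptTokens" ;;
    completion_tokens) echo "CompletionTokens" ;;
    *) echo "unknown limit_strategy: $1" >&2; return 1 ;;
  esac
}

# Usage: header_names <standard_headers: true|false> <limit_strategy> <instance_name>
# Prints the limit, remaining, and reset header names, one per line.
header_names() {
  local standard="$1" strategy="$2" instance="$3" prefix suffix kind
  if [ "$standard" = "true" ]; then
    prefix="X-RateLimit"
    suffix="$(suffix_for_strategy "$strategy")" || return 1
  else
    # Legacy format: the suffix is the instance name, not the strategy.
    prefix="X-AI-RateLimit"
    suffix="$instance"
  fi
  for kind in Limit Remaining Reset; do
    printf '%s-%s-%s\n' "$prefix" "$kind" "$suffix"
  done
}
```

Note that in the standard format the instance name no longer appears in the header name, which is what lets clients look for a fixed set of names.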

New / updated files

File                                           Description
apisix/plugins/ai-rate-limiting.lua            Core change
t/plugin/ai-rate-limiting-standard-headers.t   Test::Nginx test suite
docs/en/latest/plugins/ai-rate-limiting.md     Documentation update (see patch file)

Test Cases

The new test file t/plugin/ai-rate-limiting-standard-headers.t covers:

  1. Schema check: check_schema accepts standard_headers: true.
  2. Schema default: standard_headers defaults to false.
  3. Standard headers present: a normal request with standard_headers: true
    returns all three X-RateLimit-*-Tokens headers with numeric values.
  4. 429 with Remaining = 0: when the quota is exhausted, the 429 response
    carries X-RateLimit-Remaining-Tokens: 0.
  5. prompt_tokens suffix: setting limit_strategy to prompt_tokens produces
    X-RateLimit-*-PromptTokens headers.
  6. completion_tokens suffix: setting limit_strategy to completion_tokens
    produces X-RateLimit-*-CompletionTokens headers.
  7. Backward compatibility: standard_headers: false still produces the
    legacy X-AI-RateLimit-*-{instance_name} headers.
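
For a quick manual check outside the test suite, the "standard headers present" case can be approximated with a small shell helper. This is a sketch only: has_standard_headers is a hypothetical name, and it assumes all three header values are non-negative integers.

```shell
#!/usr/bin/env bash
# Hypothetical manual check, not part of the Test::Nginx suite: reads raw
# response headers on stdin and succeeds only if all three
# X-RateLimit-*-<suffix> headers are present with integer values.
# Usage: ... | has_standard_headers [suffix]   (suffix defaults to Tokens)
has_standard_headers() {
  local suffix="${1:-Tokens}" headers kind
  headers="$(tr -d '\r')"   # normalise CRLF line endings
  for kind in Limit Remaining Reset; do
    printf '%s\n' "$headers" |
      grep -Eiq "^X-RateLimit-${kind}-${suffix}: *[0-9]+\$" || return 1
  done
}
```

Piping the header dump of a real response into has_standard_headers PromptTokens or has_standard_headers CompletionTokens covers the suffix variants in the same way.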

Running the tests locally

# Copy sources to Linux filesystem (required for unix socket support)
rm -rf /tmp/apisix-test
cp -r /path/to/apisix /tmp/apisix-test

# Run the new test file
docker run --rm --user root \
  -v /tmp/apisix-test:/usr/local/apisix/apisix-src \
  apache/apisix:3.15.0-debian bash -c '
    apt-get update -qq && apt-get install -y --no-install-recommends cpanminus git make libwww-perl &&
    cpanm --notest Test::Nginx &&
    git clone --depth=1 https://github.com/openresty/test-nginx.git /test-nginx &&
    ln -sf /usr/local/apisix/deps /usr/local/apisix/apisix-src/deps &&
    cd /usr/local/apisix/apisix-src &&
    APISIX_HOME=/usr/local/apisix/apisix-src TEST_NGINX_BINARY=/usr/bin/openresty \
    prove -I/test-nginx/lib -I./ t/plugin/ai-rate-limiting-standard-headers.t
  '

Documentation

See docs/en/latest/plugins/ai-rate-limiting-standard-headers-patch.md for the
parameter reference table, configuration example, and sample response headers.


Checklist

  • New feature is backward compatible (standard_headers defaults to false)
  • JSON Schema updated with new field
  • Test::Nginx tests added (7 test cases)
  • Documentation written
  • CHANGELOG entry (to be added before merge)
  • CI passes

Related

iakuf added 4 commits February 26, 2026 22:10
When using the openai-compatible provider with Anthropic-format endpoints
(e.g. DeepSeek's /anthropic/v1/messages), the response returns
input_tokens/output_tokens instead of prompt_tokens/completion_tokens.

This patch adds fallback support for both field names in both
streaming and non-streaming paths, so token usage statistics work
correctly regardless of which format the upstream LLM returns.

Fixes token stats being 0 when proxying to Anthropic-compatible endpoints.
dosubot bot added the size:XL (this PR changes 500-999 lines, ignoring generated files) and enhancement (new feature or request) labels Feb 28, 2026