Feat/ai rate limiting standard headers by iakuf · Pull Request #13049 · apache/apisix

iakuf · 2026-02-28T10:09:23Z

PR: feat(ai-rate-limiting): add `standard_headers` option for OpenAI/OpenRouter-compatible rate-limit headers

Summary

Add a standard_headers boolean option to the ai-rate-limiting plugin.
When enabled, the plugin emits rate-limit response headers that follow the
OpenAI / OpenRouter convention, allowing IDE extensions (Cursor, Continue, etc.)
to detect quota exhaustion and apply automatic back-off without any custom
client-side configuration.

Issue / Motivation

The current ai-rate-limiting plugin outputs headers in the format:

X-AI-RateLimit-Limit-{instance_name}
X-AI-RateLimit-Remaining-{instance_name}
X-AI-RateLimit-Reset-{instance_name}

This format is APISIX-specific and not recognized by popular AI IDE extensions
such as Cursor and Continue. These tools look for the OpenAI/OpenRouter
standard headers:

X-RateLimit-Limit-Tokens
X-RateLimit-Remaining-Tokens
X-RateLimit-Reset-Tokens

Without these headers, IDE extensions cannot detect that they are being
rate-limited and will keep retrying immediately, causing a poor developer
experience and wasting quota.
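To make the client-side behaviour concrete, here is a minimal sketch of how a client might derive a back-off delay from these headers. It is not taken from Cursor or Continue. The function name `backoff_seconds` and the `default_wait` parameter are hypothetical, and the sketch assumes the reset header carries a plain number of seconds, which this PR description does not specify.

```python
def backoff_seconds(status_code, headers, default_wait=1.0):
    """Return how long to wait before retrying, or 0.0 if no wait is needed.

    Assumption: X-RateLimit-Reset-Tokens holds a number of seconds.
    If the value is missing or not numeric, fall back to default_wait.
    """
    # HTTP header names are case-insensitive, so normalise before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    remaining = lowered.get("x-ratelimit-remaining-tokens")
    reset = lowered.get("x-ratelimit-reset-tokens")

    # Treat either a 429 status or a zero remaining quota as "exhausted".
    exhausted = status_code == 429 or (
        remaining is not None and remaining.strip() == "0"
    )
    if not exhausted:
        return 0.0

    try:
        return max(float(reset), 0.0)
    except (TypeError, ValueError):
        # Header absent (None) or in a format this sketch does not parse.
        return default_wait
```

A caller would sleep for the returned number of seconds before retrying, instead of retrying immediately.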

Changes

`apisix/plugins/ai-rate-limiting.lua`

Added standard_headers field to the JSON Schema (boolean, default false).
In transform_limit_conf(): when standard_headers is true, the
limit_header, remaining_header, and reset_header fields passed to
limit-count are set to the standard names with a suffix derived from
limit_strategy:

`limit_strategy`	Suffix
`total_tokens`	`Tokens`
`prompt_tokens`	`PromptTokens`
`completion_tokens`	`CompletionTokens`

When standard_headers is false (default), the original
X-AI-RateLimit-*-{instance_name} headers are used — fully backward
compatible.
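The naming rule described above can be sketched as follows. This is an illustrative Python rendering, not the plugin's Lua code from transform_limit_conf(); the function name `rate_limit_header_names` and the constant `SUFFIX_BY_STRATEGY` are hypothetical.

```python
# Suffix table as listed in this PR description.
SUFFIX_BY_STRATEGY = {
    "total_tokens": "Tokens",
    "prompt_tokens": "PromptTokens",
    "completion_tokens": "CompletionTokens",
}


def rate_limit_header_names(standard_headers, limit_strategy, instance_name):
    """Return the (limit, remaining, reset) header names for one instance."""
    if standard_headers:
        # Standard form: the suffix depends only on limit_strategy.
        prefix = "X-RateLimit"
        suffix = SUFFIX_BY_STRATEGY[limit_strategy]
    else:
        # Legacy form: the suffix is the instance name.
        prefix = "X-AI-RateLimit"
        suffix = instance_name
    return tuple(
        f"{prefix}-{kind}-{suffix}" for kind in ("Limit", "Remaining", "Reset")
    )
```

For example, `rate_limit_header_names(True, "prompt_tokens", "my-instance")` yields the three `X-RateLimit-*-PromptTokens` names, while passing `False` yields the legacy `X-AI-RateLimit-*-my-instance` names.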

New / updated files

File	Description
`apisix/plugins/ai-rate-limiting.lua`	Core change
`t/plugin/ai-rate-limiting-standard-headers.t`	Test::Nginx test suite
`docs/en/latest/plugins/ai-rate-limiting.md`	Documentation update (see patch file)

Test Cases

The new test file t/plugin/ai-rate-limiting-standard-headers.t covers:

Schema check — standard_headers: true is accepted by check_schema.
Schema default — standard_headers defaults to false.
Standard headers present — a normal request with standard_headers: true
returns all three X-RateLimit-*-Tokens headers with numeric values.
429 Remaining = 0 — when the quota is exhausted the 429 response carries
X-RateLimit-Remaining-Tokens: 0.
prompt_tokens suffix — limit_strategy: prompt_tokens produces
X-RateLimit-*-PromptTokens headers.
completion_tokens suffix — limit_strategy: completion_tokens produces
X-RateLimit-*-CompletionTokens headers.
Backward compatibility — standard_headers: false still produces the
legacy X-AI-RateLimit-*-{instance_name} headers.

Running the tests locally

# Copy sources to Linux filesystem (required for unix socket support)
rm -rf /tmp/apisix-test
cp -r /path/to/apisix /tmp/apisix-test

# Run the new test file
docker run --rm --user root \
  -v /tmp/apisix-test:/usr/local/apisix/apisix-src \
  apache/apisix:3.15.0-debian bash -c '
    apt-get update -qq && apt-get install -y --no-install-recommends cpanminus git make libwww-perl &&
    cpanm --notest Test::Nginx &&
    git clone --depth=1 https://github.com/openresty/test-nginx.git /test-nginx &&
    ln -sf /usr/local/apisix/deps /usr/local/apisix/apisix-src/deps &&
    cd /usr/local/apisix/apisix-src &&
    APISIX_HOME=/usr/local/apisix/apisix-src TEST_NGINX_BINARY=/usr/bin/openresty \
    prove -I/test-nginx/lib -I./ t/plugin/ai-rate-limiting-standard-headers.t
  '

Documentation

See docs/en/latest/plugins/ai-rate-limiting-standard-headers-patch.md for the
parameter reference table, configuration example, and sample response headers.

Checklist

New feature is backward compatible (standard_headers defaults to false)
JSON Schema updated with new field
Test::Nginx tests added (7 test cases)
Documentation written
CHANGELOG entry (to be added before merge)
CI passes

When the openai-compatible provider is used with Anthropic-format endpoints (e.g. DeepSeek's /anthropic/v1/messages), the response reports usage as input_tokens/output_tokens instead of prompt_tokens/completion_tokens. This patch adds fallback support for both field names in both the streaming and non-streaming paths, so token usage statistics work correctly regardless of which format the upstream LLM returns. This fixes token stats being 0 when proxying to Anthropic-compatible endpoints.
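The field-name fallback can be illustrated with a short Python sketch. It is not the ai-proxy Lua code; the function name `normalize_usage` is hypothetical, and computing total_tokens as the sum of the two counts when it is absent is an assumption based on the commit titles in this PR.

```python
def normalize_usage(usage):
    """Map either usage format onto prompt/completion/total token counts."""
    # Prefer the OpenAI field names, fall back to the Anthropic ones.
    prompt = usage.get("prompt_tokens", usage.get("input_tokens")) or 0
    completion = usage.get("completion_tokens", usage.get("output_tokens")) or 0
    # Assumption: when total_tokens is absent, it is the sum of both counts.
    total = usage.get("total_tokens") or (prompt + completion)
    return {
        "prompt_tokens": prompt,
        "completion_tokens": completion,
        "total_tokens": total,
    }
```

With this shape, an Anthropic-format payload such as `{"input_tokens": 12, "output_tokens": 30}` produces non-zero statistics instead of zeros.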

iakuf added 4 commits February 26, 2026 22:10

fix(ai-proxy): compute total_tokens fallback for Anthropic format in streaming path

03608d9

fix(ai-proxy): also compute total_tokens fallback in non-streaming path

d88f360

feat(ai-rate-limiting): add standard_headers option for OpenAI/OpenRouter-compatible rate-limit headers

526eee2

dosubot bot added the size:XL (this PR changes 500-999 lines, ignoring generated files) and enhancement (new feature or request) labels on Feb 28, 2026

iakuf wants to merge 4 commits into apache:master from iakuf:feat/ai-rate-limiting-standard-headers
