Summary
#71 fixed cached-token double-counting in displayed input totals for the OpenAI passthrough (OpenAI-format APIs report input_tokens inclusive of cached tokens; the proxy uses the Anthropic cache-exclusive convention). The fix is correct for new records and for the live get_usage_stats path, but it does not retroactively correct the already-stored per-key totals shown in the admin portal.
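To make the two conventions concrete, here is a minimal sketch of the normalization, not the actual code from #71. The function name `cache_exclusive_input` and its parameters are hypothetical; the `includes_cached` flag only mirrors the idea behind `metadata.input_tokens_include_cached_tokens`.

```python
def cache_exclusive_input(input_tokens: int, cached_tokens: int, includes_cached: bool) -> int:
    """Return input tokens under the cache-exclusive convention.

    Hypothetical helper: if the upstream API reported input_tokens
    inclusive of cached tokens, subtract the cached part once.
    """
    if includes_cached:
        # Clamp at zero in case the upstream counts are inconsistent.
        return max(input_tokens - cached_tokens, 0)
    # Already cache-exclusive: nothing to subtract.
    return input_tokens


# A cache-inclusive report of 1000 input tokens with 600 cached
# corresponds to 400 cache-exclusive input tokens.
print(cache_exclusive_input(1000, 600, includes_cached=True))   # 400
print(cache_exclusive_input(400, 600, includes_cached=False))   # 400
```

Skipping the subtraction for a cache-inclusive record counts the cached tokens twice: once inside input and once in the cached column.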
Why
The admin portal per-key totals come from usage_stats_manager.get_stats() — a stored running total in the anthropic-proxy-usage-stats table, not a live computation.
UsageStatsManager.aggregate_all_usage() maintains that total incrementally: it reads only usage records newer than last_aggregated_timestamp and adds the deltas via increment_stats (app/db/dynamodb.py:1796-1806). Only the first run for a key (no stored row) does a full from-scratch recompute via update_stats (:1815-1825).
Consequently, the historical inflated total_input_tokens already accumulated into a key's stored row is never recomputed — it stays baked in. The displayed input remains too high until the row is reset and rebuilt.
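The behaviour described above can be reproduced with a toy in-memory model. This is a sketch under stated assumptions, not the DynamoDB code: `Record`, `StatsRow`, and `aggregate` are hypothetical stand-ins for the usage records, the stored stats row, and `aggregate_all_usage`.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Record:
    timestamp: int
    input_tokens: int  # cache-exclusive value, as read after the fix


@dataclass
class StatsRow:
    total_input_tokens: int
    last_aggregated_timestamp: int


def aggregate(records: List[Record], row: Optional[StatsRow]) -> StatsRow:
    """Toy aggregation: full recompute only when no stored row exists."""
    if row is None:
        # First run for the key: rebuild the total from every record.
        return StatsRow(
            total_input_tokens=sum(r.input_tokens for r in records),
            last_aggregated_timestamp=max((r.timestamp for r in records), default=0),
        )
    # Later runs: add only records newer than the stored watermark.
    new = [r for r in records if r.timestamp > row.last_aggregated_timestamp]
    return StatsRow(
        total_input_tokens=row.total_input_tokens + sum(r.input_tokens for r in new),
        last_aggregated_timestamp=max([row.last_aggregated_timestamp] + [r.timestamp for r in new]),
    )


# The record at t=1 is really 400 input tokens, but the stored row was
# built before the fix and accumulated it as 1000 (400 + 600 cached).
records = [Record(timestamp=1, input_tokens=400), Record(timestamp=2, input_tokens=50)]
stale_row = StatsRow(total_input_tokens=1000, last_aggregated_timestamp=1)

incremental = aggregate(records, stale_row)  # inflation carried forward
rebuilt = aggregate(records, None)           # row deleted, full recompute

print(incremental.total_input_tokens)  # 1050
print(rebuilt.total_input_tokens)      # 450
```

In this model the incremental path can never shed the extra 600 tokens, because it never revisits records at or before the watermark. Only removing the row forces the full recompute.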
Observed
A user comparing Codex vs the admin portal for a key in us-west-2:
| | input | output | cached |
|---|---|---|---|
| Codex | 4.06M | 34.3K | — |
| Admin portal | 7.996M | 36.8K | 3.763M |
7.996M − 3.763M (cached) ≈ 4.23M, in the same range as the 4.06M Codex reports: the portal's input is inflated by roughly one cached amount.
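A quick check of the arithmetic, using only the figures from the table (rounded as reported, so the result is approximate). The variable names are illustrative.

```python
# Figures from the comparison table, in tokens.
portal_input = 7_996_000
portal_cached = 3_763_000
codex_input = 4_060_000

# Remove one cached amount from the portal's input total.
corrected = portal_input - portal_cached
# Gap that remains relative to what Codex reports.
residual = corrected - codex_input

print(corrected)                          # 4233000
print(residual)                           # 173000
print(f"{residual / codex_input:.1%}")    # 4.3%
```

Subtracting the cached amount brings the portal figure to within about 4% of the Codex figure. The remaining gap is not explained by the numbers given here.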
Remediation
scripts/reaggregate_key.py deletes the usage-stats row, resets budget_used, and runs aggregate_all_usage.
Open questions / follow-ups
Which keys are affected? Any key with OpenAI-passthrough usage recorded before #71 was deployed (records tagged metadata.input_tokens_include_cached_tokens=True). Consider a scoped --all re-aggregation for the affected deployment rather than per-key runs.
budget_used_mtd on recompute: the first-run branch sets MTD from the full all-time cost, which can overstate month-to-date for keys with history. Cost itself was already correct (the cost calculation subtracts cached tokens), so this only affects the MTD bucket after a manual reset. Decide whether the reaggregate tool should preserve or skip MTD.
Prevention: consider whether aggregate_all_usage should support a periodic full recompute (or a schema/version marker) so that future aggregation-logic fixes self-heal instead of requiring a manual backfill.
Refs: #71 (fix), #72 (reaggregate tooling).