
[token-consumption] Daily Token Consumption Report - 2026-06-04 #36904

@github-actions


Executive Summary

Over the last 24 hours, agentic workflows in github/gh-aw consumed ~120.6M tokens across 5,588 model-call spans, with usage overwhelmingly input-dominated: ~117.8M input vs ~2.84M output tokens (a ~41:1 input:output ratio). Prompt/context size — not generation length — is the dominant cost driver.

Reliability is clean: companion checks on the errors and logs datasets both returned zero error events / error-level logs in the window.

The headline data-quality issue is structural, not a one-off: every token-bearing span carries null for gh-aw.workflow.name, transaction, and gen_ai.request.model. Token usage is emitted only on span.op:http.client spans (the instrumented model-API calls), while the workflow name lives on a separate gen_ai setup span in the same trace. Per-workflow attribution below was reconstructed by joining token spans to setup spans on trace — it is evidence-based but, by necessity, covers the heaviest traces rather than a complete per-workflow rollup.

Key Metrics

| Metric | Value |
| --- | --- |
| Events analyzed (model-call spans w/ token data) | 5,588 |
| Events with token data | 5,588 (100% of scope) |
| Total input tokens | 117,805,127 |
| Total output tokens | 2,836,882 |
| Total tokens | 120,642,009 |
| Unique workflows emitting telemetry | ≥ 50 (inventory query capped at 50) |
| Avg tokens/event | 21,615 |
| P95 tokens/event | 60,859 |
| Max tokens/event | 157,085 |
| Events missing workflow identifier | 5,588 (100%; see Data Quality) |
| Error events / error logs (24h) | 0 / 0 |
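
The headline ratios can be re-derived from the totals in the table above. The following is a minimal sketch that only does arithmetic on the reported figures; the variable names are illustrative and not part of any gh-aw tooling.

```python
# Figures copied from the Key Metrics table (24h snapshot).
input_tokens = 117_805_127
output_tokens = 2_836_882
spans = 5_588

total_tokens = input_tokens + output_tokens
io_ratio = input_tokens / output_tokens      # input tokens per output token
input_share = input_tokens / total_tokens    # fraction of all tokens that are input
mean_per_span = total_tokens / spans         # mean tokens per model-call span

print(f"total tokens: {total_tokens:,}")
print(f"input:output: {io_ratio:.1f}:1")
print(f"input share:  {input_share:.1%}")
print(f"mean/span:    {mean_per_span:,.0f}")
```

The mean computed this way lands within about 0.1% of the reported 21,615. One possible explanation for the small gap is the live-ingestion drift noted under Data Quality, since the count and the sums may come from slightly different snapshots.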

Top Token Consumers (workflow rollup of the 12 heaviest traces)

Reconstructed via trace-join. Daily Ambient Context Optimizer recurs 3× in the heaviest traces, making it the clear top aggregate consumer; Smoke Codex recurs 2×. Counts are a lower bound (top-trace sample only), not full per-workflow totals.

| Workflow | Runs (traces) | Input Tokens | Output Tokens | Total Tokens |
| --- | --- | --- | --- | --- |
| Daily Ambient Context Optimizer | 3 | 7,378,432 | 69,272 | 7,447,704 |
| Smoke Codex | 2 | 5,063,812 | 29,600 | 5,093,412 |
| Daily CLI Tools Exploratory Tester | 1 | 2,546,859 | 17,135 | 2,563,994 |
| Ubuntu Actions Image Analyzer | 1 | 2,521,578 | 19,479 | 2,541,057 |
| Code Simplifier | 1 | 2,510,812 | 21,646 | 2,532,458 |
| Workflow Skill Extractor | 1 | 2,515,889 | 13,320 | 2,529,209 |
| Daily Firewall Logs Collector and Reporter | 1 | 2,473,342 | 37,006 | 2,510,348 |
| UK AI Operational Resilience | 1 | 2,400,766 | 48,018 | 2,448,784 |
| Package Specification Extractor | 1 | 1,929,515 | 10,195 | 1,939,710 |

Per-trace evidence (12 heaviest traces, fully confirmed)

Each row is one workflow run (trace). Token sums come from the spans dataset (has:gen_ai.usage.total_tokens), and the workflow name is resolved from that trace's setup span. The 12 traces total ~29.6M tokens (~24.5% of the 24h total); the remainder is spread across roughly 100 or more lighter traces that were not individually attributed.

| Trace | Workflow | Calls | Input | Output | Total |
| --- | --- | --- | --- | --- | --- |
| 9dde8a13... | Smoke Codex | 46 | 2,860,649 | 18,025 | 2,878,674 |
| a97c3ad4... | Daily CLI Tools Exploratory Tester | 55 | 2,546,859 | 17,135 | 2,563,994 |
| 2172db96... | Ubuntu Actions Image Analyzer | 64 | 2,521,578 | 19,479 | 2,541,057 |
| 78992741... | Code Simplifier | 62 | 2,510,812 | 21,646 | 2,532,458 |
| e719065f... | Workflow Skill Extractor | 49 | 2,515,889 | 13,320 | 2,529,209 |
| 365b9906... | Daily Ambient Context Optimizer | 51 | 2,504,399 | 22,853 | 2,527,252 |
| 4927d2ab... | Daily Firewall Logs Collector and Reporter | 51 | 2,473,342 | 37,006 | 2,510,348 |
| 90dcd657... | Daily Ambient Context Optimizer | 60 | 2,444,318 | 20,148 | 2,464,466 |
| 42fbd029... | Daily Ambient Context Optimizer | 57 | 2,429,715 | 26,271 | 2,455,986 |
| c7e9cfec... | UK AI Operational Resilience | 48 | 2,400,766 | 48,018 | 2,448,784 |
| 0a1cbc61... | Smoke Codex | 36 | 2,203,163 | 11,575 | 2,214,738 |
| 4201497c... | Package Specification Extractor | 36 | 1,929,515 | 10,195 | 1,939,710 |

Verification: trace 9dde8a133d9e4fd64d365fac19169092 was confirmed end-to-end — it contains 46 token-bearing http.client spans plus a gen_ai setup span carrying gh-aw.workflow.name = "Smoke Codex", validating the trace-join continuity.
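
The trace-join used for attribution can be sketched roughly as follows. This is an illustration under stated assumptions, not the query the report actually ran: the record shape (plain dicts with a `trace` key), the `gen_ai.usage.input_tokens` / `gen_ai.usage.output_tokens` field names, and the `(unattributed)` fallback label are all assumed here. The demo feeds in the two Smoke Codex per-trace sums from the table as if each were a single span, plus one clearly hypothetical trace with no setup span.

```python
from collections import defaultdict

def rollup_by_workflow(token_spans, setup_spans):
    """Join token spans to setup spans on trace id, then sum tokens per workflow."""
    # trace id -> workflow name, taken from each trace's setup span
    workflow_by_trace = {
        s["trace"]: s["gh-aw.workflow.name"]
        for s in setup_spans
        if s.get("gh-aw.workflow.name")
    }
    totals = defaultdict(lambda: {"traces": set(), "input": 0, "output": 0})
    for span in token_spans:
        # Token spans whose trace has no setup span fall into a fallback bucket.
        name = workflow_by_trace.get(span["trace"], "(unattributed)")
        row = totals[name]
        row["traces"].add(span["trace"])
        row["input"] += span["gen_ai.usage.input_tokens"]
        row["output"] += span["gen_ai.usage.output_tokens"]
    return {
        name: {
            "runs": len(r["traces"]),
            "input": r["input"],
            "output": r["output"],
            "total": r["input"] + r["output"],
        }
        for name, r in totals.items()
    }

# Demo input: per-trace sums from the report, treated as one span each.
token_spans = [
    {"trace": "9dde8a13", "gen_ai.usage.input_tokens": 2_860_649, "gen_ai.usage.output_tokens": 18_025},
    {"trace": "0a1cbc61", "gen_ai.usage.input_tokens": 2_203_163, "gen_ai.usage.output_tokens": 11_575},
    # Hypothetical trace with made-up numbers, present only to show the fallback.
    {"trace": "hypothetical-trace", "gen_ai.usage.input_tokens": 1_000, "gen_ai.usage.output_tokens": 10},
]
setup_spans = [
    {"trace": "9dde8a13", "gh-aw.workflow.name": "Smoke Codex"},
    {"trace": "0a1cbc61", "gh-aw.workflow.name": "Smoke Codex"},
]

rollup = rollup_by_workflow(token_spans, setup_spans)
print(rollup["Smoke Codex"])
```

Keeping an explicit fallback bucket makes the attribution gap visible in the output instead of silently dropping spans whose trace could not be resolved.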

Workflow activity inventory (by run count, from setup spans)

Highest-frequency workflows by unique traces in 24h (these carry gh-aw.workflow.name but not token data — they cannot be summed for tokens directly):

  • Smoke CI — 109 runs
  • Auto-Triage Issues — 23
  • AI Moderator — 21
  • PR Sous Chef — 18
  • Test Quality Sentinel — 15 · PR Code Quality Reviewer — 15 · Issue Monster — 15 · Matt Pocock Skills Reviewer — 15 · Design Decision Gate 🏗️ — 15
  • Smoke Copilot — 9 · Daily Ambient Context Optimizer — 9
  • (≥50 distinct workflows total; list capped at 50)

Note the divergence: high frequency (e.g. Smoke CI, 109 runs) does not imply high token consumption — the heaviest traces belong to lower-frequency, high-context workflows. Smoke CI's aggregate could not be confirmed because its per-run token spans are not among the heavy traces.

Data Quality and Gaps
  • Workflow identifier missing on 100% of token spans. gen_ai.usage.* is emitted on span.op:http.client spans where gh-aw.workflow.name, transaction, and gen_ai.request.model are all null. Workflow attribution required a manual trace-join (group token spans by trace, then resolve the workflow from each trace's gen_ai setup span).
  • search_events / Seer unavailable in this MCP build (no embedded LLM provider). All queries used list_events with direct Sentry query syntax against dataset:spans.
  • Model + run_id not attributable on token spans: gen_ai.request.model is null and github.run_id returned null when grouping token/setup spans, so cost-by-model and run-level rollups are not possible from these spans today.
  • Live-ingestion drift: the token-span count read as 5,581 then 5,588 across consecutive queries; headline uses the latest (5,588). Sums (~120.6M) are from the snapshot and may move slightly.
  • Attribution coverage: only the 12 heaviest traces (~24.5% of total tokens) were individually attributed; full per-workflow totals would require resolving all of the remaining traces (roughly 100 or more).
  • Token precedence: all usable records carried gen_ai.usage.*; no ai.*/usage.*/prompt_tokens aliases were observed, so no double-count risk applied.
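
The token-precedence rule in the last bullet can be illustrated with a small resolver that reads exactly one attribute family per record, so a record carrying several aliases is never counted twice. This is a sketch only: apart from the `gen_ai.usage.*` prefix, every alias key below is hypothetical (the report observed no aliases in this window), and the exact `gen_ai.usage.input_tokens` / `gen_ai.usage.output_tokens` names are assumed.

```python
# Attribute families in priority order. Only the first is confirmed by the
# report (as the gen_ai.usage.* prefix); the rest are hypothetical aliases.
USAGE_FAMILIES = [
    ("gen_ai.usage.input_tokens", "gen_ai.usage.output_tokens"),
    ("ai.usage.prompt_tokens", "ai.usage.completion_tokens"),  # hypothetical
    ("usage.input_tokens", "usage.output_tokens"),             # hypothetical
    ("prompt_tokens", "completion_tokens"),                    # hypothetical
]

def resolve_usage(attrs):
    """Return (input, output) from the highest-priority family fully present, else None."""
    for input_key, output_key in USAGE_FAMILIES:
        if attrs.get(input_key) is not None and attrs.get(output_key) is not None:
            return attrs[input_key], attrs[output_key]
    return None

# A record carrying two families is counted once, using gen_ai.usage.*.
both = {
    "gen_ai.usage.input_tokens": 1200,
    "gen_ai.usage.output_tokens": 30,
    "prompt_tokens": 1200,
    "completion_tokens": 30,
}
print(resolve_usage(both))
print(resolve_usage({"prompt_tokens": 5, "completion_tokens": 1}))
print(resolve_usage({"span.op": "http.client"}))
```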

Recommendations

  1. Propagate workflow context onto the model-call spans (highest leverage). In actions/setup/js/send_otlp_span.cjs, add gh-aw.workflow.name (and ideally gen_ai.request.model + github.run_id) as attributes — or resource attributes — on the http.client spans that carry gen_ai.usage.*. This removes the need for trace-joins and unlocks one-query tokens-by-workflow and tokens-by-model reporting.
  2. Investigate Daily Ambient Context Optimizer — the top aggregate consumer (≥7.4M tokens across 3 runs, ~2.48M/run). Audit how much repo/context it loads per run; trim or cache the ambient context to cut input tokens.
  3. Target the input side, not output. With a ~41:1 input:output ratio, savings come from prompt/context reduction: prune system prompts, scope file/context reads, and enable prompt caching for repeated context across the high-context workflows (Code Simplifier, Ubuntu Actions Image Analyzer, Workflow Skill Extractor, UK AI Operational Resilience).
  4. Right-size smoke tests. The two Smoke Codex runs in the heavy-trace sample averaged ~2.55M tokens each (~2.88M and ~2.21M), and the larger one tops the heavy-trace list. Confirm smoke workflows use minimal fixtures/prompts rather than full-context payloads.
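
As an illustration of recommendation 1, the sketch below adds workflow context to a span expressed in the OTLP/JSON attribute shape (a list of `{"key", "value"}` pairs). It is written in Python purely to show the idea; the real change would live in the JavaScript file named above, whose internal structure is not shown in this report. The function name, the sample model name, and the sample run id are hypothetical.

```python
def with_workflow_context(span, workflow_name, model=None, run_id=None):
    """Return a copy of an OTLP/JSON-style span dict with workflow context attributes added."""
    extra = {
        "gh-aw.workflow.name": workflow_name,
        "gen_ai.request.model": model,
        "github.run_id": run_id,
    }
    existing = {attr["key"] for attr in span.get("attributes", [])}
    added = [
        {"key": key, "value": {"stringValue": str(value)}}
        for key, value in extra.items()
        # Skip unknown values and never overwrite an attribute already set.
        if value is not None and key not in existing
    ]
    # Build a new dict and a new list so the caller's span is left untouched.
    return {**span, "attributes": [*span.get("attributes", []), *added]}

# Hypothetical token-bearing span, before and after enrichment.
token_span = {
    "name": "http.client",
    "attributes": [{"key": "gen_ai.usage.input_tokens", "value": {"intValue": "1200"}}],
}
enriched = with_workflow_context(
    token_span,
    workflow_name="Smoke Codex",
    model="example-model",  # hypothetical model name
    run_id="1234567890",    # hypothetical run id
)
print([attr["key"] for attr in enriched["attributes"]])
```

With these attributes on the token-bearing spans themselves, a single grouped query on the workflow name would be enough for a per-workflow rollup, and no join through the setup span would be needed.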

References

Generated by 📊 Daily Token Consumption Report (Sentry OTel) · opus48 12.8M ·

  • expires on Jun 5, 2026, 12:54 PM UTC
