fix: track token usage in litellm non-streaming and async calls #4171
Closed
devin-ai-integration[bot] wants to merge 1 commit into
Conversation
This fixes GitHub issue #4170 where token usage metrics were not being updated when using litellm with non-streaming responses and async calls. Changes:
- Add token usage tracking to `_handle_non_streaming_response`
- Add token usage tracking to `_ahandle_non_streaming_response`
- Add token usage tracking to `_ahandle_streaming_response`
- Fix sync streaming to track usage in both code paths
- Convert usage objects to dicts before passing to `_track_token_usage_internal`
- Add comprehensive tests for token usage tracking in all scenarios

Co-Authored-By: João <joao@crewai.com>
Contributor
Author
🤖 Devin AI Engineer: I'll be helping with this pull request! Here's what you should know. ✅ I will automatically:
Note: I can only respond to comments from users who have write access to this repository. ⚙️ Control Options:
Closing due to inactivity for more than 7 days.
Summary
Fixes GitHub issue #4170 where `get_token_usage_summary()` was not returning accurate metrics when using litellm with non-streaming responses and async calls.

The root cause was that `_track_token_usage_internal()` was only being called in the sync streaming code path. This PR adds token tracking to:
- `_handle_non_streaming_response` (sync non-streaming)
- `_ahandle_non_streaming_response` (async non-streaming)
- `_ahandle_streaming_response` (async streaming)

Additionally, litellm returns usage as an object with attributes (e.g., `usage.prompt_tokens`), but `_track_token_usage_internal()` expects a dict. Added conversion logic to handle this.

Review & Testing Checklist for Human
- `hasattr(usage_info, "__dict__")` check: this distinguishes objects from dicts. Verify it works correctly with all litellm response formats.

Recommended Test Plan
Create `LLM(model="gpt-4o-mini", is_litellm=True)` and exercise:
- `stream=False` + `call()`
- `stream=False` + `acall()`
- `stream=True` + `acall()`

Verify that `llm.get_token_usage_summary()` returns non-zero values.

Notes
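The assertion pattern behind the recommended test plan can be sketched as follows. The real tests target crewai's `LLM` class against a live provider; here `FakeLLM` is a hypothetical stand-in that mimics the fixed tracking behavior, so the three scenarios and the non-zero-usage check can run without network access or credentials.

```python
import asyncio

# Hedged sketch of the recommended test plan. FakeLLM is a stand-in, not the
# real crewai LLM; it records usage on every call the way the fix intends.

class FakeLLM:
    def __init__(self, model, is_litellm=False, stream=False):
        self.model = model
        self.stream = stream
        self._usage = {"prompt_tokens": 0, "completion_tokens": 0, "total_tokens": 0}

    def _track(self, usage):
        # Accumulate usage from a single call into the running totals.
        for key, value in usage.items():
            self._usage[key] += value

    def call(self, prompt):
        # Pretend the provider reported usage for this sync call.
        self._track({"prompt_tokens": 3, "completion_tokens": 5, "total_tokens": 8})
        return "ok"

    async def acall(self, prompt):
        # Same tracking on the async path (the gap the PR fixes).
        self._track({"prompt_tokens": 3, "completion_tokens": 5, "total_tokens": 8})
        return "ok"

    def get_token_usage_summary(self):
        return dict(self._usage)

# The three scenarios from the test plan:
sync_llm = FakeLLM("gpt-4o-mini", is_litellm=True, stream=False)
sync_llm.call("hi")                      # stream=False + call()

async_llm = FakeLLM("gpt-4o-mini", is_litellm=True, stream=False)
asyncio.run(async_llm.acall("hi"))       # stream=False + acall()

stream_llm = FakeLLM("gpt-4o-mini", is_litellm=True, stream=True)
asyncio.run(stream_llm.acall("hi"))      # stream=True + acall()

for llm in (sync_llm, async_llm, stream_llm):
    assert llm.get_token_usage_summary()["total_tokens"] > 0
```

A regression test against the real class would follow the same shape, replacing `FakeLLM` with `LLM(model="gpt-4o-mini", is_litellm=True)` and asserting non-zero totals after each call style.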