startvibecoding/llmspeedtest

LLMSpeed - LLM Token Speed Testing Tool

Test the speed of models served through OpenAI Chat or Anthropic format APIs, measuring first token latency (ms) and tokens per second.

Features

  • Support OpenAI Chat and Anthropic streaming APIs
  • Measure first token latency (ms) and tokens per second
  • Detect invalid models early via content-type check
  • Support reasoning_content field for reasoning models
  • Export results to CSV
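
The measurements listed above can be sketched without any network calls. This is a minimal illustration, not the code in llmspeed.py: the helper names (`looks_like_stream`, `delta_text`, `measure_stream`) are hypothetical, counting one token per non-empty chunk is an approximation, and dividing by the time elapsed after the first token is an assumption about how tokens per second is defined. The content-type helper relies on the fact that both APIs stream as server-sent events (`text/event-stream`); whether the tool checks exactly this value is also an assumption.

```python
import time
from typing import Callable, Iterable, Mapping, Optional, Tuple


def looks_like_stream(content_type: Optional[str]) -> bool:
    """True if a Content-Type header value indicates a server-sent event stream."""
    return bool(content_type) and content_type.lower().startswith("text/event-stream")


def delta_text(delta: Mapping[str, object]) -> str:
    """Text carried by one streamed delta: content, else reasoning_content.

    Assumption: the delta is a plain mapping with optional "content" and
    "reasoning_content" keys, as some reasoning models emit.
    """
    text = delta.get("content") or delta.get("reasoning_content") or ""
    return str(text)


def measure_stream(
    chunks: Iterable[str],
    clock: Callable[[], float] = time.perf_counter,
) -> Tuple[float, float]:
    """Return (first_token_latency_ms, tokens_per_second) for a text stream.

    Returns (-1.0, -1.0) if no non-empty chunk arrives, mirroring the
    "-1 if failed" convention of the results file.
    """
    start = clock()
    first_at = None
    count = 0
    for piece in chunks:
        if not piece:
            continue  # skip empty deltas such as role-only chunks
        if first_at is None:
            first_at = clock()  # timestamp of the first visible token
        count += 1  # approximation: one chunk counted as one token
    end = clock()
    if first_at is None:
        return -1.0, -1.0
    latency_ms = (first_at - start) * 1000.0
    elapsed = end - first_at  # generation time after the first token
    tps = count / elapsed if elapsed > 0 else -1.0
    return latency_ms, tps
```

In a real run, `chunks` would be the text extracted from each streamed event, with `start` taken just before the request is sent. The injectable `clock` parameter exists only to make the timing logic testable.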

Config File Format (config.yaml)

test_prompt: "Please introduce yourself"

models:
  - base_url: "https://api.openai.com/v1"
    type: "openai-chat"
    models:
      - "gpt-4o"
      - "gpt-4o-mini"
    api_key: "sk-xxx"

  - base_url: "https://api.anthropic.com"
    type: "anthropic"
    models:
      - "claude-3-5-sonnet-20241022"
    api_key: "sk-ant-xxx"
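
To show how this layout maps to individual test runs, here is a rough sketch that flattens the parsed config into one job per model. The function name `expand_jobs` and the job dictionary shape are hypothetical and not taken from llmspeed.py. The input is the dictionary that `yaml.safe_load` would return for the file above.

```python
from typing import Any, Dict, List

# The two API types named in this README.
SUPPORTED_TYPES = ("openai-chat", "anthropic")


def expand_jobs(config: Dict[str, Any]) -> List[Dict[str, str]]:
    """Flatten the parsed config into one job dict per (endpoint, model) pair."""
    prompt = config["test_prompt"]
    jobs: List[Dict[str, str]] = []
    for provider in config["models"]:
        api_type = provider["type"]
        if api_type not in SUPPORTED_TYPES:
            raise ValueError(f"unsupported type: {api_type!r}")
        # Each provider entry lists several model names sharing one endpoint and key.
        for name in provider["models"]:
            jobs.append(
                {
                    "base_url": provider["base_url"],
                    "type": api_type,
                    "model": name,
                    "api_key": provider["api_key"],
                    "prompt": prompt,
                }
            )
    return jobs


# Same structure as the example config.yaml, written as a Python dict.
example_config = {
    "test_prompt": "Please introduce yourself",
    "models": [
        {
            "base_url": "https://api.openai.com/v1",
            "type": "openai-chat",
            "models": ["gpt-4o", "gpt-4o-mini"],
            "api_key": "sk-xxx",
        },
        {
            "base_url": "https://api.anthropic.com",
            "type": "anthropic",
            "models": ["claude-3-5-sonnet-20241022"],
            "api_key": "sk-ant-xxx",
        },
    ],
}

for job in expand_jobs(example_config):
    print(job["type"], job["model"])
```

Rejecting unknown `type` values up front is a design choice of this sketch; the tool itself may handle them differently.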

Run

pip install openai anthropic pyyaml
python llmspeed.py

Use a custom config file:

python llmspeed.py my_config.yaml

Output

Results are saved to results.csv with the following columns:

Column                Description
base_url              API endpoint
type                  API type (openai-chat or anthropic)
model                 Model name
first_token_latency   First token latency in ms (-1 if failed)
tokens_per_second     Tokens per second (-1 if failed)
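
Because failed runs are recorded as -1, they should be filtered out before comparing models. The sketch below is one way to do that with the standard `csv` module. It assumes results.csv starts with a header row using the column names above; the function name `successful_rows` is hypothetical.

```python
import csv
from typing import Dict, Iterable, List, Union

Row = Dict[str, Union[str, float]]


def successful_rows(lines: Iterable[str]) -> List[Row]:
    """Parse results, drop failed runs (-1), and sort fastest first.

    `lines` may be an open file object or any iterable of CSV lines.
    """
    rows: List[Row] = []
    for record in csv.DictReader(lines):
        latency = float(record["first_token_latency"])
        tps = float(record["tokens_per_second"])
        if latency < 0 or tps < 0:
            continue  # -1 marks a failed measurement
        rows.append(
            {
                "base_url": record["base_url"],
                "type": record["type"],
                "model": record["model"],
                "first_token_latency": latency,
                "tokens_per_second": tps,
            }
        )
    rows.sort(key=lambda r: r["tokens_per_second"], reverse=True)
    return rows


# Usage, once results.csv exists:
#     with open("results.csv", newline="") as handle:
#         for row in successful_rows(handle):
#             print(row["model"], row["first_token_latency"], row["tokens_per_second"])
```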
