LLMSpeed - LLM Token Speed Testing Tool

LLMSpeed tests the speed of models served through OpenAI Chat and Anthropic format APIs, measuring first token latency and tokens per second.

Features

Support OpenAI Chat and Anthropic streaming APIs
Measure first token latency (ms) and tokens per second
Detect invalid models early via content-type check
Support reasoning_content field for reasoning models
Export results to CSV
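
The two metrics listed above can be sketched as follows. This is a minimal illustration, not the code in llmspeed.py: the function name `measure_stream`, the whitespace-based token count, and the choice to divide by total elapsed time since the request started are all assumptions made for the example. The stream is modeled as any iterable of text chunks, so the same function would work for text extracted from either API format.

```python
import time
from typing import Callable, Iterable, Tuple


def measure_stream(
    chunks: Iterable[str],
    count_tokens: Callable[[str], int] = lambda text: len(text.split()),
    clock: Callable[[], float] = time.perf_counter,
) -> Tuple[float, float]:
    """Return (first_token_latency_ms, tokens_per_second) for a chunk stream.

    Both values are -1 if the stream never yields a non-empty chunk,
    mirroring the failure marker used in the results file.
    """
    start = clock()
    first_at = None
    tokens = 0
    for chunk in chunks:
        if not chunk:
            continue  # ignore empty deltas (assumed to carry no text)
        if first_at is None:
            first_at = clock()  # time of the first visible text
        tokens += count_tokens(chunk)  # crude whitespace count (assumption)
    end = clock()

    if first_at is None:
        return -1.0, -1.0
    latency_ms = (first_at - start) * 1000.0
    elapsed = end - start
    tps = tokens / elapsed if elapsed > 0 else -1.0
    return latency_ms, tps
```

Passing `clock` and `count_tokens` as parameters keeps the timing logic testable without a network call; a real tokenizer could be swapped in for the whitespace count.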

Config File Format (config.yaml)

test_prompt: "Please introduce yourself"

models:
  - base_url: "https://api.openai.com/v1"
    type: "openai-chat"
    models:
      - "gpt-4o"
      - "gpt-4o-mini"
    api_key: "sk-xxx"

  - base_url: "https://api.anthropic.com"
    type: "anthropic"
    models:
      - "claude-3-5-sonnet-20241022"
    api_key: "sk-ant-xxx"
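
As a rough sketch of how this layout can be consumed, the snippet below expands the nested structure into one test target per model. The dictionary stands in for what `yaml.safe_load` would return for the file above; the helper name `iter_targets` and the validation of `type` are assumptions for illustration, not necessarily how llmspeed.py does it.

```python
from typing import Any, Dict, Iterator, Tuple

# Stand-in for yaml.safe_load(open("config.yaml")): same shape as the file above.
config: Dict[str, Any] = {
    "test_prompt": "Please introduce yourself",
    "models": [
        {
            "base_url": "https://api.openai.com/v1",
            "type": "openai-chat",
            "models": ["gpt-4o", "gpt-4o-mini"],
            "api_key": "sk-xxx",
        },
        {
            "base_url": "https://api.anthropic.com",
            "type": "anthropic",
            "models": ["claude-3-5-sonnet-20241022"],
            "api_key": "sk-ant-xxx",
        },
    ],
}

SUPPORTED_TYPES = {"openai-chat", "anthropic"}


def iter_targets(cfg: Dict[str, Any]) -> Iterator[Tuple[str, str, str]]:
    """Yield (base_url, type, model) for every model under every endpoint."""
    for endpoint in cfg.get("models", []):
        api_type = endpoint["type"]
        if api_type not in SUPPORTED_TYPES:
            raise ValueError(f"unsupported type: {api_type!r}")
        for model in endpoint.get("models", []):
            yield endpoint["base_url"], api_type, model


for base_url, api_type, model in iter_targets(config):
    print(f"{api_type:12} {model:30} {base_url}")
```

Each endpoint entry shares one `base_url` and `api_key` across its `models` list, so adding a model to an existing provider is a one-line change.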

Run

pip install openai anthropic pyyaml
python llmspeed.py

Use a custom config file:

python llmspeed.py my_config.yaml

Output

Results are saved to results.csv with the following columns:

| Column | Description |
| --- | --- |
| `base_url` | API endpoint |
| `type` | API type (`openai-chat` or `anthropic`) |
| `model` | Model name |
| `first_token_latency` | First token latency in ms (-1 if failed) |
| `tokens_per_second` | Tokens per second (-1 if failed) |
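
Because failures are recorded as -1 rather than left blank, they need to be filtered out before averaging or sorting. The sketch below shows one way to do that with the standard `csv` module. It assumes the file is comma-delimited with a header row using the column names above; the helper name `split_results` is hypothetical, and the sample rows contain made-up numbers for illustration only.

```python
import csv
import io
from typing import List, TextIO, Tuple

# Made-up sample in the assumed results.csv layout (values are illustrative).
SAMPLE = """base_url,type,model,first_token_latency,tokens_per_second
https://api.openai.com/v1,openai-chat,gpt-4o,100.0,50.0
https://api.anthropic.com,anthropic,claude-3-5-sonnet-20241022,-1,-1
"""


def split_results(fileobj: TextIO) -> Tuple[List[Tuple[str, float, float]], List[str]]:
    """Separate successful rows from failed ones (marked with -1)."""
    ok: List[Tuple[str, float, float]] = []
    failed: List[str] = []
    for row in csv.DictReader(fileobj):
        latency = float(row["first_token_latency"])
        tps = float(row["tokens_per_second"])
        if latency < 0 or tps < 0:
            failed.append(row["model"])
        else:
            ok.append((row["model"], latency, tps))
    return ok, failed


ok, failed = split_results(io.StringIO(SAMPLE))
print("ok:", ok)
print("failed:", failed)
```

To run this against a real file, replace `io.StringIO(SAMPLE)` with `open("results.csv", newline="")`.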
