GitHub - FreedomIntelligence/AccelMark: Open benchmark leaderboard for AI accelerators on LLM inference workloads. Run on your GPU, submit your result, get on the board. NVIDIA · AMD · Ascend · Apple Silicon

Open benchmark leaderboard for AI accelerators on LLM workloads.

→ Live Leaderboard · Contributing · Suites · Discussions · Development

From workload spec to published result — every row on the leaderboard carries its runner hash, environment fingerprint, and accuracy receipt.

Why AccelMark?

Alternative	The problem	AccelMark's answer
MLPerf	Rigorous but slow — only large vendors participate	Community runs often finish quickly (e.g. Suite A default ~11 min; Suite D default ~22 min; full all-scenarios run ~7 h)
Vendor whitepapers	Different setups make cross-vendor comparison impossible	Fixed schema + shared LoadGen = apples-to-apples
Most benchmarks	Cover only NVIDIA and only throughput	NVIDIA, AMD, Huawei Ascend, Apple Silicon — throughput, latency, scaling, quantization

Quick start

# 1. Clone and install
git clone https://github.com/FreedomIntelligence/AccelMark.git
cd AccelMark
pip install -e .                                              # installs framework dependencies (Python >=3.10 required)
pip install -r runners/nvidia_vllm_47f5d58e/requirements.txt # installs runner dependencies

# 2. One-time setup
cp configs/submitter.yaml.example configs/submitter.yaml
# Edit configs/submitter.yaml — add your name

# 3. Run the benchmark (~11 min on A100)
python run.py --runner nvidia_vllm_47f5d58e --suite suite_A

# 4. Submit your result — open a pull request:
#    git checkout -b submit/<your-hardware>
#    git add results/community/<run_name>/ && git commit -m "results: <hardware>"
#    gh pr create   # or open via the GitHub web UI
#
# <run_name> is the directory auto-created by run.py — it already contains
# your result.json and env_info.json; no manual file moves are needed.

See CONTRIBUTING.md for the full guide.
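
Before opening the pull request, it can help to confirm that the run directory really contains both files mentioned above. The following is a minimal Python sketch that assumes only the two file names from the quick start. The helper name `check_run_dir` is hypothetical and not part of AccelMark, and the project's own CI validation remains the authority on what a valid submission looks like.

```python
import json
from pathlib import Path

# File names taken from the quick start; everything else here is illustrative.
REQUIRED_FILES = ("result.json", "env_info.json")


def check_run_dir(run_dir):
    """Return a list of problems found in a run directory (empty means OK)."""
    run_dir = Path(run_dir)
    problems = []
    for name in REQUIRED_FILES:
        path = run_dir / name
        if not path.is_file():
            problems.append(f"missing {name}")
            continue
        try:
            # Only checks that the file parses; it does not validate the schema.
            json.loads(path.read_text(encoding="utf-8"))
        except json.JSONDecodeError as exc:
            problems.append(f"{name} is not valid JSON: {exc}")
    return problems
```

Calling `check_run_dir("results/community/<run_name>")` returns an empty list when both files exist and parse as JSON.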

Suites

Suite	Model	Chips	Question answered	Primary metric
A	Llama-3-8B	1	How fast is this chip at inference?	Offline tokens/sec
B	Llama-3-70B	flexible	Can this chip serve large models?	Offline tokens/sec
C	Llama-3.1-8B	1	What is the quantization speed/quality tradeoff?	Speedup vs BF16
D	Llama-3.1-8B	1	How does this chip handle long-context (28K) inputs?	Offline tokens/sec
E	Llama-3-8B	1×/2×/4×/8×	How well does this chip scale?	Scaling efficiency
F	Qwen2.5-0.5B	1	How fast is this consumer/edge GPU?	Offline tokens/sec
G	Mixtral-8x7B-Instruct	≥2 (auto)	How efficiently does this chip handle sparse MoE inference?	Offline tokens/sec

Suites A, B, and D also include optional speculative decoding and/or burst load extra scenarios — see suites/README.md for per-suite details.

See suites/README.md for full specs, time budgets, SLA definitions, and metric descriptions.
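
The two derived metrics in the table (Suite C's speedup vs BF16 and Suite E's scaling efficiency) can be illustrated with their conventional definitions. This sketch assumes those textbook formulas; the authoritative definitions are the ones in suites/README.md and may differ in detail. Function names and the sample numbers are hypothetical.

```python
def speedup_vs_bf16(quantized_tps, bf16_tps):
    """Ratio of quantized throughput to BF16 throughput (tokens/sec)."""
    if bf16_tps <= 0:
        raise ValueError("BF16 throughput must be positive")
    return quantized_tps / bf16_tps


def scaling_efficiency(tps_n, tps_1, n_chips):
    """Fraction of ideal linear scaling achieved on n_chips.

    1.0 means throughput grew exactly n_chips times over the 1-chip run.
    """
    if tps_1 <= 0 or n_chips < 1:
        raise ValueError("need positive 1-chip throughput and n_chips >= 1")
    return tps_n / (n_chips * tps_1)


# Made-up numbers, for illustration only (not leaderboard results).
example_speedup = speedup_vs_bf16(quantized_tps=1500.0, bf16_tps=1000.0)
example_efficiency = scaling_efficiency(tps_n=3600.0, tps_1=1000.0, n_chips=4)
```

With these made-up inputs the speedup is 1.5 and the 4-chip scaling efficiency is 0.9, meaning 90% of ideal linear scaling.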

Currently on the leaderboard

A snapshot of accelerators that have at least one submission on the leaderboard. Tile size is proportional to submission count; colour denotes vendor. See the live leaderboard for current rankings, per-suite breakdowns, and the underlying result.json files.

Supported platforms

Reference runners live under runners/ (see each folder’s meta.json). The table below is auto-generated from each runner's meta.json — never hand-edited. Add a runner, declare its suite_support in meta.json, and the matrix updates on its own.

Hardware	Runner folder	Framework	A	B	C	D	E	F	G
NVIDIA GPU	`nvidia_sglang_c43a8309`	SGLang	✓	✓	✓	✓	✓	✓	✓
NVIDIA GPU	`nvidia_vllm020_0f6c56e4`	vLLM	⋯	⋯	⋯	⋯	⋯	⋯	⋯
NVIDIA GPU	`nvidia_vllm_47f5d58e`	vLLM	✓	✓	✓	✓	✓	✓	✓
NVIDIA V100 (SM70)	`nvidia_onecat_vllm_12a253c2`	1Cat-vLLM	⋯	⋯	⋯	⋯	⋯	—	⋯
AMD GPU	`amd_vllm_rocm_6c18cd8f`	vLLM-ROCm	✓	✓	✓	✓	✓	✓	✓
Huawei Ascend NPU	`ascend_vllm_ascend_d4aa9fda`	vllm-ascend	✓	✓	✓	✓	✓	—	—
Apple Silicon	`apple_mlx_lm_9546b8b5`	mlx-lm	⋯	—	—	⋯	—	⋯	—
Google TPU	`google_vllm_tpu_68cc9ffa`	vllm-tpu	✓	—	—	✓	—	✓	—
Moore Threads GPU	`moorethreads_vllm_musa_f2f6f965`	vllm-musa	✓	⋯	⋯	⋯	⋯	✓	—

Legend: ✓ validated · ⋯ author-declared (not smoke-tested in this repo yet) · — unsupported.

Regenerate locally with python tools/generate_platforms_matrix.py. CI runs --check and fails the PR if the README and runner metadata disagree.
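
To make the generation step concrete, here is a rough Python sketch of how one matrix row could be rendered from a runner's metadata and checked against the README. It is not the code in tools/generate_platforms_matrix.py. The shape of `suite_support` (a mapping from suite letter to a status string), the status names, and the `hardware` and `framework` keys are all assumptions made for illustration; consult an actual meta.json for the real schema.

```python
SUITES = ["A", "B", "C", "D", "E", "F", "G"]

# Symbols follow the legend above; the status names are hypothetical.
SYMBOLS = {
    "validated": "✓",
    "declared": "⋯",
    "unsupported": "\u2014",  # the dash the legend uses for "unsupported"
}


def matrix_row(folder, meta):
    """Render one tab-separated table row from a (hypothetical) meta dict."""
    support = meta.get("suite_support", {})
    # Suites that are not listed are treated as unsupported in this sketch.
    cells = [SYMBOLS[support.get(suite, "unsupported")] for suite in SUITES]
    return "\t".join([meta["hardware"], f"`{folder}`", meta["framework"], *cells])


def readme_in_sync(readme_text, rows):
    """Mimic the idea of a --check mode: every generated row must appear verbatim."""
    return all(row in readme_text for row in rows)


# Hypothetical metadata chosen to reproduce the Apple Silicon row of the table.
example_meta = {
    "hardware": "Apple Silicon",
    "framework": "mlx-lm",
    "suite_support": {"A": "declared", "D": "declared", "F": "declared"},
}
example_row = matrix_row("apple_mlx_lm_9546b8b5", example_meta)
```

A check like `readme_in_sync` returning `False` corresponds to the situation in which CI would reject the PR.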

Other stacks (TensorRT-LLM, MindIE, etc.) can be added as new runner folders; see the contributor guide.

Adding a new runner? See CONTRIBUTING.md#adding-a-new-runner. Adding a new accelerator family? See runners/README.md.

Leaderboard tiers

Tier	How	Where
community	Submitted by anyone via PR and passes CI validation	Community tab
verified	Independently reproduced on the same hardware/runner and matches the original within 5%	Main leaderboard

Community results are fully visible and comparable — they just haven't been independently reproduced yet. Anyone with the listed hardware can promote a community result to verified by submitting a reproduction PR.
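
The 5% criterion for promotion to verified can be sketched as a relative-difference check. This is a minimal illustration that assumes the comparison is made on the suite's primary metric and measured relative to the original submission; the source does not spell out either detail, and the function name is hypothetical.

```python
def within_tolerance(original, reproduced, tolerance=0.05):
    """True if the reproduced value is within `tolerance` of the original.

    The difference is taken relative to the original value (an assumption).
    """
    if original <= 0:
        raise ValueError("original metric must be positive")
    return abs(reproduced - original) / original <= tolerance


# Made-up throughput numbers in tokens/sec, for illustration only.
close_enough = within_tolerance(original=1000.0, reproduced=960.0)  # 4% lower
too_far = within_tolerance(original=1000.0, reproduced=940.0)       # 6% lower
```

Here `close_enough` is `True` and `too_far` is `False`.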

Contributing

The most valuable contribution is running the benchmark on hardware not yet in the leaderboard.

Submit a result → Submitting a result
Add a new runner → Adding a new runner
Add a new accelerator family → Platform plug-in guide
Report a bug → Open an issue
Ask a question / share results → Discussions
Extend the leaderboard → Development guide

Optional: AccelMark also ships a small voice-driven launcher for the OpenClaw ecosystem — see openclaw_skill/. It's not required to run, contribute, or submit results.

Citation

If you use AccelMark results in research, please cite:

@misc{accelmark2026,
  title  = {Beyond NVIDIA! A Multi-Regime Framework for Benchmarking Heterogeneous AI Accelerators},
  author = {Liang, Juhao and Zhang, Zhiyuan and Li, Siyu and Lin, Zhihang and Yu, Minchen and Zeng, Li and Chen, Zizhong and Sun, Ruoyu and Wang, Benyou},
  year   = {2026},
  url    = {https://github.com/FreedomIntelligence/AccelMark}
}

License

Apache 2.0 — see LICENSE. Submitted benchmark results are contributed under CC BY 4.0. Bundled third-party data (datasets, accuracy subsets) keeps its upstream license — see NOTICE.
