Merged
24 commits
b506cd4
[NV] Add MiniMax M3 B300 Dynamo vLLM recipes
Oseltamivir Jun 19, 2026
84a023a
chore: update MiniMax M3 B300 container
Oseltamivir Jun 19, 2026
b09bc78
chore: update changelog PR link
Oseltamivir Jun 19, 2026
86da150
Update perf-changelog.yaml
Oseltamivir Jun 19, 2026
f5727c2
Update perf-changelog.yaml
Oseltamivir Jun 19, 2026
3b6dad4
fix(vllm): patch MiniMax M3 MSA contiguity
Oseltamivir Jun 19, 2026
71ba2ea
fix(recipes): align MiniMax M3 parallel settings
Oseltamivir Jun 19, 2026
b859a0b
fix(vllm): backport MiniMax M3 eval fixes
Oseltamivir Jun 19, 2026
2d408e4
ci(sweep): enable full MiniMax M3 validation
Oseltamivir Jun 19, 2026
3956aee
perf(vllm): right-size MiniMax M3 low concurrency
Oseltamivir Jun 20, 2026
33fe6a9
Merge remote-tracking branch 'origin/main' into pr-1787-latest
Oseltamivir Jun 20, 2026
77c6391
Merge branch 'main' into pr-1787-latest
Oseltamivir Jun 20, 2026
b99d3c9
perf(vllm): colocate MiniMax M3 TP4 workers
Oseltamivir Jun 20, 2026
d2347aa
fix(runner): exclude faulty B300 RDMA node
Oseltamivir Jun 20, 2026
8ace2e9
fix(runner): verify B300 node exclusion
Oseltamivir Jun 20, 2026
884ff12
fix(runner): check generated B300 sbatch script
Oseltamivir Jun 20, 2026
3ae240b
ci(sweep): validate B300 node exclusion
Oseltamivir Jun 20, 2026
9751d93
Merge remote-tracking branch 'origin/main' into pr-1787-latest
Oseltamivir Jun 20, 2026
03d27e7
refactor(vllm): trim MiniMax M3 runtime patches
Oseltamivir Jun 21, 2026
826a64e
Merge branch 'main' into pr-1787-latest
Oseltamivir Jun 22, 2026
aec850f
Merge branch 'main' into pr-1787-latest
Oseltamivir Jun 22, 2026
c535376
chore: prepare PR 1863 ingest recovery
Oseltamivir Jun 23, 2026
ef06e82
fix: recover PR 1863 ingest
Oseltamivir Jun 23, 2026
b691c61
chore: attach reusable sweep run 27951287552
Oseltamivir Jun 23, 2026
13 changes: 13 additions & 0 deletions perf-changelog.yaml
@@ -4086,3 +4086,16 @@
    - "Image: lmsysorg/sglang:nightly-dev-cu13-20260608-303757cc"
    - "6 topologies across 1k/1k and 8k/1k: 1P1D TP4 STP + wide-EP (DEP4 prefill / DEP16 decode) from 1P1D up to 8P1D, recipes under benchmarks/multi_node/srt-slurm-recipes/sglang/qwen3.5/gb200-fp8/"
  pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1810

- config-keys:
    - minimaxm3-fp8-b300-dynamo-vllm
  description:
    - "Add MiniMax-M3 MXFP8 B300 disaggregated vLLM benchmarks via Dynamo for 1k1k and 8k1k STP."
    - "Add local srt-slurm recipes under benchmarks/multi_node/srt-slurm-recipes/vllm/minimax-m3/b300-fp8 and wire the B300 launcher to overlay them into srt-slurm."
    - "Add right-sized TP4 decode variants with expert parallelism disabled and the Marlin MoE backend for selected low-concurrency 1k1k and 8k1k shapes."
    - "Colocate the six-GPU TP4 prefill/decode pairs on one B300 node and enable CUDA IPC for NIXL KV transfer."
    - "Patch the 0618 image's MiniMax M3 MSA prefill top-k slice to be contiguous before CSR construction."
    - "Align 8k1k expert-parallel settings with the 1k1k recipes and correct the decode CUDA graph capture limit."
    - "Backport NVIDIA/srt-slurm#38 to sanitize Slurm node-IP discovery output on the pinned submission branch."
    - "Backport vllm-project/vllm#45879 so NIXL validates heterogeneous-TP KV block lengths using the GQA KV-head ratio."
  pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1893
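
Reviewer note on the last description item: the sketch below illustrates the general idea of checking KV block lengths by the per-rank KV-head ratio rather than by the raw TP ratio. It is not the code from vllm-project/vllm#45879. All function names and the numbers (4 KV heads, head dimension 128, 16-token blocks, 1-byte elements, TP4 prefill against TP8 decode) are hypothetical and chosen only to show the case where KV heads are replicated because the TP size exceeds the KV-head count.

```python
# Hypothetical illustration; names and numbers are assumptions, not vLLM's API.

def kv_heads_per_rank(total_kv_heads: int, tp_size: int) -> int:
    """KV heads held by one tensor-parallel rank.

    With GQA, once tp_size exceeds total_kv_heads the heads are replicated,
    so each rank still holds at least one head.
    """
    return max(1, total_kv_heads // tp_size)


def block_len_bytes(block_tokens: int, total_kv_heads: int, tp_size: int,
                    head_dim: int, dtype_bytes: int) -> int:
    """Bytes in one KV block on one rank (single K or V tensor, one layer)."""
    heads = kv_heads_per_rank(total_kv_heads, tp_size)
    return block_tokens * heads * head_dim * dtype_bytes


def check_by_tp_ratio(local_len: int, remote_len: int,
                      local_tp: int, remote_tp: int) -> bool:
    """Naive check: assumes block length shrinks in proportion to TP size."""
    return local_len * local_tp == remote_len * remote_tp


def check_by_kv_head_ratio(local_len: int, remote_len: int, total_kv_heads: int,
                           local_tp: int, remote_tp: int) -> bool:
    """Check that block lengths scale with the per-rank KV-head counts."""
    local_heads = kv_heads_per_rank(total_kv_heads, local_tp)
    remote_heads = kv_heads_per_rank(total_kv_heads, remote_tp)
    # Cross-multiply to stay in integer arithmetic.
    return local_len * remote_heads == remote_len * local_heads


# Hypothetical shapes: 4 KV heads, prefill at TP4, decode at TP8.
TOTAL_KV_HEADS, HEAD_DIM, BLOCK_TOKENS, DTYPE_BYTES = 4, 128, 16, 1
prefill_len = block_len_bytes(BLOCK_TOKENS, TOTAL_KV_HEADS, 4, HEAD_DIM, DTYPE_BYTES)
decode_len = block_len_bytes(BLOCK_TOKENS, TOTAL_KV_HEADS, 8, HEAD_DIM, DTYPE_BYTES)

naive_ok = check_by_tp_ratio(prefill_len, decode_len, 4, 8)
ratio_ok = check_by_kv_head_ratio(prefill_len, decode_len, TOTAL_KV_HEADS, 4, 8)
print(prefill_len, decode_len, naive_ok, ratio_ok)
```

In this hypothetical setup both sides hold one KV head per rank, so their block lengths are equal: the TP-ratio check rejects a valid pairing while the KV-head-ratio check accepts it.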