[AMD] Add MiniMax-M3-FP8 MI355X ATOMMESH #1865
base: main
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -2612,6 +2612,59 @@ minimaxm3-fp4-mi355x-atom: | |
| - { tp: 4, conc-start: 1, conc-end: 256 } | ||
| - { tp: 8, conc-start: 1, conc-end: 2 } | ||
|
|
||
| minimaxm3-fp8-mi355x-atom-disagg: | ||
| image: rocm/atom-dev:MiniMax-M3-20260622 | ||
| model: amd/MiniMax-M3-MXFP8 | ||
| model-prefix: minimaxm3 | ||
| runner: mi355x-disagg | ||
| precision: fp8 | ||
| framework: atom-disagg | ||
| multinode: true | ||
| disagg: true | ||
| scenarios: | ||
| fixed-seq-len: | ||
| - isl: 8192 | ||
| osl: 1024 | ||
| search-space: | ||
| # 1P1D TP4 | ||
| - conc-list: [ 1, 2, 4, 8, 16, 32, 64, 128, 256, 512 ] | ||
| prefill: | ||
| num-worker: 1 | ||
| tp: 4 | ||
| ep: 1 | ||
| dp-attn: false | ||
| additional-settings: | ||
| - "PREFILL_NODES=1" | ||
| decode: | ||
| num-worker: 1 | ||
| tp: 4 | ||
| ep: 1 | ||
| dp-attn: false | ||
| additional-settings: | ||
|
Comment on lines +2631 to +2643
Collaborator: @seungrokj quick question out of curiosity: for TP4+TP4, is this over XGMI or RDMA?
Collaborator (Author): @functionstackx this is over RDMA across 2 nodes.
Collaborator: @seungrokj thanks for your insight! This is with the Mooncake KV cache transfer engine, right? |
||
| - "DECODE_NODES=1" | ||
| # 1P1D TP4 | ||
| - isl: 1024 | ||
| osl: 1024 | ||
| search-space: | ||
| # 1P1D TP4 | ||
| - conc-list: [ 1, 2, 4, 8, 16, 32, 64, 128, 256, 512 ] | ||
| prefill: | ||
| num-worker: 1 | ||
| tp: 4 | ||
| ep: 1 | ||
| dp-attn: false | ||
| additional-settings: | ||
| - "PREFILL_NODES=1" | ||
| decode: | ||
| num-worker: 1 | ||
| tp: 4 | ||
| ep: 1 | ||
| dp-attn: false | ||
| additional-settings: | ||
| - "DECODE_NODES=1" | ||
|
|
||
| # MiniMax-M3 MXFP8 MI300X recipe. Use the TP8-only H100 search space: TP8 for | ||
| # latency and TP8+EP8 (TEP) at high concurrency. | ||
| # MiniMax-M3 MXFP8 MI300X day-zero recipe. Reuse the dedicated ROCm image and | ||
| # MI355X serving shape, but retain the default BF16 KV cache because this | ||
| # checkpoint lacks calibrated ROCm FP8 attention scales. Use the TP8-only H100 | ||
|
|
||
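For reviewers who want to sanity-check the new `minimaxm3-fp8-mi355x-atom-disagg` entry, here is a minimal sketch that mirrors the 1P1D TP4 search space as a plain Python dict and derives the GPU count, node count, and number of benchmark points. It is not part of the PR or of the benchmark harness: the names `ENTRY`, `SEQ_LENS`, `parse_settings`, `summarize`, and `benchmark_points` are hypothetical, and the sketch assumes each worker uses exactly `tp` GPUs (reasonable here since `ep: 1` and `dp-attn: false`, but still an assumption).

```python
# Hypothetical mirror of the search-space entry added in this diff.
# Only the keys needed for the check are reproduced.
ENTRY = {
    "conc-list": [1, 2, 4, 8, 16, 32, 64, 128, 256, 512],
    "prefill": {"num-worker": 1, "tp": 4,
                "additional-settings": ["PREFILL_NODES=1"]},
    "decode": {"num-worker": 1, "tp": 4,
               "additional-settings": ["DECODE_NODES=1"]},
}

# (isl, osl) pairs from the two fixed-seq-len scenarios in the diff.
SEQ_LENS = [(8192, 1024), (1024, 1024)]


def parse_settings(settings):
    """Turn ["KEY=VALUE", ...] into {"KEY": "VALUE", ...}."""
    return dict(item.split("=", 1) for item in settings)


def summarize(entry):
    """Derive GPU and node counts for one disaggregated configuration.

    Assumption: GPUs per worker == tp (no extra EP or DP-attention ranks).
    """
    gpus = {}
    nodes = 0
    for role in ("prefill", "decode"):
        cfg = entry[role]
        gpus[role] = cfg["num-worker"] * cfg["tp"]
        env = parse_settings(cfg["additional-settings"])
        nodes += int(env[f"{role.upper()}_NODES"])
    return {"gpus": gpus, "total_gpus": sum(gpus.values()), "nodes": nodes}


def benchmark_points(entry, seq_lens):
    """Enumerate every (isl, osl, concurrency) point the entry would sweep."""
    return [(isl, osl, conc)
            for isl, osl in seq_lens
            for conc in entry["conc-list"]]


if __name__ == "__main__":
    print(summarize(ENTRY))
    print(len(benchmark_points(ENTRY, SEQ_LENS)), "benchmark points")
```

Under these assumptions the entry resolves to 4 prefill GPUs plus 4 decode GPUs on 2 nodes, which matches the author's note that the TP4+TP4 setup runs over RDMA across 2 nodes.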