DSR1-FP4 MI355x SGLang: Add EP configurations by ppalanga · Pull Request #1811 · SemiAnalysisAI/InferenceX

ppalanga · 2026-06-17T15:19:43Z

This PR does two things for DSR1-FP4 SGLang for MI355x:

Adds EP configurations
Updates the SGLang image to 0.5.13

Note

Low Risk
Benchmark and CI config only; no production serving or security paths. Main operational impact is longer/more expensive GPU sweeps from higher concurrency and new EP matrix points.

Overview
Extends dsr1-fp4-mi355x-sglang in amd-master.yaml for DeepSeek-R1 FP4 on MI355X: SGLang image v0.5.12 → v0.5.13, fixed-seq-len concurrency sweeps conc-end 64 → 256 for TP-only points, and new expert-parallel search-space rows (TP4/EP4 and TP8/EP8) for both 1k/1k and 8k/1k scenarios.
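For orientation, the new expert-parallel points described above can be enumerated with a short shell sketch. This is only an illustration of the four scenario/parallelism combinations named in the overview. The function name `list_ep_points` and the output format are hypothetical, and the sketch does not reproduce the actual schema of amd-master.yaml, which is not shown in this thread.

```shell
#!/usr/bin/env bash
# Hypothetical enumeration of the new EP search-space points.
# Not the repo's YAML schema; names and output format are assumptions.
list_ep_points() {
  local scenario parallelism tp ep
  for scenario in "1k/1k" "8k/1k"; do
    # Each entry is "TP EP": the overview names TP4/EP4 and TP8/EP8.
    for parallelism in "4 4" "8 8"; do
      read -r tp ep <<< "$parallelism"
      echo "scenario=$scenario tp=$tp ep=$ep"
    done
  done
}
```

Calling `list_ep_points` prints one line per new matrix point, which gives a rough sense of how many extra sweep rows the change adds.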

dsr1_fp4_mi355x.sh now requires EP_SIZE, passes --ep-size when EP > 1, and raises --cuda-graph-max-bs from 128 → 512 to match the wider concurrency range.
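The script behavior described above might look roughly like the following sketch. The real dsr1_fp4_mi355x.sh is not shown in this thread, so the helper name `build_server_args` and its argument order are assumptions. Only the `--ep-size` and `--cuda-graph-max-bs` flags and the value 512 come from the description; `--tp-size` is added here for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper; the actual dsr1_fp4_mi355x.sh may be structured differently.
build_server_args() {
  local tp_size="$1"
  # Fail loudly when the EP size is missing, mirroring "now requires EP_SIZE".
  local ep_size="${2:?EP_SIZE is required}"
  # 512 is the raised CUDA graph batch-size cap named in the description.
  local args=(--tp-size "$tp_size" --cuda-graph-max-bs 512)
  # Only pass --ep-size when expert parallelism is actually enabled.
  if [ "$ep_size" -gt 1 ]; then
    args+=(--ep-size "$ep_size")
  fi
  echo "${args[*]}"
}
```

A TP-only point would call `build_server_args 8 1` and get no `--ep-size` flag, while a TP8/EP8 point would call `build_server_args 8 8`. Omitting the flag at EP=1 keeps the TP-only launch command unchanged apart from the new batch-size cap.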

Documents the change under dsr1-fp4-mi355x-sglang in perf-changelog.yaml.

^{Reviewed by Cursor Bugbot for commit 008eeaa.}

github-actions · 2026-06-17T15:19:56Z

Thanks for the contribution! For vLLM & SGLang, please ensure that your recipe is similar to the official vLLM recipes and/or the SGLang cookbook.

If it is not, please create a PR first before we can merge your single node PR into the master branch. Let's ensure that the documentation is first class such that the entire ML community can benefit from your hard work! Thank you

PR authors are responsible for ensuring that after merging, all GitHub Action jobs fully pass. Failures are often just flakes, and re-running the failed jobs usually fixes them. If failed jobs are re-run, PR authors are responsible for ensuring they pass. See GitHub's docs on re-running failed jobs: https://docs.github.com/en/actions/how-tos/manage-workflow-runs/re-run-workflows-and-jobs#re-running-failed-jobs-in-a-workflow

As a rule of thumb, PR authors should request a review & get a PR approval from the respective companies' CODEOWNERS before requesting a review from core maintainers.

If additional help is needed, PR authors can reach out to core maintainers over Slack.

cursor

Cursor Bugbot has reviewed your changes and found 1 potential issue.

^{Reviewed by Cursor Bugbot for commit 016686f.}

Update amd-master.yaml

016686f

Adding EP configurations

ppalanga requested a review from a team June 17, 2026 15:19

ppalanga requested review from 1am9trash, billishyahao, chunfangamd, seungrokj and yctseng0211 as code owners June 17, 2026 15:19

github-project-automation Bot added this to InferenceMAX Board Jun 17, 2026

cursor Bot reviewed Jun 17, 2026

Comment thread .github/configs/amd-master.yaml

ppalanga added 2 commits June 17, 2026 08:24

Update dsr1_fp4_mi355x.sh

98e80e6

Update perf-changelog.yaml

008eeaa

ppalanga changed the title ~~Update amd-master.yaml~~ DSR1-FP4 MI355x SGLang: Add EP configurations Jun 17, 2026

ppalanga marked this pull request as draft June 17, 2026 16:31

ppalanga wants to merge 3 commits into main from ppalanga-dsr1-ep-config
