Add SGLang Ray Direct Transport (RDT) weight sync example #42

Open
xyuzh wants to merge 1 commit into main from sglang-ray-rdt-weight-sync
@xyuzh commented Feb 17, 2026

Summary

  • Adds a new ray_rdt/ example that demonstrates transferring model weights from a HuggingFace trainer actor to an SGLang SchedulerActor using Ray Direct Transport (RDT) with NCCL
  • Includes a Dockerfile (based on anyscale/ray:2.53.0-py312-cu129 with CUDA 12.9 and SGLang installed), an Anyscale job YAML, and a test script that verifies all parameters match after transfer
  • Requires 2 GPUs on a single node (e.g., g5.12xlarge)

Test plan

  • Submit the job with anyscale job submit -f ray_rdt/job_test_rdt_weight_sync.yaml
  • Verify the test prints "SUCCESS: All parameters match!" with 0 mismatches
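The test script itself is not reproduced in this description, so the following is only a minimal sketch of what a name-by-name parameter check could look like. It assumes each side's weights have already been collected into a plain mapping from parameter name to a flat list of floats. The helper names `count_mismatches` and `report` are hypothetical and are not taken from the PR; only the success message comes from the test plan above.

```python
import math


def count_mismatches(expected, actual, rel_tol=0.0, abs_tol=0.0):
    """Return the sorted names of parameters that differ between two mappings.

    Both arguments map a parameter name to a flat list of floats
    (an assumed representation, not the PR's actual data structure).
    """
    mismatched = []
    for name in sorted(set(expected) | set(actual)):
        a, b = expected.get(name), actual.get(name)
        # A parameter missing on either side, or with a different size,
        # counts as a mismatch.
        if a is None or b is None or len(a) != len(b):
            mismatched.append(name)
            continue
        # With both tolerances at 0.0 this is an exact comparison.
        if not all(
            math.isclose(x, y, rel_tol=rel_tol, abs_tol=abs_tol)
            for x, y in zip(a, b)
        ):
            mismatched.append(name)
    return mismatched


def report(mismatched):
    """Format the outcome; the success string mirrors the test plan."""
    if not mismatched:
        return "SUCCESS: All parameters match!"
    names = ", ".join(mismatched)
    return f"FAILURE: {len(mismatched)} mismatched parameter(s): {names}"


if __name__ == "__main__":
    trainer = {"layer.weight": [1.0, 2.0], "layer.bias": [0.5]}
    scheduler = {"layer.weight": [1.0, 2.0], "layer.bias": [0.5]}
    print(report(count_mismatches(trainer, scheduler)))
```

Exact comparison is the default here because a weight copy should be bit-identical; the tolerances are exposed only in case a dtype conversion happens along the way.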

Made with Cursor

Demonstrates transferring model weights from a HuggingFace trainer actor
to an SGLang SchedulerActor using Ray Direct Transport with NCCL, then
verifying parameter correctness.

Co-authored-by: Cursor <cursoragent@cursor.com>