DLSlime/docs/roadmap.md at main · DeepLink-org/DLSlime

Overview

Links

DLSlime aims to support efficient transmission over a variety of links, including but not limited to IBVerbs, CUDA IPC, TCP Socket, PCIe, NVShmem, Ascend (Direct), and NVMe-oF.

Transfer Engine

DLSlime provides a flexible and efficient P2P Transfer Engine, enabling AI-workload-aware customized functions such as Prefill-Decode separation and checkpoint transmission.
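To make the P2P idea concrete, here is a minimal in-process sketch of one-sided read/write between two endpoints, the pattern a Prefill-Decode split might use to move a cache block. It is not DLSlime's API: `Endpoint`, `MemoryRegion`, `register`, and the `"kv_cache"` key are hypothetical names, and plain `bytearray` copies stand in for a real transport.

```python
from dataclasses import dataclass, field


@dataclass
class MemoryRegion:
    """A registered buffer that a remote peer may read from or write to."""
    data: bytearray


@dataclass
class Endpoint:
    """Hypothetical in-process stand-in for a P2P endpoint (not DLSlime's API)."""
    regions: dict = field(default_factory=dict)

    def register(self, key: str, size: int) -> MemoryRegion:
        # Expose a zero-filled buffer under `key` so peers can address it.
        region = MemoryRegion(bytearray(size))
        self.regions[key] = region
        return region

    def write(self, peer: "Endpoint", key: str, offset: int, payload: bytes) -> None:
        # One-sided write: copy `payload` into the peer's registered region.
        target = peer.regions[key].data
        if offset < 0 or offset + len(payload) > len(target):
            raise ValueError("write out of bounds")
        target[offset:offset + len(payload)] = payload

    def read(self, peer: "Endpoint", key: str, offset: int, length: int) -> bytes:
        # One-sided read: copy bytes out of the peer's registered region.
        source = peer.regions[key].data
        if offset < 0 or length < 0 or offset + length > len(source):
            raise ValueError("read out of bounds")
        return bytes(source[offset:offset + length])


# The "prefill" side pushes a block into the "decode" side's buffer,
# then reads it back to confirm the transfer.
prefill = Endpoint()
decode = Endpoint()
decode.register("kv_cache", 16)
prefill.write(decode, "kv_cache", 4, b"\x01\x02\x03\x04")
block = prefill.read(decode, "kv_cache", 4, 4)
```

In this sketch the target side takes no action during the transfer, which is the property that makes one-sided operations attractive for moving large buffers between workers.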

Collective Ops

Following DeepEP, DLSlime provides a buffer-based collective communication library that achieves ultra-low-latency, SM-free collective communication.
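As a rough illustration of what "buffer-based" can mean for a collective, the sketch below simulates AllGather in plain Python: every rank owns a pre-allocated receive buffer with one slot per peer, and each rank's chunk is copied into its slot on every buffer. The function name `all_gather` and the layout are assumptions made for illustration, not DLSlime's implementation; in a real system the copies would presumably be issued by the NIC or a copy engine rather than by compute kernels.

```python
def all_gather(rank_chunks: list[bytes]) -> list[bytearray]:
    """Simulate AllGather: return one fully populated buffer per rank."""
    if not rank_chunks:
        return []
    world_size = len(rank_chunks)
    chunk_size = len(rank_chunks[0])
    if any(len(chunk) != chunk_size for chunk in rank_chunks):
        raise ValueError("all ranks must contribute equally sized chunks")

    # One pre-allocated receive buffer per rank, with one slot per peer.
    buffers = [bytearray(world_size * chunk_size) for _ in range(world_size)]

    # Each source rank writes its chunk into slot `src` of every buffer.
    for src, chunk in enumerate(rank_chunks):
        start = src * chunk_size
        for dst in range(world_size):
            buffers[dst][start:start + chunk_size] = chunk
    return buffers


gathered = all_gather([b"aa", b"bb", b"cc"])
```

Because every slot has a fixed offset known in advance, writers never need to coordinate with each other, only to signal completion.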

Torch Wrapper

DLSlime provides a Torch communication backend to meet the requirements of heterogeneous SPMD programs, such as heterogeneous pipeline-parallel training.

Transfer Engine Roadmap

IBVerbs Transfer Engine
- ✅ SendRecv Endpoint
- ✅ RDMA Read/Write Endpoint
NVShmem
- ✅ NVShmem Context and Send/Recv Kernel
- ⚡ NVShmem put and get wrappers
TCP Socket
- ✅ zmq bootstrap
- ⏳ TCP Socket transfer engine
CUDA IPC
- ✅ CUDA IPC Read/Write Endpoint
PCIe
- ⏳ High-performance Shared Memory transfer engine
- ⏳ High-performance data offloading
Ascend
- ✅ Ascend direct transfer engine
NVMe-oF
- 💭 Planning
UB Mesh
- 💭 Planning
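
The "zmq bootstrap" item above presumably refers to an out-of-band exchange of connection metadata before a transfer engine can start moving data. The sketch below shows that general pattern with the standard-library `socket` module swapped in for ZeroMQ; the helper `exchange_endpoint_info` and the metadata fields (`rank`, `device`, `port`) are hypothetical and do not reflect DLSlime's actual wire format.

```python
import json
import socket
import threading


def exchange_endpoint_info(sock: socket.socket, local_info: dict) -> dict:
    """Send local metadata as one JSON line, then read the peer's JSON line."""
    sock.sendall(json.dumps(local_info).encode() + b"\n")
    received = b""
    while not received.endswith(b"\n"):
        chunk = sock.recv(4096)
        if not chunk:
            raise ConnectionError("peer closed before sending its metadata")
        received += chunk
    return json.loads(received)


# Two connected sockets stand in for two processes on different hosts.
sock_a, sock_b = socket.socketpair()
info_a = {"rank": 0, "device": "hypothetical_nic_0", "port": 1}
info_b = {"rank": 1, "device": "hypothetical_nic_1", "port": 1}

results = {}
peer_thread = threading.Thread(
    target=lambda: results.update(seen_by_b=exchange_endpoint_info(sock_b, info_b))
)
peer_thread.start()
results["seen_by_a"] = exchange_endpoint_info(sock_a, info_a)
peer_thread.join()

sock_a.close()
sock_b.close()
```

After the exchange each side holds the other's metadata and could use it to set up the actual data path.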

Collective Ops

IBVerbs
- ✅ Send/Recv
- ⚡ M2N for attention-FFN disaggregation
- ⏳ AllGather
- ⏳ AllReduce
- ⏳ All2All
NVShmem
- ⏳ Send/Recv
- ✅ AllGather
- ⏳ AllReduce
- ⏳ All2All
CUDA IPC
- ✅ AllGather
- ⚡ High-performance AllGather using CUDA Multi-Mem

Torch Wrapper

IBVerbs
- ✅ Send/Recv
- ⏳ AllGather
- ⏳ AllReduce
- ⏳ All2All
