Use clean at::cuda::CUDAStream but not wrapped with std::optional #11
Open
youge325 wants to merge 5 commits into PFCCLab:paddle from
Conversation
Pull request overview
This PR aims to remove the std::optional wrapper around at::cuda::CUDAStream by constructing comm_stream directly (via initializer-list + lambda), and updating stream usages accordingly.
Changes:
- Replace `std::optional<at::cuda::CUDAStream> comm_stream` with a plain `at::cuda::CUDAStream`.
- Initialize `device_id` and `comm_stream` in `Buffer`'s constructor initializer list.
- Update call sites from `comm_stream.value()` to `comm_stream`.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 15 comments.
| File | Description |
|---|---|
| `csrc/deep_ep.hpp` | Changes `Buffer` to store `comm_stream` directly and adjusts accessors. |
| `csrc/deep_ep.cpp` | Moves stream/context setup into the initializer list and updates many stream call sites. |
Comments suppressed due to low confidence (2)
csrc/deep_ep.cpp:1447
- This `internode::cached_notify` call passes `comm_stream` (`at::cuda::CUDAStream`) where the API expects `cudaStream_t` (see `csrc/kernels/api.cuh`). Please pass `comm_stream.stream()` / `get_comm_stream_raw()` instead.
rank,
comm_stream,
config.get_rdma_buffer_size_hint(hidden_int4 * sizeof(int4), num_ranks),
num_nvl_bytes,
false,
low_latency_mode);
csrc/deep_ep.cpp:1127
`internode::cached_notify`'s signature takes `cudaStream_t stream` (see `csrc/kernels/api.cuh`), but this call passes `comm_stream` (`at::cuda::CUDAStream`). This should be `comm_stream.stream()` / `get_comm_stream_raw()`.
barrier_signal_ptrs_gpu,
rank,
comm_stream,
config.get_rdma_buffer_size_hint(hidden_int4 * sizeof(int4), num_ranks),
num_nvl_bytes,
true,
low_latency_mode);
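To make the mismatch flagged above concrete, here is a minimal stand-alone sketch. `RawStream` stands in for `cudaStream_t` (an opaque handle in the real CUDA API), `StreamWrapper` stands in for `at::cuda::CUDAStream` with its `.stream()` accessor, and `cached_notify_like` stands in for `internode::cached_notify`; none of these names are from the PR.

```cpp
#include <cassert>

// Stand-in for cudaStream_t, which is an opaque handle type.
using RawStream = void*;

// Stand-in for at::cuda::CUDAStream: wraps the raw handle and
// exposes it via .stream(), mirroring the accessor the review
// suggests calling before passing the stream to a C-style API.
class StreamWrapper {
public:
    explicit StreamWrapper(RawStream raw) : raw_(raw) {}
    RawStream stream() const { return raw_; }
private:
    RawStream raw_;
};

// Stand-in for internode::cached_notify, whose parameter is the
// raw handle type, not the wrapper class.
static RawStream cached_notify_like(RawStream stream) {
    // ... real code would launch work on this stream ...
    return stream;
}
```

In this sketch the call site must unwrap explicitly, `cached_notify_like(comm_stream.stream())`, since `StreamWrapper` defines no implicit conversion to `RawStream`.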
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Previously, because `at::cuda::CUDAStream` has no `operator=` overload, the member could not be assigned directly, so a `std::optional` container was used. Instead, it should be set up in the constructor initializer list as before, using a lambda expression that returns a Stream to invoke the copy constructor, thereby eliminating the diff introduced by `std::optional`.