Skip to content

Refactor communication nodes and post methods into separate files#5970

Merged
Priya2698 merged 6 commits intomainfrom
pm/refactor_comm
Feb 18, 2026
Merged

Refactor communication nodes and post methods into separate files#5970
Priya2698 merged 6 commits intomainfrom
pm/refactor_comm

Conversation

@Priya2698
Copy link
Collaborator

No description provided.

@Priya2698
Copy link
Collaborator Author

!test

@github-actions
Copy link

github-actions bot commented Feb 18, 2026

Review updated until commit eac2df6

Description

  • Refactored communication post methods into separate files for better code organization

  • Created new post_communication.cpp and post_communication.h files containing all post communication logic

  • Simplified communication.cpp by removing post method implementations and helper functions

  • Updated all import statements across the codebase to use the new post_communication header

Changes walkthrough

Relevant files
Enhancement
11 files
communication.cpp
Remove post methods and simplify communication implementation
+2/-572 
post_communication.cpp
New file containing all post communication method implementations
+573/-0 
post_communication.h
New header file with post communication method declarations
+88/-0   
communication.h
Remove post method declarations and simplify communication interface
+1/-77   
evaluator.cpp
Update import to use post_communication header                     
+1/-1     
lower_to_communication.cpp
Update import to use post_communication header                     
+1/-1     
convert_op_to_communication.cpp
Update import to use post_communication header                     
+1/-1     
compiled_kernel.cpp
Update import to use post_communication header                     
+1/-1     
lower.h
Update import to use post_communication header                     
+1/-1     
lower_to_communication.h
Update import to use post_communication header                     
+1/-1     
executor.h
Update import to use post_communication header                     
+1/-1     
Tests
4 files
test_multidevice_communications.cpp
Update import to use post_communication header                     
+1/-1     
test_multidevice_dispatch_combine.cpp
Update import to use post_communication header                     
+1/-1     
test_multidevice_stream_parallel_type.cpp
Update import to use post_communication header                     
+1/-1     
multidevice.h
Update import to use post_communication header                     
+1/-1     
Configuration changes
1 files
CMakeLists.txt
Add new post_communication.cpp to build sources                   
+1/-0     

PR Reviewer Guide

Here are some key observations to aid the review process:

🧪 PR contains tests
⚡ Recommended focus areas for review
Method Declaration Mismatch

The PR removes the declaration of getRootRelativeIndex from communication.h but the implementation in communication.cpp still references it. The method was renamed to getRelativeIndex but the declaration wasn't updated to match.

int64_t getRelativeIndex(DeviceIdxType rank);
Incomplete Code Movement

The getRootRelativeIndex method implementation was removed but not moved to the new post_communication files. This method is still being called by post functions that were moved to post_communication.cpp.

int64_t Communication::getRelativeIndex(DeviceIdxType rank) {
  const Team& team = team();
  auto i = std::find(team.begin(), team.end(), rank);
  NVF_ERROR(i != team.end(), "Unable to find rank ", rank, " in team ", team);
  return std::distance(team.begin(), i);
}

@Priya2698 Priya2698 requested a review from wujingyue February 18, 2026 02:51
@greptile-apps
Copy link
Contributor

greptile-apps bot commented Feb 18, 2026

Greptile Summary

This PR extracts the postSingleCommunication functions (for both Communication and P2PCommunication) along with their helper functions (postBroadcast, postGather, postAllgather, postScatter, postReduce, postAllreduce, postReduceScatter, postSendRecv, postAllToAll, postSend, postRecv, and utilities like assertBufferCount, assertBuffersHaveSameSize, viewAsCompact, doLocalCopy, getInitialValue) from communication.cpp/.h into new files post_communication.cpp/.h.

Additionally, the free function getRelativeIndex + member function getRootRelativeIndex pair is consolidated into a single member function Communication::getRelativeIndex, and unused includes (type.h, visibility.h) are removed from communication.h.

  • The core refactoring (new post_communication.cpp/.h files, code removal from communication.cpp/.h) is clean and preserves behavior.
  • The default case in P2PCommunicationType operator<< was removed. Since P2PCommunicationType is an unscoped enum class with only two values (SEND, RECV), the compiler's -Wswitch warning covers exhaustiveness, so this is acceptable but removes an explicit error message for corrupt enum values.
  • A minor formatting change: the blank line between isReduction's closing brace and } // namespace in communication.cpp was removed.

Confidence Score: 4/5

  • This is a mechanical refactoring that moves code between files without logic changes; safe to merge after addressing existing review feedback.
  • The PR is a straightforward code extraction/refactoring. The moved code in post_communication.cpp is functionally identical to the original, with only the expected API rename from getRootRelativeIndex to getRelativeIndex. The removal of the default case in P2PCommunicationType operator<< is minor. Previous review threads have flagged the main concerns (unnecessary include widening, duplicated getRelativeIndex).
  • csrc/multidevice/communication.cpp (minor formatting change, removed blank line before namespace close)

Important Files Changed

Filename Overview
csrc/multidevice/post_communication.cpp New file containing all postSingleCommunication overloads and helper functions (postBroadcast, postGather, etc.) moved from communication.cpp. Code is a straightforward lift-and-shift with getRootRelativeIndex calls updated to getRelativeIndex.
csrc/multidevice/post_communication.h New header declaring postSingleCommunication overloads. Includes communication.h for transitive access to Communication and P2PCommunication types.
csrc/multidevice/communication.cpp Removed postSingleCommunication and all its helpers. Converted free-function getRelativeIndex to member function Communication::getRelativeIndex. Removed default case from P2PCommunicationType operator<< and blank line before } // namespace.
csrc/multidevice/communication.h Removed postSingleCommunication declarations and unused includes (type.h, visibility.h). Renamed getRootRelativeIndex to getRelativeIndex to reflect its broader usage.
CMakeLists.txt Added post_communication.cpp to the build sources list.
csrc/host_ir/evaluator.cpp Switched include from communication.h to post_communication.h. This file genuinely uses postSingleCommunication.
tests/cpp/test_multidevice_communications.cpp Switched include from communication.h to post_communication.h. Genuinely uses postSingleCommunication.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    subgraph Before["Before: communication.h / communication.cpp"]
        CH_OLD["communication.h\n- Communication class\n- P2PCommunication class\n- MoeDispatch / MoeCombine\n- postSingleCommunication decls"]
        CC_OLD["communication.cpp\n- Communication impl\n- P2PCommunication impl\n- MoeDispatch / MoeCombine impl\n- getRelativeIndex (free fn)\n- getRootRelativeIndex (member)\n- postBroadcast, postGather, ...\n- postSingleCommunication impls"]
    end

    subgraph After["After: Split into two file pairs"]
        CH_NEW["communication.h\n- Communication class\n  (with getRelativeIndex member)\n- P2PCommunication class\n- MoeDispatch / MoeCombine"]
        CC_NEW["communication.cpp\n- Communication impl\n- P2PCommunication impl\n- MoeDispatch / MoeCombine impl\n- getRelativeIndex (member)"]
        PH_NEW["post_communication.h\n- #include communication.h\n- postSingleCommunication decls"]
        PC_NEW["post_communication.cpp\n- postBroadcast, postGather, ...\n- postSingleCommunication impls\n- Helper utilities"]
    end

    PH_NEW -->|includes| CH_NEW
    PC_NEW -->|includes| PH_NEW
Loading

Last reviewed commit: 12c70d4

Copy link
Contributor

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

5 files reviewed, 1 comment

Edit Code Review Agent Settings | Greptile

Copy link
Collaborator

@wujingyue wujingyue left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like separating them to different files but I'd do it slightly differently: communication_nodes.h => communication.h because of class Communication (https://google.github.io/styleguide/cppguide.html#File_Names); communication.h => post_communication.h

@Priya2698
Copy link
Collaborator Author

!test

Copy link
Contributor

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

17 files reviewed, 2 comments

Edit Code Review Agent Settings | Greptile

@Priya2698
Copy link
Collaborator Author

!test

Copy link
Contributor

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

16 files reviewed, 1 comment

Edit Code Review Agent Settings | Greptile

@Priya2698
Copy link
Collaborator Author

!test

Copy link
Contributor

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

16 files reviewed, no comments

Edit Code Review Agent Settings | Greptile

int64_t Communication::getRootRelativeIndex(DeviceIdxType root_val) {
return getRelativeIndex(team(), root_val);
int64_t Communication::getRelativeIndex(DeviceIdxType rank) {
const Team& team = team();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I slightly prefer getRelativeIndex(team, rank) than Communication::getRelativeIndex even though it's a bit more typing. Communication::getRelativeIndex doesn't use any private methods/fields of Communication, so making it a method of Communication is an overkill.

That said, I understood this PR didn't make anything worse. So LGTM as is.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, will address it separately.

@Priya2698 Priya2698 merged commit 64e66ae into main Feb 18, 2026
53 of 55 checks passed
@Priya2698 Priya2698 deleted the pm/refactor_comm branch February 18, 2026 05:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants