fix(ompi): actually call MPI_Finalize in FinalizeImpl#10
Conversation
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 4ec2514ac5
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| if (!finalized) { | ||
| INFINI_CHECK_MPI(MPI_Finalize()); |
There was a problem hiding this comment.
Avoid finalizing MPI when this library did not initialize it
FinalizeImpl<BackendType::kOmpi>::Apply now always calls MPI_Finalize() whenever MPI is not yet finalized, but InitImpl explicitly skips MPI_Init_thread when MPI was already initialized by the host process (src/ompi/impl/init.h, if (!initialized)). In that common embedding scenario (framework/app owns MPI lifecycle), this change tears down global MPI ownership from inside InfiniCCL, so any subsequent MPI call in the host can fail or abort. Finalization should be gated by whether InfiniCCL performed initialization, not just by MPI_Finalized().
Useful? React with 👍 / 👎.
MPI_Finalize in FinalizeImpl
Summary
This PR fixes an issue in
FinalizeImpl<BackendType::kOmpi, ...>::Applywhere the MPI runtime was never actually shut down.
Previously, the implementation called
MPI_Finalized(&finalized)to querywhether MPI had already been finalized, but never invoked
MPI_Finalize()itself. As a result:
MPI_Init_threadallocations were not properly releasedstatus when the process tore down
Changes
src/ompi/impl/finalize.h, after queryingMPI_Finalized, invokeMPI_Finalize()when MPI has been initialized but not yet finalizedINFINI_CHECK_MPIerror-propagation pattern sothat a non-
MPI_SUCCESSreturn is reported through the standardOpenMPI check macro
nested call — the guard makes the call idempotent
Known Issues & Future Work
CommInitAllallocates aCommunicatorvianew, but thecurrent
CommDestroyonly clears the inner backend instance and doesnot
deletethe outer object. That separate lifetime issue is out ofscope for this PR.
Test Environment
-DWITH_NVIDIA=ON -DWITH_OMPI=ONVerified that the
AllReduceexample exits cleanly with no MPI shutdownwarnings; without this fix the same binary leaves MPI in an unfinalized
state at exit.
all_reduce.log
all_gather.log