[Fix] propagate init_function errors from all MPI ranks in block allocation mode by jan-janssen · Pull Request #1023 · pyiron/executorlib

jan-janssen · 2026-06-19T04:29:40Z

Summary

In interactive_parallel.py, the init_function branch only reported errors from MPI rank 0; failures on non-zero ranks were silently swallowed, leaving those ranks with uninitialised memory. Subsequent function calls on the affected ranks would then receive wrong or missing kwargs injected from memory.
The fix gathers errors from all ranks via MPI.COMM_WORLD.gather before rank 0 sends the success/error response to the scheduler — mirroring how function-execution results are already gathered. This also acts as an implicit barrier so the scheduler cannot dispatch the next task until every rank has finished initialising.
Adds test_internal_memory_mpi to cover block_allocation + cores=2 + init_function, a combination that had zero test coverage.
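
The capture-then-report pattern described above can be sketched in plain Python. This is only an illustration, not the code in interactive_parallel.py: the helper names `run_init_on_rank` and `first_error` are hypothetical, and the two ranks are simulated in a single process so the snippet runs without MPI. With mpi4py, the list of errors would presumably come from `MPI.COMM_WORLD.gather(init_error, root=0)` instead.

```python
from typing import Callable, Optional


def run_init_on_rank(init_fn: Callable[[], dict]) -> tuple[dict, Optional[Exception]]:
    """Run the init function on one rank and capture any exception instead of raising."""
    try:
        return init_fn(), None
    except Exception as error:  # broad on purpose: every failure must reach rank 0
        return {}, error


def first_error(gathered: list[Optional[Exception]]) -> Optional[Exception]:
    """Return the first non-None entry, as rank 0 would after the gather."""
    return next((err for err in gathered if err is not None), None)


def failing_init() -> dict:
    # Hypothetical init function that fails on a non-zero rank.
    raise ValueError("init failed on rank 1")


# Simulated two-rank run: rank 0 succeeds, rank 1 fails.
results = [run_init_on_rank(lambda: {"memory": 1}), run_init_on_rank(failing_init)]
gathered_errors = [error for _, error in results]
reported = first_error(gathered_errors)
```

Because each rank returns its error rather than raising it, every rank still reaches the collective call, and rank 0 cannot complete the gather until all ranks have contributed their result.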

Test plan

Existing test test_internal_memory (cores=1) continues to pass
New test test_internal_memory_mpi (cores=2) passes and verifies that both MPI ranks receive the memory value set by the init function
Full test suite green

🤖 Generated with Claude Code

Summary by CodeRabbit

Release Notes

Bug Fixes
- Enhanced error handling during initialization to capture errors across distributed execution ranks and provide clearer error reporting.
Tests
- Added test coverage for memory allocation behavior in parallel execution environments.

…cation mode

In interactive_parallel.py the init branch was only propagating errors from rank 0; failures on non-zero ranks were silently swallowed, leaving those ranks with uninitialised memory. Subsequent function calls on the affected ranks would then receive wrong or missing kwargs.

The fix mirrors the existing function-execution path: after each rank runs call_funct for init, all errors are gathered to rank 0 via MPI.COMM_WORLD.gather before the success/error response is sent back to the scheduler. This also acts as an implicit barrier so the scheduler cannot dispatch the next task until every rank has finished init.

Adds a test (test_internal_memory_mpi) that exercises block allocation with cores=2 and an init_function, a combination that had zero coverage.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

coderabbitai · 2026-06-19T04:29:54Z

Warning

Review limit reached

@jan-janssen, we couldn't start this review because you've reached your PR review rate limit.


ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 911eeadf-d364-4fb7-bc73-75d21bef6094

📥 Commits

Reviewing files that changed from the base of the PR and between b173e16 and 7f56240.

📒 Files selected for processing (1)

src/executorlib/backend/interactive_parallel.py

📝 Walkthrough

The "init" handling in interactive_parallel.py is refactored to catch exceptions into init_error, gather that variable across all MPI ranks, and have rank zero select and forward the first non-None error via ZMQ. A new unit test (test_internal_memory_mpi) validates init_function memory sharing with cores=2 using mpi4py.

Changes

MPI Init Error Propagation

| Layer / File(s) | Summary |
| --- | --- |
| Gather init errors across MPI ranks and report on rank zero: `src/executorlib/backend/interactive_parallel.py`, `tests/unit/standalone/interactive/test_spawner.py` | The `"init"` path now catches exceptions into `init_error`, gathers it across ranks (or wraps it in a list for single-rank runs), and rank zero picks the first non-`None` error to send via ZMQ and write to the error file. The new `test_internal_memory_mpi` test, gated on `mpi4py`, verifies that `init_function` with `cores=2` returns two matching NumPy arrays. |
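
The "gather or wrap in a list" step in the summary above might look roughly like the following. This is a sketch under stated assumptions rather than the PR's actual helper: `gather_init_errors` and `FakeComm` are hypothetical names, and `comm` is assumed to expose the same `Get_size()` and `gather(obj, root=0)` methods as mpi4py's `MPI.COMM_WORLD`. A stand-in communicator is used so the example runs without an MPI launcher.

```python
from typing import Any, Optional


def gather_init_errors(init_error: Optional[Exception], comm: Any = None) -> Optional[list]:
    """Collect each rank's init error on rank 0.

    `comm` is assumed to behave like mpi4py's MPI.COMM_WORLD;
    None stands for a single-rank run without MPI.
    """
    if comm is not None and comm.Get_size() > 1:
        # Collective call: the root receives the full list, other ranks receive None.
        return comm.gather(init_error, root=0)
    # Single rank: wrap the value so the caller can treat both cases the same way.
    return [init_error]


class FakeComm:
    """Hypothetical stand-in for a two-rank communicator, seen from rank 0."""

    def __init__(self, other_rank_error: Optional[Exception]):
        self._other_rank_error = other_rank_error

    def Get_size(self) -> int:
        return 2

    def gather(self, obj: Any, root: int = 0) -> list:
        return [obj, self._other_rank_error]


single = gather_init_errors(None)
multi = gather_init_errors(None, comm=FakeComm(RuntimeError("rank 1 failed")))
```

Returning a list in both cases lets the reporting code on rank zero scan for the first non-`None` entry without branching on the number of ranks.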

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Possibly related PRs

pyiron/executorlib#804: Introduced the "fail safe init function" logic in the same "init" handling path of interactive_parallel.py that this PR extends with cross-rank error gathering.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 25.00%, which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped because CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title directly and specifically describes the main change: propagating init_function errors from all MPI ranks in block allocation mode, which matches the core objective of the PR. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

codecov · 2026-06-19T04:33:29Z

Codecov Report

❌ Patch coverage is 0% with 17 lines in your changes missing coverage. Please review.
✅ Project coverage is 93.84%. Comparing base (640e440) to head (7f56240).
⚠️ Report is 1 commit behind head on main.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| src/executorlib/backend/interactive_parallel.py | 0.00% | 17 Missing ⚠️ |

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #1023      +/-   ##
==========================================
- Coverage   94.24%   93.84%   -0.40%     
==========================================
  Files          39       39              
  Lines        2119     2128       +9     
==========================================
  Hits         1997     1997              
- Misses        122      131       +9

jan-janssen and others added 2 commits June 19, 2026 07:01

[Fix] extract _execute_init_dict to satisfy PLR0915 (too many stateme…

efe81db

…nts) Moves the init-function handling out of main() into a private helper so the statement count stays within the ruff/pylint PLR0915 limit of 50. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

[Docs] add docstring to _execute_init_dict

7f56240

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

jan-janssen marked this pull request as draft June 19, 2026 05:23

jan-janssen closed this Jun 19, 2026

jan-janssen deleted the fix/init-function-mpi-parallel-block-allocation branch June 19, 2026 08:25

jan-janssen wants to merge 3 commits into main from fix/init-function-mpi-parallel-block-allocation
