
Deepcopy env_extras dicts for each sample when preparing generator input #1189

Merged
SumanthRH merged 2 commits into NovaSky-AI:main from ebronstein:deepcopy_env_extras on Feb 26, 2026

Deepcopy env_extras dicts for each sample when preparing generator input#1189
SumanthRH merged 2 commits intoNovaSky-AI:mainfrom
ebronstein:deepcopy_env_extras

Conversation

@ebronstein (Contributor) commented Feb 20, 2026

When preparing GeneratorInput, the same env_extras dictionary is reused for every sample of a given prompt. Because the dictionaries are not copied, mutating the dictionary for one sample silently changes it for all other samples in that group (i.e., all samples for the same prompt).

This PR makes a deep copy of the env_extras for each sample.
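A minimal sketch of the aliasing issue and the fix (the function and variable names here are illustrative, not the actual SkyRL code):

```python
import copy


def prepare_env_extras(prompts, n_samples_per_prompt):
    # Buggy pattern: every sample for a prompt shares one dict object.
    shared = [p["env_extras"] for p in prompts for _ in range(n_samples_per_prompt)]
    # Fixed pattern: each sample gets an independent deep copy.
    independent = [
        copy.deepcopy(p["env_extras"])
        for p in prompts
        for _ in range(n_samples_per_prompt)
    ]
    return shared, independent


prompts = [{"env_extras": {"seed": 0}}]
shared, independent = prepare_env_extras(prompts, n_samples_per_prompt=2)

shared[0]["seed"] = 99          # mutating one sample...
print(shared[1]["seed"])        # ...leaks into its sibling: 99

independent[0]["seed"] = 99
print(independent[1]["seed"])   # deep copies stay isolated: 0
```

Note that copy.deepcopy (rather than a shallow dict copy) also isolates any nested lists or dicts inside env_extras, which a shallow copy would still share.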



@devin-ai-integration Bot left a comment


✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no potential bugs to report.

View in Devin Review to see 1 additional finding.



@gemini-code-assist Bot left a comment


Code Review

The pull request addresses a potential bug where env_extras dictionaries were not deep-copied, leading to unintended modifications across samples. The change correctly applies copy.deepcopy so that each sample receives an independent copy of env_extras, preventing side effects. This is a good fix for data integrity.

Comment thread skyrl/train/generators/utils.py Outdated
Comment on lines +383 to +386
env_extras = [
    copy.deepcopy(prompt["env_extras"])
    for prompt in prompts
    for _ in range(n_samples_per_prompt)
]

critical

This change correctly uses copy.deepcopy to ensure that each env_extras dictionary is an independent copy. This prevents unintended modifications to other samples when one sample's env_extras is altered, addressing the bug described in the PR description. This is a critical fix for data integrity.



@devin-ai-integration Bot left a comment


Devin Review found 1 new potential issue.

🐛 1 issue in files not directly in the diff

🐛 Deepcopy fix not applied to duplicate prepare_generator_input in skyrl-train package (skyrl-train/skyrl_train/generators/utils.py:382)

The env_extras shared-reference bug that this PR fixes in skyrl/train/generators/utils.py also exists in the duplicate file skyrl-train/skyrl_train/generators/utils.py, which was not updated. The skyrl-train copy is actively used by skyrl_train/trainer.py:211, skyrl_train/evaluate.py:64, skyrl_train/fully_async_trainer.py:542, and other callers.

Root Cause

The repository maintains two near-identical copies of prepare_generator_input: one in skyrl/train/generators/utils.py (fixed by this PR) and one in skyrl-train/skyrl_train/generators/utils.py (not fixed). The only differences between the files are import paths and this copy.deepcopy change.

At skyrl-train/skyrl_train/generators/utils.py:382, the code still reads:

env_extras = [prompt["env_extras"] for prompt in prompts for _ in range(n_samples_per_prompt)]

This means all n_samples_per_prompt entries for the same prompt share the same dictionary reference. If any downstream code mutates one sample's env_extras, it silently corrupts all other samples for that prompt.

Impact: Users of the skyrl-train package (which includes the main trainer, evaluator, and async trainer) still have the shared-reference bug that this PR intends to fix.
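The shared-reference behavior described above is easy to verify in isolation (a hypothetical stand-alone snippet, not the package code):

```python
prompt = {"env_extras": {"difficulty": "easy"}}
n_samples_per_prompt = 3

# Same pattern as the unfixed comprehension: no copy is made.
env_extras = [prompt["env_extras"] for _ in range(n_samples_per_prompt)]

# All entries are literally the same object.
print(all(e is env_extras[0] for e in env_extras))  # True

env_extras[2]["difficulty"] = "hard"
print(env_extras[0]["difficulty"])  # "hard" -- one mutation corrupts every sibling
```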

View 2 additional findings in Devin Review.


@SumanthRH SumanthRH self-assigned this Feb 26, 2026

@SumanthRH (Member) left a comment


Thanks

@SumanthRH SumanthRH merged commit 5acb087 into NovaSky-AI:main Feb 26, 2026
4 of 5 checks passed