Fix load_weights with strict=False to filter extra weights before update #3214

Merged
angeloskath merged 3 commits into ml-explore:main from gmin7:mdimarco/lazy_load_idx_fix
Mar 10, 2026

@gmin7
Contributor

@gmin7 gmin7 commented Mar 6, 2026

Proposed changes

When loading weights from a checkpoint that contains more layers than the model (e.g., loading a full
model's safetensors into a model instantiated with num_hidden_layers=1), load_weights(..., strict=False)
raises IndexError: list index out of range. This happens because indexed keys such as layers.1.weight
pass through tree_unflatten, and Module.update then tries to index into the model's layers list at
positions that don't exist.
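
The failure mode described above can be reproduced without MLX. The following is a minimal sketch using toy stand-ins: `toy_update` and `load_unfiltered` are hypothetical helpers written for illustration, not MLX functions, and the nested dict/list only mimics how a module with a one-element layers list might be addressed by dotted keys.

```python
# Toy stand-ins, not MLX code: a "model" with one layer and a checkpoint with two.
model_params = {"layers": [{"weight": 0.0}]}
checkpoint = {"layers.0.weight": 1.0, "layers.1.weight": 2.0}


def toy_update(params, dotted_key, value):
    """Walk a dotted key into nested dicts/lists and assign the leaf value."""
    *path, leaf = dotted_key.split(".")
    node = params
    for part in path:
        # Numeric path components index into lists, like a model's layer list.
        node = node[int(part)] if isinstance(node, list) else node[part]
    node[leaf] = value


def load_unfiltered(params, weights):
    """Apply every checkpoint entry, including ones the model has no slot for."""
    for key, value in weights.items():
        toy_update(params, key, value)


try:
    load_unfiltered(model_params, checkpoint)
    error = None
except IndexError as exc:
    # "layers.1.weight" indexes position 1 of a one-element list.
    error = exc
```

In this sketch the first entry is applied and the second one fails, which mirrors the reported behavior: the extra key is not ignored, it reaches the indexing step.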

This restores the filtering of weight keys to only those present in the model's parameters when
strict=False, so extra weights are silently dropped before reaching update.
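
The filtering idea can be sketched in plain Python as follows. This is an illustration under assumptions, not the code merged in this PR: `flatten` and `filter_to_model` are hypothetical helpers, and the nested dict/list stands in for a model's parameter tree.

```python
def flatten(tree, prefix=""):
    """Flatten nested dicts/lists into a {dotted_key: leaf} dict."""
    if isinstance(tree, dict):
        items = tree.items()
    elif isinstance(tree, list):
        items = enumerate(tree)
    else:
        return {prefix: tree}
    flat = {}
    for key, value in items:
        child = f"{prefix}.{key}" if prefix else str(key)
        flat.update(flatten(value, child))
    return flat


def filter_to_model(weights, params):
    """Keep only checkpoint entries whose key exists in the model's parameters."""
    known = flatten(params).keys()
    return {k: v for k, v in weights.items() if k in known}


# Toy data: the model has one layer, the checkpoint has extra entries.
model_params = {"layers": [{"weight": 0.0}]}
checkpoint = {
    "layers.0.weight": 1.0,
    "layers.1.weight": 2.0,
    "lm_head.weight": 3.0,
}

kept = filter_to_model(checkpoint, model_params)
```

Filtering before any unflatten or update step means out-of-range indices never reach the code that indexes into the layers list, so the extra entries are dropped rather than raising.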

Checklist

Put an x in the boxes that apply.

  • I have read the CONTRIBUTING document
  • I have run pre-commit run --all-files to format my code / installed pre-commit prior to committing changes
  • I have added tests that prove my fix is effective or that my feature works
  • I have updated the necessary documentation (if needed)

Michelle DiMarco added 3 commits March 6, 2026 09:49
… the weights dict prunes out the unused weight keys to avoid idx error during tree_unflatten
… of bounds, and if so we ignore it on strict=False
Member

@angeloskath angeloskath left a comment

Thanks for the catch and fix!

@angeloskath angeloskath merged commit d2702a4 into ml-explore:main Mar 10, 2026
29 of 32 checks passed