@prishajain1 prishajain1 commented Jan 13, 2026

This PR adds image-to-video (I2V) support for WAN 2.1 and WAN 2.2.

New files:

  • wan_pipeline_i2v_2p1.py : New file for WAN 2.1 Img2Vid
  • wan_pipeline_i2v_2p2.py : New file for WAN 2.2 Img2Vid
  • wan_checkpointer_i2v_2p1.py : Checkpointer for WAN 2.1 I2V
  • wan_checkpointer_i2v_2p2.py : Checkpointer for WAN 2.2 I2V
  • base_wan_i2v_14b.yml : New config file for WAN 2.1 I2V
  • base_wan_i2v_27b.yml : New config file for WAN 2.2 I2V

Files modified:

  • generate_wan.py: Modified to handle the new I2V pipelines
  • wan_pipeline.py: Modified to handle the new I2V pipelines
  • wan_checkpointer.py: Modified to handle the new I2V pipelines
  • embeddings_flax.py: Adds a new class, NNXWanImageEmbedding, that transforms raw image embeddings into the format expected by the transformer
  • transformer_wan.py: Now has an image embedder, an instance of NNXWanImageEmbedding, which is used to process image embeddings
  • attention_flax.py:
    • Handles attention masking explicitly for the image component
    • Adds new image-specific layers to the FlaxWanAttention class
    • Modifies the call function to handle the case where image_embeddings are provided
  • wan_utils.py: Adds the changes needed to load image-specific keys
  • wan_checkpointer_test.py: Adds tests for I2V
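
The attention change is described only in outline above, so the following is a minimal JAX sketch of one way an image branch can sit alongside text cross-attention. It assumes the image tokens get their own key/value projections and that their attention output is summed with the text branch. The function names, the parameter layout (`k_img`, `v_img`), and the shapes are all hypothetical and are not taken from attention_flax.py; the explicit image masking mentioned in the list is left out.

```python
import jax
import jax.numpy as jnp


def attend(q, k, v):
    """Scaled dot-product attention over (batch, tokens, dim) arrays."""
    scale = q.shape[-1] ** -0.5
    weights = jax.nn.softmax(jnp.einsum("bqd,bkd->bqk", q, k) * scale, axis=-1)
    return jnp.einsum("bqk,bkd->bqd", weights, v)


def i2v_cross_attention(params, hidden, text_emb, image_emb=None):
    """Cross-attention with an optional image branch (hypothetical layout)."""
    q = hidden @ params["q"]
    out = attend(q, text_emb @ params["k"], text_emb @ params["v"])
    if image_emb is not None:
        # Image-specific key/value projections, used only on the I2V path.
        k_img = image_emb @ params["k_img"]
        v_img = image_emb @ params["v_img"]
        out = out + attend(q, k_img, v_img)
    return out @ params["out"]


def init_params(key, dim):
    """Random square projection matrices, for illustration only."""
    names = ("q", "k", "v", "k_img", "v_img", "out")
    keys = jax.random.split(key, len(names))
    return {n: jax.random.normal(k, (dim, dim)) * dim ** -0.5 for n, k in zip(names, keys)}


if __name__ == "__main__":
    params = init_params(jax.random.PRNGKey(0), 16)
    hidden = jnp.ones((1, 8, 16))   # video latent tokens
    text = jnp.ones((1, 5, 16))     # text embeddings
    image = jnp.ones((1, 3, 16))    # image embeddings
    print(i2v_cross_attention(params, hidden, text, image).shape)
```

Passing `image_emb=None` falls back to plain text cross-attention, which mirrors the idea that the text-to-video path should be unaffected when no image embeddings are provided.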

Sample runs:

Testing other pipelines:

This implementation currently supports video generation at height = 480, width = 832, with the models Wan-AI/Wan2.1-I2V-14B-480P-Diffusers and Wan-AI/Wan2.2-I2V-A14B-Diffusers.
Optimisations needed to support 720x1280 will follow in upcoming PRs.
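
Until the 720x1280 work lands, a caller may want to reject unsupported requests early. The following is a small sketch of such a guard. The model IDs and resolution come from the note above, but the helper `check_i2v_request` and the constant names are hypothetical and are not part of this PR.

```python
# Values taken from the PR description; the helper itself is hypothetical.
SUPPORTED_I2V_MODELS = {
    "Wan-AI/Wan2.1-I2V-14B-480P-Diffusers",
    "Wan-AI/Wan2.2-I2V-A14B-Diffusers",
}
SUPPORTED_I2V_RESOLUTION = (480, 832)  # (height, width)


def check_i2v_request(model_id: str, height: int, width: int) -> None:
    """Raise ValueError if the request falls outside the supported I2V setup."""
    if model_id not in SUPPORTED_I2V_MODELS:
        raise ValueError(f"Unsupported I2V model: {model_id}")
    if (height, width) != SUPPORTED_I2V_RESOLUTION:
        raise ValueError(
            f"Unsupported I2V resolution: height={height}, width={width}"
        )
```

For example, `check_i2v_request("Wan-AI/Wan2.1-I2V-14B-480P-Diffusers", 480, 832)` returns without error, while a 720x1280 request raises `ValueError`.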

entrpn previously approved these changes Jan 14, 2026
@entrpn entrpn merged commit 6e17c3e into main Jan 15, 2026
3 of 4 checks passed