
Cosmos Transfer2.5 inference pipeline: general/{seg, depth, blur, edge} #13066

Merged
yiyixuxu merged 39 commits into huggingface:main from miguelmartin75:cosmos/transfer2.5
Feb 12, 2026

Conversation

@miguelmartin75
Contributor

@miguelmartin75 miguelmartin75 commented Feb 2, 2026

What does this PR do?

This PR introduces the Cosmos Transfer2.5 inference pipeline, which extends the existing code in transformer_cosmos.py and adds a new controlnet class for Cosmos. The conversion script is also updated to convert the new checkpoints.

I've intentionally split the controlnet from the base predict model to match the rest of the diffusers codebase. To do this, I had to duplicate some layers/weights from the base model (those relating to the patch and timestep embeddings), but I believe SD3 does the same.
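
To illustrate the split described above, here is a toy sketch in plain Python. It is not the diffusers implementation: `ToyBaseTransformer`, `ToyControlNet`, and their fields are hypothetical names, and the "add one residual per block" wiring is the generic ControlNet pattern, assumed here rather than taken from this PR. The point is only that each module carries its own copy of the embedding weight, so the controlnet can be loaded and swapped independently of the base model.

```python
from dataclasses import dataclass


@dataclass
class ToyControlNet:
    # Hypothetical stand-in: its own copy of the "patch embedding" weight,
    # duplicated from the base model so the controlnet is self-contained.
    patch_scale: float
    block_gains: list

    def __call__(self, control_signal):
        embedded = [self.patch_scale * x for x in control_signal]
        # One residual per controlled block (assumed generic ControlNet pattern).
        return [[gain * e for e in embedded] for gain in self.block_gains]


@dataclass
class ToyBaseTransformer:
    patch_scale: float
    block_weights: list

    def __call__(self, latents, block_residuals=None):
        hidden = [self.patch_scale * x for x in latents]
        for index, weight in enumerate(self.block_weights):
            hidden = [weight * h for h in hidden]
            # Add the controlnet residual for this block, if one was provided.
            if block_residuals is not None and index < len(block_residuals):
                hidden = [h + r for h, r in zip(hidden, block_residuals[index])]
        return hidden


base = ToyBaseTransformer(patch_scale=2.0, block_weights=[1.0, 1.0])
controlnet = ToyControlNet(patch_scale=2.0, block_gains=[0.5])

residuals = controlnet([1.0, 1.0])
uncontrolled = base([1.0, 2.0])
controlled = base([1.0, 2.0], block_residuals=residuals)
```

Because the base model only consumes a list of residuals, any controlnet producing residuals of the right shape can be paired with it in this toy setup.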

Similar to predict2.5, I have added documentation and unit tests.

Additional PRs will be submitted for the following features (in order of priority):

  1. Auto-regressive (AR) inference support. Currently, inference can only be applied to a fixed number of frames, whereas cosmos-transfer2.5 performs AR inference.
  2. Additional transfer2.5 variants:
    • multi-control (multiple controlnets at once)
    • auto/multiview
  3. Image reference
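
As a rough illustration of item 1, the sketch below splits a long video into fixed-length windows, where each window after the first starts on the last frame(s) of the previous one so it can be conditioned on them. This is an assumption about how auto-regressive inference could be scheduled, not the cosmos-transfer2.5 algorithm. The helper name `autoregressive_windows` and the `overlap` parameter are hypothetical, and the default window of 93 frames is borrowed from the 93-frame example mentioned in this description.

```python
def autoregressive_windows(num_frames, window=93, overlap=1):
    """Return (start, end) frame ranges covering num_frames.

    Each window is at most `window` frames long; every window after the
    first re-uses the last `overlap` frames of the previous window.
    """
    if num_frames <= 0:
        raise ValueError("num_frames must be positive")
    if not 0 <= overlap < window:
        raise ValueError("overlap must satisfy 0 <= overlap < window")

    windows = []
    start = 0
    while True:
        end = min(start + window, num_frames)
        windows.append((start, end))
        if end >= num_frames:
            break
        # Step back by `overlap` frames so the next window can be
        # conditioned on the tail of the current one.
        start = end - overlap
    return windows


short_clip = autoregressive_windows(93)
long_clip = autoregressive_windows(200, window=93, overlap=1)
```

With a single fixed-length window (the case this PR supports), the schedule degenerates to one range covering the whole clip.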

In addition, unfortunately, the guardrails safety model is too aggressive: it currently flags the examples we have for cosmos-transfer2.5 as "not safe" (e.g. the edge example for 93 frames is flagged). This guardrail model needs to be updated, but that work is roughly orthogonal to this PR.

Who can review?

Core library:

@miguelmartin75 miguelmartin75 changed the title Cosmos/transfer2.5 Cosmos Transfer2.5 inference pipeline: general/{seg, depth, blur, edge} Feb 2, 2026
Collaborator

@yiyixuxu yiyixuxu left a comment


Thanks for the PR! The overall structure looks good. I left some minor comments.

One question before I can review further: Are the base transformer weights the same across the different control variants?

This helps us understand whether splitting the controlnet from the transformer makes sense (i.e., can users mix and match?), and also helps me understand whether the controlnet is required for this pipeline, etc.

@miguelmartin75
Contributor Author

miguelmartin75 commented Feb 6, 2026

Addressed your comment about transfer2_5_forward + updated the example code

Are the base transformer weights the same across the different control variants? ... can users mix and match?

Yes, mixing and matching controlnets is possible, but only if an image context reference is not included (see here; including an image reference is not currently supported in this PR). Additionally, using multiple controlnets at once ("multicontrol") will be possible (any base transformer can be used; cosmos-transfer2.5 always picks "edge"), but I will need to submit a separate PR for this. Note that multicontrol does not support an image reference.

To be more specific, the base transformer weights are almost the same. The difference lies in the weights of the cross-attention layers for an image reference (see here), i.e. attn2 in diffusers-land, across all blocks in the base transformer. Without an image reference, all base transformers are functionally the same; in this case the img_context tensor is torch.zeros. I also qualitatively verified all pairs of base transformer + controlnet as a sanity check, and they appear to output the same results.
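
One way to see why differing image cross-attention weights stop mattering when the context is all zeros: in a toy cross-attention layer with bias-free projections, zero context gives zero values, so the layer outputs zeros regardless of its weights. The sketch below shows this in plain Python. It is an illustration under stated assumptions, not the diffusers code: `toy_cross_attention` and `random_weights` are hypothetical helpers, and whether the real attn2 projections are bias-free is an assumption.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    peak = max(row)
    exps = [math.exp(value - peak) for value in row]
    total = sum(exps)
    return [e / total for e in exps]


def toy_cross_attention(hidden, context, weights):
    """Single-head cross-attention with bias-free projections (assumption)."""
    w_q, w_k, w_v, w_o = weights
    q, k, v = matmul(hidden, w_q), matmul(context, w_k), matmul(context, w_v)
    scale = 1.0 / math.sqrt(len(q[0]))
    scores = [[scale * sum(a * b for a, b in zip(qi, kj)) for kj in k] for qi in q]
    attended = matmul([softmax(row) for row in scores], v)
    return matmul(attended, w_o)


def random_weights(dim, seed):
    """Four random dim x dim matrices: query, key, value, output."""
    rng = random.Random(seed)
    return tuple(
        [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(dim)]
        for _ in range(4)
    )


dim = 4
hidden = [[0.5, -1.0, 2.0, 0.1], [1.5, 0.3, -0.7, 0.9]]
zero_context = [[0.0] * dim for _ in range(3)]  # stands in for img_context = zeros
real_context = [[1.0, 0.5, -0.5, 2.0] for _ in range(3)]

# Two "base transformer variants" that differ only in these weights.
variant_a = random_weights(dim, seed=0)
variant_b = random_weights(dim, seed=1)

zero_out_a = toy_cross_attention(hidden, zero_context, variant_a)
zero_out_b = toy_cross_attention(hidden, zero_context, variant_b)
```

With `real_context` in place of `zero_context`, the two variants produce different outputs, which is consistent with mix and match only being safe when no image reference is supplied.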

I will need to document this when I have a PR up for the image reference feature, i.e. (3) in my description.

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Collaborator

@yiyixuxu yiyixuxu left a comment


thanks, I left a few more comments

@miguelmartin75 miguelmartin75 force-pushed the cosmos/transfer2.5 branch 2 times, most recently from 442e8e4 to ddec8fb on February 11, 2026 23:55
Collaborator

@yiyixuxu yiyixuxu left a comment


thanks!

@yiyixuxu yiyixuxu merged commit a181616 into huggingface:main Feb 12, 2026
25 of 28 checks passed
@sayakpaul
Member

@miguelmartin75 would it make sense to update https://huggingface.co/docs/diffusers/main/en/api/pipelines/cosmos docs as well with transfer?

@miguelmartin75
Contributor Author

miguelmartin75 commented Feb 17, 2026

Yes, thanks for the call out on that @sayakpaul - maybe I can do this in #13114


4 participants