Immiscible Diffusion: More aligned noise-image assignment #1260

Koratahiu wants to merge 4 commits into Nerogar:master
Conversation
Interestingly, this just came out and is very similar: https://junwankimm.github.io/CSFM/
This sounds similar to Minibatch Optimal Transport for flow models: https://arxiv.org/pdf/2302.00482. I have experimented with it in OneTrainer, but couldn't find a benefit for finetuning. I suspect this only helps the model converge early in pre-training. Minibatch Optimal Transport relies on quite a large batch (despite the name), so I also tried oversampling, as your PR does, to make it work with small batch sizes. I couldn't find a benefit for that in finetuning either. Do you think my conclusions apply to this PR as well, or is my analogy not correct?
This pull request implements the Immiscible Diffusion strategy, proposed by the paper:
"Improved Immiscible Diffusion: Accelerate Diffusion Training by Reducing Its Miscibility"
This method optimizes the pairing between training data and noise, effectively "straightening" the diffusion trajectories. By ensuring each image is paired with noise that is mathematically close to it, the model learns a more direct mapping, leading to faster convergence and better sample quality.
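The batch-level pairing described above can be sketched with numpy/scipy. This is an illustrative sketch, not the PR's actual code: it computes pairwise L2 distances between the flattened images and noise tensors of one batch, then reorders the noise with a minimum-cost assignment so each image trains against nearby noise.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def assign_noise(images: np.ndarray, noise: np.ndarray) -> np.ndarray:
    """Reorder a batch of noise tensors so each image is paired with close noise.

    images, noise: arrays of identical shape (batch, ...).
    Returns the noise batch permuted by a minimum-cost (squared L2) assignment.
    """
    b = images.shape[0]
    flat_img = images.reshape(b, -1)
    flat_noise = noise.reshape(b, -1)
    # (b, b) cost matrix of pairwise squared L2 distances.
    cost = ((flat_img[:, None, :] - flat_noise[None, :, :]) ** 2).sum(axis=-1)
    # Hungarian algorithm: globally optimal image-to-noise pairing.
    _, col = linear_sum_assignment(cost)
    return noise[col]
```

Because the identity pairing is always a feasible assignment, the total image-to-noise distance after this reordering can never exceed that of random pairing.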
The new "Immiscible Noise" setting (Noise Oversampling) is available under the noise configuration settings. Set the Noise Oversampling setting to 64, 128, 256, etc. (higher values improve the pairing quality but may increase latency at larger batch sizes).

The Key Difference: Random vs. Optimized Noise Pairing
Standard Diffusion (Random Pairing):

effective_noise = random_gaussian_noise()

Immiscible Diffusion (Optimal/Nearest Pairing):

effective_noise = select_best_noise(source_tensor, candidates)

Noise Oversampling Strategy
Generates k candidate noise tensors for every image in the batch.
Selects the closest noise from the k candidates.
The default value of k is 64.

This implementation significantly reduces the complexity of the ODE/SDE trajectories that the model must learn. By making the diffusion paths "immiscible" (non-crossing), the model can achieve higher quality results in fewer training steps and produce sharper images during inference.
Implementation details
Sources
v1 paper: Immiscible Diffusion: Accelerating Diffusion Training with Noise Assignment
v2 paper: Improved Immiscible Diffusion: Accelerate Diffusion Training by Reducing Its Miscibility
Official repo: https://github.com/yhli123/Immiscible-Diffusion