Immiscible Diffusion: More aligned noise-image assignment #1260

Koratahiu wants to merge 4 commits into Nerogar:master
Conversation
Interestingly, this just came out and is very similar: https://junwankimm.github.io/CSFM/
This sounds similar to Minibatch Optimal Transport for flow models: https://arxiv.org/pdf/2302.00482. I have experimented with it in OneTrainer, but couldn't find a benefit for finetuning. I suspect this only helps the model converge early in pre-training. Minibatch Optimal Transport relies on quite a large batch (despite the name), so I also tried oversampling, as your PR does, to make it work with small batch sizes. I couldn't find a benefit for that in finetuning either. Do you think my conclusions apply to this PR as well, or is my analogy not correct?
This pull request implements the Immiscible Diffusion strategy, proposed by the paper:
"Improved Immiscible Diffusion: Accelerate Diffusion Training by Reducing Its Miscibility"
This method optimizes the pairing between training data and noise, effectively "straightening" the diffusion trajectories. By ensuring each image is paired with noise that is mathematically close to it, the model learns a more direct mapping, leading to faster convergence and better sample quality.
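The batch-level pairing described above can be sketched with numpy/scipy. This is an illustrative sketch, not the PR's actual code: it computes pairwise L2 distances between the flattened images and noise tensors of one batch, then reorders the noise with a minimum-cost assignment so each image trains against nearby noise.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def assign_noise(images: np.ndarray, noise: np.ndarray) -> np.ndarray:
    """Reorder a batch of noise tensors so each image is paired with close noise.

    images, noise: arrays of identical shape (batch, ...).
    Returns the noise batch permuted by a minimum-cost (squared L2) assignment.
    """
    b = images.shape[0]
    flat_img = images.reshape(b, -1)
    flat_noise = noise.reshape(b, -1)
    # (b, b) cost matrix of pairwise squared L2 distances.
    cost = ((flat_img[:, None, :] - flat_noise[None, :, :]) ** 2).sum(axis=-1)
    # Hungarian algorithm: globally optimal image-to-noise pairing.
    _, col = linear_sum_assignment(cost)
    return noise[col]
```

Because the identity pairing is always a feasible assignment, the total image-to-noise distance after this reordering can never exceed that of random pairing.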
The new "Immiscible Noise" setting (Noise Oversampling) is available under the noise configuration settings. Set the Noise Oversampling setting to 64, 128, 256, etc. (higher values improve the pairing quality but may increase latency at larger batch sizes).

The Key Difference: Random vs. Optimized Noise Pairing
Standard Diffusion (Random Pairing):

effective_noise = random_gaussian_noise()

Immiscible Diffusion (Optimal/Nearest Pairing):

effective_noise = select_best_noise(source_tensor, candidates)

Noise Oversampling Strategy
Generates k candidate noise tensors for every image in the batch.
Selects the closest noise from the k candidates.
The default value of k is 64.

This implementation significantly reduces the complexity of the ODE/SDE trajectories that the model must learn. By making the diffusion paths "immiscible" (non-crossing), the model can achieve higher quality results in fewer training steps and produce sharper images during inference.
Implementation details
Sources
v1 paper: Immiscible Diffusion: Accelerating Diffusion Training with Noise Assignment
v2 paper: Improved Immiscible Diffusion: Accelerate Diffusion Training by Reducing Its Miscibility
Official repo: https://github.com/yhli123/Immiscible-Diffusion