fix: correct image to image DDIM and TCD by wbruna · Pull Request #1410 · leejet/stable-diffusion.cpp

wbruna · 2026-04-11T19:47:44Z

This supersedes my initial attempt at #665, and has a much better chance of actually being correct 🙂

In a nutshell, there are two issues with DDIM and TCD:

- they assume the initial image is pure noise, and apply a noise scaling which happens to be inversely proportional to the initial sigma value. This factor gets very close to 1 for the typical initial sigma for text to image, but it throws lower noise levels (e.g. in low-strength image to image) completely out of scale;
- they have their own internal sigma schedules, but we rely on the input sigmas vector to set the right values for image to image.

This series removes that initial noise scaling, adapts the samplers to follow the provided sigma vectors, and sets appropriate default sigma schedulers: simple for DDIM, and LCM for TCD. I've tried to keep each commit self-contained, so the series is easier to follow and test.

Fixes #663.

We don't have the noise component isolated during img2img in the sampler code, so move the initial scaling to the latent initialization code. Note that, for normal txt2img, this scale factor is very close to 1 due to the large initial sigma. But it gets larger for small sigmas, e.g. with low-strength img2img.
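To get a feel for the magnitudes involved, here is a minimal sketch of how such a factor behaves. The formula `sqrt(sigma^2 + 1) / sigma` is an assumption about the shape of the first-step scaling (it matches the "close to 1 for large sigma, roughly 1/sigma for small sigma" behavior described above); the function name `initial_noise_scale` is hypothetical and not taken from the codebase.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: assumed form of the first-step scale factor.
// For large sigma it tends to 1; for small sigma it behaves like 1 / sigma.
double initial_noise_scale(double sigma) {
    assert(sigma > 0.0);  // the factor is undefined at sigma == 0
    return std::sqrt(sigma * sigma + 1.0) / sigma;
}
```

Under that assumption, a txt2img-like initial sigma of 14.6 gives a factor of about 1.002, while sigma = 0.5 gives about 2.24 and sigma = 0.1 gives about 10.05, which is the kind of out-of-scale input a low-strength img2img run would hit.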

Also tweaks the criteria to check for the last iteration, to make a follow-up change easier to read.

Instead of the arbitrarily-chosen first alpha_cumprod (which is very close to 1), use a constant 1, which is the value actually used at the start of the cumulative product calculations. This also avoids the need to check for the last iteration when scaling x for the next loop.

Apart from the rounding criteria, Simple is equivalent to the hardcoded DDIM scheduler. So, drop the local compvis_sigmas and alphas_cumprod tables, and recover the alpha_cumprod values from the provided sigmas vector. This partially fixes DDIM behavior with image to image, since we rely on the input sigmas vector to provide the appropriate noise levels. It also allows combining DDIM with different schedulers.
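As a rough illustration of recovering alpha_cumprod from a sigma, the sketch below uses the usual relation sigma = sqrt((1 - alpha_cumprod) / alpha_cumprod), so alpha_cumprod = 1 / (sigma^2 + 1). The `ddim_step` function is the textbook deterministic DDIM update (eta = 0) for a single value in the variance-preserving sample space. It is not the code from this series, and all function names here are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// alpha_cumprod recovered from a sigma; sigma == 0 maps to exactly 1.
double alpha_cumprod_from_sigma(double sigma) {
    assert(sigma >= 0.0);
    return 1.0 / (sigma * sigma + 1.0);
}

// Inverse mapping, valid for alpha_cumprod in (0, 1].
double sigma_from_alpha_cumprod(double alpha_cumprod) {
    assert(alpha_cumprod > 0.0 && alpha_cumprod <= 1.0);
    return std::sqrt((1.0 - alpha_cumprod) / alpha_cumprod);
}

// Textbook DDIM update with eta = 0 for one latent value.
// `eps` is the model's noise prediction at the current sigma.
double ddim_step(double sample, double eps, double sigma, double sigma_next) {
    const double a_t     = alpha_cumprod_from_sigma(sigma);
    const double a_prev  = alpha_cumprod_from_sigma(sigma_next);
    const double pred_x0 = (sample - std::sqrt(1.0 - a_t) * eps) / std::sqrt(a_t);
    return std::sqrt(a_prev) * pred_x0 + std::sqrt(1.0 - a_prev) * eps;
}
```

With this formulation the final step (sigma_next = 0) uses alpha_cumprod = 1 and returns the predicted clean sample, and any monotonic sigma vector can drive the loop, whichever scheduler produced it.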

LCM and TCD timesteps are identical, so adapt the TCD code to use the provided sigmas vector, keeping the internal tables to obtain the timesteps from the sigmas. As with DDIM, this partially fixes TCD for image to image operations. An alternative could be using the input sigmas to obtain the timestep and prev_timestep values, then obtain new sigma and alpha_cumprod values from the tables, keeping most of the code as-is. It'd be more complex, though: input sigmas that happened to be too close to one another could end up rounded to the same timestep value, and it's not clear how to best guard against that.
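The rounding concern in the alternative approach can be sketched as follows. This assumes a scaled-linear beta schedule with the Stable Diffusion 1.x parameters (1000 steps, beta from 0.00085 to 0.012) and a plain nearest-neighbor lookup. Both `make_sigma_table` and `nearest_timestep` are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Builds one sigma per training timestep from a scaled-linear beta schedule.
// The defaults are assumed SD 1.x values, not read from the project.
std::vector<double> make_sigma_table(int n = 1000,
                                     double beta_start = 0.00085,
                                     double beta_end = 0.012) {
    assert(n > 1);
    std::vector<double> sigmas(static_cast<std::size_t>(n));
    double alpha_cumprod = 1.0;  // the cumulative product starts at 1
    for (int t = 0; t < n; ++t) {
        const double frac      = static_cast<double>(t) / (n - 1);
        const double sqrt_beta = std::sqrt(beta_start) +
                                 frac * (std::sqrt(beta_end) - std::sqrt(beta_start));
        alpha_cumprod *= 1.0 - sqrt_beta * sqrt_beta;
        sigmas[static_cast<std::size_t>(t)] =
            std::sqrt((1.0 - alpha_cumprod) / alpha_cumprod);
    }
    return sigmas;
}

// Returns the index of the table entry closest to `sigma` (linear scan).
int nearest_timestep(const std::vector<double>& table, double sigma) {
    assert(!table.empty());
    std::size_t best = 0;
    for (std::size_t t = 1; t < table.size(); ++t) {
        if (std::fabs(table[t] - sigma) < std::fabs(table[best] - sigma)) {
            best = t;
        }
    }
    return static_cast<int>(best);
}
```

Two input sigmas that both fall within half a table spacing of the same entry map to the same timestep, which would make timestep and prev_timestep equal for that step. That is the collision described above.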

As explained in a previous commit, the initial noise scaling has very little effect for text to image; but for image to image, the lower the denoising strength, the stronger it gets, until the model isn't able to compensate for it. I've placed this change at the end of the series to make it easier to test the results.

wbruna added 7 commits April 11, 2026 16:11

refactor: move DDIM/TCD scaling to the end of the loop

6f7bfa3

Also tweaks the criteria to check for the last iteration, to make a follow-up change easier to read.

refactor: fold DDIM/TCD scaling into the x update calculations

3827d1b

wbruna mentioned this pull request Apr 11, 2026

fix: adjust timestep calculations for DDIM and TCD #665

Closed

wbruna wants to merge 7 commits into leejet:master from wbruna:sd_fix_ddim_tcd_i2i
