feat: v2 overhaul — correctness fixes, self-conditioning/CFG, discrete mode, DPO, perf by devnull37 · Pull Request #18 · devnull37/dimba-lib-exp

devnull37 · 2026-05-27T15:03:48Z

Summary

Major correctness + research upgrade of DIMBA over the v1 concept paper. Validated end-to-end (compileall clean + 13/13 runtime smoke). 39 files, +7,326 / -701.

Correctness fixes

Conditioning leak removed: the prompt is now clean context (clean-prefix + pooled prompt) and is never the target; the loss is response-only. (The leak was present in the v1 paper.)
Real zero-terminal-SNR cosine schedule (Lin et al. 2023) — previously a docstring-only claim.
Bidirectional Mamba denoiser; genuine Mamba-2 kernel preference with graceful fallback.
SimpleMamba2 rewritten — stable negative-A, per-channel input (was collapsing the inner dim), no double norm/residual; underflow→NaN guard.
Correct x0-DDIM sampler (+ v-prediction); fixed FiLM identity-init, the 3-tuple forward(), get_model_config, and the denoise_step helper reference.
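To make the zero-terminal-SNR item concrete, here is a minimal sketch of the rescaling described by Lin et al. (2023) applied to a clipped cosine schedule. It is an illustration under stated assumptions, not the code in this PR: the function names, the offset `s=0.008`, and the `max_beta=0.999` clip are conventional defaults chosen for the example.

```python
import math

def cosine_alphas_cumprod(num_steps, s=0.008, max_beta=0.999):
    """Cosine alpha-bar schedule with the usual beta clipping (assumed defaults)."""
    def f(t):
        return math.cos((t / num_steps + s) / (1 + s) * math.pi / 2) ** 2

    alphas_cumprod, running = [], 1.0
    for i in range(num_steps):
        # Clipping beta is what leaves a small nonzero SNR at the final step.
        beta = min(1.0 - f(i + 1) / f(i), max_beta)
        running *= 1.0 - beta
        alphas_cumprod.append(running)
    return alphas_cumprod

def rescale_zero_terminal_snr(alphas_cumprod):
    """Shift and scale sqrt(alpha_bar) so the last step has exactly zero SNR."""
    sqrt_ab = [math.sqrt(a) for a in alphas_cumprod]
    first, last = sqrt_ab[0], sqrt_ab[-1]
    scale = first / (first - last)  # keeps the first step unchanged
    return [((x - last) * scale) ** 2 for x in sqrt_ab]

clipped = cosine_alphas_cumprod(1000)
rescaled = rescale_zero_terminal_snr(clipped)
```

After rescaling, the final `alpha_bar` is exactly zero, so the last timestep is pure noise and training matches what the sampler starts from.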

Research upgrades

Self-conditioning, classifier-free guidance, min-SNR-γ weighting, cross-entropy / rounding anchor.
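Two of these upgrades have simple closed forms, sketched below. The function names are hypothetical and the guidance convention (`scale=1.0` recovers the conditional prediction) is an assumption; the PR may use a different parameterization. The min-SNR-γ weights follow the commonly cited forms for x0-, ε-, and v-prediction.

```python
def cfg_combine(cond, uncond, guidance_scale):
    """Classifier-free guidance: extrapolate from the unconditional prediction."""
    return [u + guidance_scale * (c - u) for c, u in zip(cond, uncond)]

def min_snr_weight(snr, gamma=5.0, prediction="x0"):
    """Min-SNR-gamma loss weight for a given prediction target."""
    clipped = min(snr, gamma)
    if prediction == "x0":
        return clipped
    if prediction == "eps":
        # Undefined at snr == 0, which matters with a zero-terminal-SNR schedule.
        return clipped / snr
    if prediction == "v":
        return clipped / (snr + 1.0)
    raise ValueError(f"unknown prediction type: {prediction}")
```

With `gamma=5.0`, high-SNR (low-noise) timesteps stop dominating the loss, since their weight is capped rather than growing with the SNR.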

New capabilities

Discrete / masked + hybrid diffusion (corruption.py, masked_sampling.py, DIMBA.predict_token_logits).
DPO / IPO / SimPO + diffusion-ELBO / VRPO surrogate; pluggable verifiable rewards for GRPO (token-overlap demoted to a warned legacy option).
Vectorized parallel selective-scan, torch.compile helper, MLX backend skeleton.
ELBO best-of-K reranking.
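For orientation, the DPO and IPO objectives named above can be sketched on scalar sequence scores. This is a hedged illustration, not the PR's implementation: the function names are hypothetical, and for a diffusion model the `logp_*` arguments would presumably be ELBO estimates rather than exact log-likelihoods, as the "diffusion-ELBO / VRPO surrogate" item suggests.

```python
import math

def log_sigmoid(x):
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))

def preference_margin(logp_w, logp_l, ref_logp_w, ref_logp_l):
    """Policy-vs-reference log-ratio of the chosen (w) minus the rejected (l) sample."""
    return (logp_w - ref_logp_w) - (logp_l - ref_logp_l)

def dpo_loss(logp_w, logp_l, ref_logp_w, ref_logp_l, beta=0.1):
    """DPO: negative log-sigmoid of the scaled margin."""
    margin = preference_margin(logp_w, logp_l, ref_logp_w, ref_logp_l)
    return -log_sigmoid(beta * margin)

def ipo_loss(logp_w, logp_l, ref_logp_w, ref_logp_l, beta=0.1):
    """IPO: squared distance of the margin from the target 1 / (2 * beta)."""
    margin = preference_margin(logp_w, logp_l, ref_logp_w, ref_logp_l)
    return (margin - 1.0 / (2.0 * beta)) ** 2
```

When the policy equals the reference, the margin is zero and the DPO loss is log 2; it falls as the policy prefers the chosen sample more strongly than the reference does.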

Infra & docs

GitHub Actions CI (py3.10/3.12, CPU torch, pytest + black + mypy), pre-commit, benchmark script, new tests.
README (What's-New + corrected claims), docs/IMPROVEMENT_PLAN.md, docs/RESEARCH_DIRECTIONS.md, docs/OVERHAUL_STATUS.md, CHANGELOG.md.

Validation

python -m compileall clean across the package.
13/13 end-to-end runtime smoke: all model modes, sampling + CFG, masked hook, corruption, masked sampling, parallel-scan parity (9.5e-7).
CI re-runs the full pytest suite on clean Linux runners with working torch.
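The parallel-scan parity check can be pictured as comparing a sequential linear recurrence against a log-depth associative scan over the same inputs. The sketch below is a pure-Python stand-in with hypothetical function names; it assumes the recurrence h_t = a_t * h_{t-1} + b_t with h_{-1} = 0 and is not the vectorized kernel from this PR.

```python
import random

def sequential_scan(a, b):
    """Reference recurrence: h_t = a_t * h_{t-1} + b_t, starting from h = 0."""
    h, out = 0.0, []
    for a_t, b_t in zip(a, b):
        h = a_t * h + b_t
        out.append(h)
    return out

def parallel_scan(a, b):
    """Hillis-Steele inclusive scan over the affine maps h -> a * h + b."""
    pairs = list(zip(a, b))
    step = 1
    while step < len(pairs):
        nxt = list(pairs)
        for i in range(step, len(pairs)):
            a_l, b_l = pairs[i - step]
            a_r, b_r = pairs[i]
            # Compose "left then right": a_r * (a_l * h + b_l) + b_r.
            nxt[i] = (a_l * a_r, a_r * b_l + b_r)
        pairs = nxt
        step *= 2
    return [b_t for _, b_t in pairs]

def max_abs_diff(x, y):
    return max((abs(u - v) for u, v in zip(x, y)), default=0.0)

rng = random.Random(0)
decay = [rng.uniform(0.5, 1.0) for _ in range(37)]
inputs = [rng.gauss(0.0, 1.0) for _ in range(37)]
parity = max_abs_diff(sequential_scan(decay, inputs), parallel_scan(decay, inputs))
```

The two paths agree only up to floating-point rounding because they associate the products differently, which is why a parity check reports a small tolerance rather than exact equality.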

Notes

Built via 6 parallel agents + coupled core surgery; the agents cross-validated each other's work.
scripts/train_interactive.py (work in progress) intentionally excluded.
Follow-ups: first-class masked-mode training script + a [MASK] token; cross-attention conditioning; real speed/quality benchmarks once compute lands.

🤖 Generated with Claude Code

…DPO, perf

Major correctness + research upgrade over the v1 concept paper.

Correctness:
- Remove prompt-conditioning leak (clean-prefix context + pooled prompt; response-only loss)
- Implement real zero-terminal-SNR cosine schedule (Lin et al. 2023)
- Bidirectional Mamba denoiser; prefer genuine Mamba-2 kernels with fallback
- Rewrite SimpleMamba2 (stable negative-A, per-channel input, no double norm/residual)
- Correct x0-DDIM sampler (+ v-prediction); fix FiLM init, 3-tuple forward, get_model_config
- Fix denoise_step helper reference; guard SimpleMamba2 scan against underflow NaN

Research upgrades:
- Self-conditioning, classifier-free guidance, min-SNR weighting, cross-entropy/rounding anchor

New capabilities:
- Discrete/masked + hybrid diffusion (corruption, masked sampling, predict_token_logits)
- DPO/IPO/SimPO + diffusion-ELBO/VRPO surrogate; pluggable verifiable rewards for GRPO
- Vectorized parallel selective-scan, torch.compile helper, MLX backend skeleton
- ELBO best-of-K reranking

Infra/docs: GitHub Actions CI, pre-commit, benchmark, tests, CHANGELOG, README, IMPROVEMENT_PLAN, RESEARCH_DIRECTIONS, OVERHAUL_STATUS.

Validated: compileall clean + 13/13 runtime smoke (venv python).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

chatgpt-codex-connector · 2026-05-27T15:03:53Z

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.

Scale the encoded signal to ~unit variance before diffusion so the schedule's SNR is meaningful (a la Stable Diffusion's 0.18215). Embeddings initialized at std 0.02 against unit-variance noise were crushing the effective SNR at every timestep, which is the crux of why latent/continuous text diffusion is harder to train.

- DIMBA.latent_scale folded into encode_latent/decode_latent (round-trips exactly); default 1/embed_init_std for the embedding path, 1.0 for the projector/VAE path.
- DIMBA.calibrate_latent_scale(batch): measure the encoded-signal std and set the factor (recommended before training in latent/VAE mode).
- Configurable TokenEmbedding init_std; latent_scale + embed_init_std in model config.
- Tests + end-to-end smoke updated (14/14 OK).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
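The effect of the scaling can be checked with a few lines of arithmetic. The sketch below is illustrative only: `calibrate_scale`, `encode`, `decode`, and `effective_snr` are hypothetical stand-ins for the idea behind `DIMBA.calibrate_latent_scale` and `latent_scale`, and it assumes the forward process x_t = sqrt(alpha_bar) * x_0 + sqrt(1 - alpha_bar) * eps with unit-variance noise.

```python
import random
import statistics

def calibrate_scale(samples):
    """Scale factor that brings the measured signal std to 1."""
    return 1.0 / statistics.pstdev(samples)

def encode(values, scale):
    return [v * scale for v in values]

def decode(latents, scale):
    return [z / scale for z in latents]

def effective_snr(signal_std, alpha_bar):
    """SNR of x_t when the clean signal has the given std and the noise has std 1."""
    return alpha_bar * signal_std ** 2 / (1.0 - alpha_bar)

rng = random.Random(0)
raw = [rng.gauss(0.0, 0.02) for _ in range(1000)]  # embeddings at init std 0.02
scale = calibrate_scale(raw)
scaled = encode(raw, scale)

snr_before = effective_snr(0.02, alpha_bar=0.5)  # 0.0004: signal buried in noise
snr_after = effective_snr(1.0, alpha_bar=0.5)    # 1.0: what the schedule intends
```

At the schedule's midpoint the unscaled embeddings see an SNR 2,500 times lower than intended, which is the mismatch the commit message describes.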

Replace the stale paper-era guide (which documented the conditioning leak and the MSE-only loss as *the* procedure) with an accurate v2 reference: current data flow, model API conventions (no leak, 3-tuple forward, latent_scale round-trip, calibrate), the three diffusion modes, training via compute_dimba_losses, inference, post-training, the torch-teardown / venv-python / MPS environment gotchas, the file map, and the current PR status + open follow-ups.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

devnull37 and others added 2 commits May 27, 2026 19:13

devnull37 merged commit bb1bbba into main May 28, 2026
0 of 4 checks passed

devnull37 merged 3 commits into main from feature/dimba-v2-overhaul
