Qwen-Image diffusers PTQ: FP8 / NVFP4 / NVFP4-SVDQuant HF checkpoints by jingyu-ml · Pull Request #1706 · NVIDIA/Model-Optimizer

jingyu-ml · 2026-06-12T23:18:26Z

What does this PR do?

Type of change: New feature

Adds Qwen-Image (Qwen/Qwen-Image, QwenImageTransformer2DModel) to the diffusers quantization example and exports HuggingFace checkpoints in three precisions — FP8, NVFP4, and NVFP4 + SVDQuant — through the unified HF export.

Registers --model qwen-image (lazy diffusers import; no trust_remote_code).
Transformer-block-range recipe: quantizes only the linears under transformer_blocks, keeping the first 2 / last 2 blocks (and everything outside transformer_blocks) in original precision. Applied before calibration so SVDQuant never mutates the excluded blocks. Expressed with the top-level enable QuantizerCfgEntry field (disable-all → re-enable transformer_blocks → disable first/last-N).
SVDQuant export (AWQ-style): promotes quantizer-owned tensors to clean module-level safetensors keys at export time — weight_quantizer.svdquant_lora_a/b → <module>.svdquant_lora_a/b and input_quantizer._pre_quant_scale → <module>.pre_quant_scale — with a documented NVFP4_SVD quantization_config (group_size, has_zero_point: false, pre_quant_scale: true, lora_rank). Core SVDQuant quantization code (modelopt/torch/quantization) is unchanged.
Shared export-path fixes (validated across SDXL / Flux / Wan2.2): lazy onnx_graphsurgeon import (only needed for --onnx-dir); single-file save for large transformers (the layerwise-metadata post-processing does not support sharded safetensors); and hide_quantizers_from_state_dict now strips quantizer state from all modules so norm-layer input quantizers no longer leak input_quantizer._amax.

Usage

python examples/diffusers/quantization/quantize.py \
    --model qwen-image --override-model-path <Qwen-Image> --model-dtype BFloat16 \
    --format fp4 --quant-algo svdquant --lowrank 32 \
    --calib-size 64 --n-steps 20 \
    --hf-ckpt-dir <out> --sanity-image-path <out>/sanity.png
# FP8:   --format fp8 --quant-algo max
# NVFP4: --format fp4 --quant-algo max

Testing

Focused unit + example tests pass on GB200 (sm_100): block-range recipe, NVFP4_SVD config schema, SVDQuant forward/fold (LoRA stays on weight_quantizer), Qwen dummy-input / strict-QKV-fusion / promotion, pipeline loading, and the diffusers HF-export test for Qwen FP8 / NVFP4 / SVDQuant.
Full tests/examples/diffusers/test_export_diffusers_hf_ckpt.py is green (SDXL, Flux, Qwen, Wan2.2) — confirms the shared export changes do not regress other models.
End-to-end on the real Qwen/Qwen-Image (~20B): all three formats export valid HF checkpoints — only transformer_blocks 2..57 quantized, nothing outside, no quantizer/_amax leak, correct weight_scale(_2)/input_scale, promoted SVDQuant keys (rank-consistent shapes), and the expected quantization_config — plus a quantized-inference sanity image.

Before your PR is "Ready for review"

Is this change backward compatible?: ✅
If you copied code from any other sources or added a new PIP dependency, did you follow guidance in CONTRIBUTING.md: N/A
Did you write any new necessary tests?: ✅
Did you update Changelog?: ❌
Did you get Claude approval on this PR?: ❌

Additional Information

All changes are confined to the diffusers example (examples/diffusers/quantization) plus the shared export path (modelopt/torch/export); the core quantization library is untouched.

Follow-up (next step): fused-QKV SVDQuant for sglang / Nunchaku

This export keeps attention q/k/v (and add_q/k/v_proj) as separate projections — the diffusers-native layout. That matches sglang's bf16 / FP8 / plain-NVFP4 paths (which also keep QKV separate) and ModelOpt/TRT-LLM consumers, so those load 1:1.

sglang's NVFP4-SVDQuant (Nunchaku) path, however, builds a fused to_qkv with a single fused rank-r LoRA in Nunchaku-native format (proj_down/proj_up, smooth_factor, wscales/wtscale). Our per-projection tensors (svdquant_lora_a/b + pre_quant_scale; three independent rank-r decompositions) are not directly loadable there — and cannot be fused at load time, because the fp16 weight residual needed to derive a single fused rank-r is not preserved after export.

Planned next step: an opt-in fused-QKV SVDQuant export mode that fuses q/k/v before SVDQuant calibration (yielding one rank-r over the fused weight) and emits a Nunchaku-compatible layout, enabling lower-latency fused-QKV inference in sglang. Tracked as a separate follow-up.

Register Qwen/Qwen-Image as a supported model in the diffusers quantization example: - ModelType.QWEN_IMAGE and lazy-imported QwenImagePipeline (so the example still imports on older diffusers). - MODEL_REGISTRY / MODEL_PIPELINE / MODEL_DEFAULTS entries (backbone="transformer", text-to-image calibration dataset). - An actionable ImportError when the installed diffusers lacks Qwen classes, instead of an opaque failure. - filter_func_qwen_image: quantize only transformer_blocks, keeping the first two and last two of the 60 blocks (and everything outside transformer_blocks) in original precision. Enables the plain FP8/NVFP4 export path for Qwen-Image. Core SVDQuant code is unchanged. (Qwen-Image SVDQuant checkpoint work, RLCR round 0 / M1.) Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Signed-off-by: Jingyu Xin <jingyux@nvidia.com>

…harness Implements the Qwen-Image NVFP4/FP8/SVDQuant diffusers quantization feature (RLCR round 0 / M2-M5), keeping core SVDQuant code unchanged: M2 (recipe): build_block_range_quant_cfg() emits ordered quant_cfg rules (disable-all -> enable *.transformer_blocks.* -> disable first/last-N), applied pre-calibration in Quantizer.get_quant_config so SVDQuant never mutates the excluded blocks. Driven by a MODEL_DEFAULTS["block_range"] entry for Qwen-Image (exclude first 2 / last 2; n derived from the model; n>=first+last+1 enforced). M3 (export): _export_diffusers_checkpoint now promotes quantizer-owned tensors to clean module-level safetensors keys before hide_quantizers_from_state_dict (diffusers path only; the transformers path keeps its postprocess_state_dict rename): input_quantizer._pre_quant_scale -> <module>.pre_quant_scale (AWQ key), weight_quantizer.svdquant_lora_a/b -> <module>.svdquant_lora_a/b. Adds an NVFP4_SVD branch to convert_hf_config (modeled on nvfp4_awq: pre_quant_scale + lora_rank), and process_layer_quant_config now flags SVDQuant with pre_quant_scale=True. This also resolves the diffusers pre_quant_scale TODO for AWQ-style exports. M4 (tests): unit tests for the block-range recipe (first/last-2 exclusion, n>=6 validation) and the NVFP4_SVD HF config conversion. M5 (harness): quantize.py --sanity-image-path (in-memory quantized-inference image, pre-export) + examples/diffusers/quantization/qwen_image_svdquant/ {run_qwen_image_quantization.sh, README.md} (parameterized container/model/ export flow for FP8/NVFP4/SVDQuant). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Signed-off-by: Jingyu Xin <jingyux@nvidia.com>

… tests Addresses the round-0 Codex review (RLCR round 1): Blocking fixes: - convert_hf_config: NVFP4_SVD config groups now keep `has_zero_point: False` (both convert_hf_quant_config_format and _quant_algo_to_group_config); asserted in the unit test. - build_block_range_quant_cfg: minimum is now first+last+2 (>=2 quantized middle blocks; n>=6 for the 2+2 Qwen recipe); recipe test rejects 5/4/3-block models. - quantize.py --sanity-image-path failures are now fatal (re-raise -> non-zero exit) so the harness cannot report success without the image; the harness also verifies sanity.png + safetensors + config.json exist per format. Qwen export enablement: - diffusers_utils.generate_diffusion_dummy_inputs: add a QwenImageTransformer2DModel branch (packed latents [B,(H//2)(W//2),C], encoder_hidden_states_mask, img_shapes, txt_seq_lens, optional guidance, continuous timestep). - unified_export_hf._fuse_qkv_linears_diffusion gains strict=; Qwen QKV fusion now fails hard instead of silently skipping. Promotion buffers now overwrite on re-export. create_pipeline_from gives the same actionable Qwen import error. Tests: - New tests/unit/torch/quantization/test_svdquant_forward_fold.py: LoRA stays on weight_quantizer, forward includes a nonzero residual, fold_weight folds it and drops the buffers (existing test_svdquant_lora_weights left unmodified). Deferred to Round 2 / cluster: tiny Qwen2_5_VL fixture + full diffusers e2e export test (needs a Qwen-capable diffusers + GPU); the actual AC-7 checkpoint run. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Signed-off-by: Jingyu Xin <jingyux@nvidia.com>

…minology Round 2 (addresses round-1 Codex review: the round-1 code had no direct test coverage). Adds tests/unit/torch/export/test_diffusers_qwen_export.py: - Qwen dummy inputs: generate_diffusion_dummy_inputs builds the expected keys for a real tiny QwenImageTransformer2DModel, and the generated dummy forward runs on it (this is what catches any wrong shape/kwarg in the dummy-input builder). - Strict fusion: _fuse_qkv_linears_diffusion(strict=True) re-raises on a failing dummy forward; strict=False does not. - Structural export: _promote_quantizer_tensors_to_module promotes SVDQuant LoRA + pre_quant_scale to clean module keys that survive hide_quantizers_from_state_dict (promoted <module>.svdquant_lora_a/b + <module>.pre_quant_scale present; weight_quantizer / input_quantizer keys absent), on a calibrated tiny SVDQuant MLP. Also removes plan/workflow terminology (DEC-5, "pre-calibration") from source and test comments per the plan code-style note. Still pending (Round 3 / cluster): the full tiny Qwen pipeline fixture + e2e subprocess export test (needs diffusers' tokenizer/text-encoder construction and a GPU) and the AC-7 cluster run. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Signed-off-by: Jingyu Xin <jingyux@nvidia.com>

Round 3 (addresses round-2 Codex review): - Fix the tiny Qwen-Image pipeline fixture (tests/_test_utils/torch/diffusers_models.py): build the Qwen2.5-VL text encoder inline from a tiny Qwen2_5_VLConfig (no Hub model load; the previous hf-internal-testing/...Qwen2_5_VL id does not exist), load the tokenizer from the tiny ...Qwen2VL id diffusers' own fast test uses, build the transformer with num_layers=6 (so the corrected first-2/last-2 block-range recipe, which needs >=6 blocks, is valid) and joint_attention_dim=16 matching the text encoder hidden_size, and a z_dim=4 VAE. Mirrors diffusers' QwenImagePipelineFastTests.get_dummy_components. - Add Qwen FP8 / NVFP4 / NVFP4-SVDQuant cases to test_export_diffusers_hf_ckpt.py using the tiny fixture. The test opens transformer/config.json and the exported safetensors and asserts: quant_method=modelopt; no weight_quantizer / input_quantizer._amax keys; for SVDQuant, promoted <module>.svdquant_lora_a/b + <module>.pre_quant_scale keys, config group pre_quant_scale/has_zero_point/ lora_rank, and non-empty ignore (excluded blocks); for plain formats, weight_scale. GPU/diffusers skip-guarded. - Drop remaining workflow terminology (Step 4.5, before-calibration) from the comments I introduced. Still cluster-only (no GPU here): executing these tests and the AC-7 harness run. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Signed-off-by: Jingyu Xin <jingyux@nvidia.com>

…ep comments Round 4 (addresses round-3 Codex review): - Offline tiny Qwen tokenizer: _build_local_qwen2_tokenizer builds a deterministic byte-level Qwen2 tokenizer locally (GPT-2 byte->unicode vocab + Qwen specials, empty merges) instead of a Hub load; removes the tokenizer-unavailable skip path. - Strengthen test_qwen_image_hf_ckpt_export: assert equal module-prefix sets for .svdquant_lora_a/.svdquant_lora_b/.pre_quant_scale; promoted linears are a subset of weight-scaled linears; only the middle blocks {2,3} of 6 are quantized (first-2/ last-2 excluded); lora_a=[rank,in]/lora_b=[out,rank] with rank == --lowrank (8); NVFP4 weight_scale_2 present; exact config (quant_algo=NVFP4_SVD, lora_rank=8, pre_quant_scale=True, has_zero_point=False, non-empty ignore). - Remove the remaining "Step N:" workflow comments from unified_export_hf.py (the round-3 "grep clean" claim was wrong; verified clean across the whole file). Still cluster-only (no GPU/torch/diffusers here): executing these tests and the AC-7 harness run. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Signed-off-by: Jingyu Xin <jingyux@nvidia.com>

…port test Round 5 (addresses round-4 Codex review, which found a regression I introduced): - The round-4 edit inserted the _module_prefixes/_block_indices helpers between @pytest.mark.parametrize("qwen_model", ...) and test_qwen_image_hf_ckpt_export, so the decorator was attached to the helper and the test would request an undefined qwen_model fixture. Moved the helpers/constants above the decorator so it directly decorates the test (verified via ast: the test now carries the qwen_model parametrization and the helper is undecorated). - Tightened SVDQuant assertions: require a_prefixes == b_prefixes == pqs_prefixes == weight_scale_prefixes (every quantized linear is promoted, no gaps), and assert every quantized prefix is under transformer_blocks (nothing outside is quantized), in addition to the {2,3}-only block check. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Signed-off-by: Jingyu Xin <jingyux@nvidia.com>

Round 6 (round-5 review found no code blocker; only the queued docstring nit): the create_tiny_qwen_image_pipeline_dir docstring still said the tokenizer was fetched from the Hub, but Round 4 switched it to a local offline build (_build_local_qwen2_tokenizer). Updated the wording to "fully offline". Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Signed-off-by: Jingyu Xin <jingyux@nvidia.com>

…tale docs Round 7 (addresses round-6 Codex review's two missing-coverage items): - AC-2.2 SVDQuant immutability test (test_qwen_block_range_recipe.py): builds a 6-block backbone, snapshots the excluded first/last block linear weights, runs SVDQuant via build_block_range_quant_cfg, and asserts the excluded blocks' weights are bit-identical (never calibrated) with no LoRA, while the middle blocks {2,3} receive LoRA and have their weights modified. - AC-1 negative-loading tests (new test_qwen_pipeline_loading.py): monkeypatch MODEL_PIPELINE[QWEN_IMAGE]=None and assert the actionable ImportError; a fake pipeline asserts create_pipeline does not pass trust_remote_code. Stale-doc cleanups: the resolved pre_quant_scale TODO wording in unified_export_hf.py; the build_block_range_quant_cfg docstring (first+last+1 -> +2); the conftest "SKETCH" wording (the fixture is now a working offline build). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Signed-off-by: Jingyu Xin <jingyux@nvidia.com>

…e-gate Round 8 (addresses round-7 Codex review, which verified against the diffusers source that QwenImageTransformer2DModel.forward has no txt_seq_lens parameter): - _qwen_inputs no longer passes txt_seq_lens (the real forward signature is hidden_states, encoder_hidden_states, encoder_hidden_states_mask, timestep, img_shapes, guidance, return_dict). Passing txt_seq_lens would have raised an unexpected-keyword error and, because Qwen export uses strict QKV fusion, hard-failed the export. - Signature-gate the dummy inputs: filter to the kwargs the installed model's forward actually accepts (via inspect.signature), so diffusers-version drift cannot hard-fail strict fusion either. - Update test_diffusers_qwen_export.py: no longer require txt_seq_lens. - Remove AC- plan terminology from two test docstrings (code-style note). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Signed-off-by: Jingyu Xin <jingyux@nvidia.com>

…after export Round 9 (clears the last queued code item from Codex; no code blockers remain): _promote_quantizer_tensors_to_module left the temporary <module>.svdquant_lora_a/b + <module>.pre_quant_scale buffers on the live module after export. Add _remove_promoted_quantizer_tensors and call it after each quantized diffusers component is saved, so the live module is unchanged post-export (repeated export / module reuse stay correct). The quantizer-owned tensors are untouched. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Signed-off-by: Jingyu Xin <jingyux@nvidia.com>

…dquant) Validated end-to-end on GB200 against the real Qwen/Qwen-Image: all three formats export correct HF checkpoints (only transformer_blocks 2..57; nothing outside), no quantizer-state leak, and the focused tests pass. - models_utils: build_block_range_quant_cfg now uses the top-level enable QuantizerCfgEntry field (a None cfg retains the base preset's params) instead of nesting cfg.enable, which the QuantizerAttributeConfig validator rejects/mis-applies (the old form left every block quantized). - quantize.py: import onnx_utils.export lazily (only needed for --onnx-dir; avoids a hard onnx_graphsurgeon dependency), and pass max_shard_size so the ~20B transformer saves as a single safetensors -- the unified export's layerwise-metadata post-processing does not support sharded files. - diffusers_utils: hide_quantizers_from_state_dict strips quantizer submodules from all modules, not only is_quantlinear, so enabled input quantizers on norm layers no longer leak input_quantizer._amax into the checkpoint. - tests: the tiny QwenImageTransformer2DModel fixture signature-gates its kwargs (diffusers 0.38 removed pooled_projection_dim from the constructor); the recipe test asserts the corrected top-level enable schema. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Signed-off-by: Jingyu Xin <jingyux@nvidia.com>

copy-pr-bot · 2026-06-12T23:18:29Z

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

coderabbitai · 2026-06-12T23:18:38Z

📝 Walkthrough

Walkthrough

This PR extends the diffusers quantization example with Qwen-Image model support, including a selective block-range quantization strategy that excludes transformer edge blocks, SVDQuant low-rank export infrastructure, offline test utilities, and comprehensive validation tests.

Changes

Qwen-Image quantization harness

Layer / File(s)	Summary
Model registration and quantization filtering `examples/diffusers/quantization/models_utils.py`, `examples/diffusers/quantization/utils.py`	Adds `ModelType.QWEN_IMAGE`, wires it to `QwenImagePipeline`, registers default block-range quantization config excluding first/last transformer blocks, and implements `filter_func_qwen_image` to enforce this pattern during quantizer application.
Block-range quantization recipe `examples/diffusers/quantization/models_utils.py`	Implements `build_block_range_quant_cfg` to generate ordered quantization rules: disable all quantizers globally, re-enable under specified block module, then disable specific first/last blocks, with validation that minimum middle blocks remain quantized.
Quantization pipeline integration `examples/diffusers/quantization/quantize.py`	Integrates block-range recipe into `Quantizer.get_quant_config`, adds `--sanity-image-path` CLI option for post-quantization validation, sets `200GB` default shard size for transformer export, and lazy-imports optional ONNX dependencies.
Pipeline loading and error detection `examples/diffusers/quantization/pipeline_manager.py`	Raises targeted `ImportError` for missing `QwenImagePipeline` in both `create_pipeline_from` and `create_pipeline` paths, with guidance to upgrade diffusers.
SVDQuant algorithm and config conversion `modelopt/torch/export/convert_hf_config.py`, `modelopt/torch/export/quant_utils.py`	Adds `NVFP4_SVD` support to HuggingFace config conversion with `pre_quant_scale` enabled, `has_zero_point` disabled, and optional `lora_rank` injection from source config.
Export utilities for Qwen and SVDQuant `modelopt/torch/export/diffusers_utils.py`, `modelopt/torch/export/unified_export_hf.py`	Adds Qwen-specific dummy input builder with signature-based kwarg filtering, implements quantizer tensor promotion/removal to preserve SVDQuant LoRA buffers through Diffusers' state_dict hiding, broadens quantizer hiding to all modules, and adds strict QKV fusion mode for component-specific error handling.
Offline test fixture and utilities `tests/_test_utils/torch/diffusers_models.py`, `tests/examples/diffusers/conftest.py`	Builds offline Qwen2 tokenizer from synthetic `vocab.json`/`merges.txt`, rewrites `create_tiny_qwen_image_pipeline_dir` to construct text encoder and VAE from minimal configs without Hub access, and filters transformer constructor kwargs to avoid signature mismatches.
Checkpoint export and promotion tests `tests/examples/diffusers/test_export_diffusers_hf_ckpt.py`, `tests/unit/torch/export/test_export_diffusers.py`	Tests SVDQuant tensor promotion, verifies exported checkpoints contain only quantized transformer blocks with expected LoRA/scale tensors, asserts no quantizer state leakage, and validates config field consistency.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Suggested labels

cherry-pick-0.45.0

Suggested reviewers

realAsma
cjluo-nv
sugunav14
meenchen

🚥 Pre-merge checks | ✅ 5 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 51.39% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (5 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title 'Qwen-Image diffusers PTQ: FP8 / NVFP4 / NVFP4-SVDQuant HF checkpoints' accurately captures the main change: adding Qwen-Image model support to diffusers quantization with support for FP8, NVFP4, and SVDQuant export formats.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Security Anti-Patterns	✅ Passed	PR diff contains no matches for torch.load weights_only=False, numpy.load allow_pickle=True, trust_remote_code=True, eval/exec, or '# nosec' in the changed files.
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch feature/qwen-image-svdquant-nvfp4

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

run_qwen_image_quantization.sh and its README are cluster-specific experiment/operator scripts (hard-coded /lustre paths) that do not belong in the upstream diffusers example. The feature itself (model registration, block-range recipe, FP8/NVFP4/SVDQuant export) is covered by the committed tests. The scripts are kept locally outside the repo. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Signed-off-by: Jingyu Xin <jingyux@nvidia.com>

github-actions · 2026-06-12T23:22:59Z

PR Preview Action v1.8.1
🚀 View preview at https://NVIDIA.github.io/Model-Optimizer/pr-preview/pr-1706/
Built to branch `gh-pages` at 2026-06-14 00:11 UTC. Preview will be ready when the GitHub Pages deployment is complete.

coderabbitai

Warning

CodeRabbit couldn't request changes on this pull request because it doesn't have sufficient GitHub permissions.

Please grant CodeRabbit Pull requests: Read and write permission and re-run the review.

👉 Steps to fix this

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)

modelopt/torch/export/unified_export_hf.py (1)

1174-1221: 🩺 Stability & Availability | 🟠 Major | ⚡ Quick win

Always clean up promoted buffers with try/finally.

On Line 1174, promoted export buffers are added, but cleanup on Line 1219-1221 runs only on the success path. Any exception in save/postprocess/config update leaves the live component mutated (pre_quant_scale / svdquant_lora_* buffers lingering).

Proposed fix

-            _promote_quantizer_tensors_to_module(component)
-
-            # Build quantization config
-            quant_config = get_quant_config(component, is_modelopt_qlora=False)
-            if quant_config:
-                quantization_details = quant_config.get("quantization", {})
-                # Record the SVDQuant low-rank size so consumers know the LoRA shape.
-                if quantization_details.get("quant_algo") == "NVFP4_SVD":
-                    svdquant_rank = _detect_svdquant_rank(component)
-                    if svdquant_rank is not None:
-                        quantization_details["lora_rank"] = svdquant_rank
-            hf_quant_config = convert_hf_quant_config_format(quant_config) if quant_config else None
-
-            # Save the component
-            # - diffusers ModelMixin.save_pretrained does NOT accept state_dict parameter
-            # - for non-diffusers modules (e.g., LTX-2 transformer), fall back to torch.save
-            if hasattr(component, "save_pretrained"):
-                with hide_quantizers_from_state_dict(component):
-                    component.save_pretrained(component_export_dir, max_shard_size=max_shard_size)
-            else:
-                with hide_quantizers_from_state_dict(component):
-                    _save_component_state_dict_safetensors(component, component_export_dir)
-
-            # Post-process — merge, metadata, padding, swizzle
-            _postprocess_safetensors(
-                component_export_dir,
-                pipe,
-                hf_quant_config=hf_quant_config,
-                **kwargs,
-            )
-
-            # Update config.json with quantization info
-            if hf_quant_config is not None:
-                config_path = component_export_dir / "config.json"
-                if config_path.exists():
-                    with open(config_path) as file:
-                        config_data = json.load(file)
-                    config_data["quantization_config"] = hf_quant_config
-                    with open(config_path, "w") as file:
-                        json.dump(config_data, file, indent=4)
-
-            # Drop the temporary promoted export buffers so the live module is
-            # unchanged after export (supports repeated export / module reuse).
-            _remove_promoted_quantizer_tensors(component)
+            _promote_quantizer_tensors_to_module(component)
+            try:
+                # Build quantization config
+                quant_config = get_quant_config(component, is_modelopt_qlora=False)
+                if quant_config:
+                    quantization_details = quant_config.get("quantization", {})
+                    if quantization_details.get("quant_algo") == "NVFP4_SVD":
+                        svdquant_rank = _detect_svdquant_rank(component)
+                        if svdquant_rank is not None:
+                            quantization_details["lora_rank"] = svdquant_rank
+                hf_quant_config = convert_hf_quant_config_format(quant_config) if quant_config else None
+
+                if hasattr(component, "save_pretrained"):
+                    with hide_quantizers_from_state_dict(component):
+                        component.save_pretrained(component_export_dir, max_shard_size=max_shard_size)
+                else:
+                    with hide_quantizers_from_state_dict(component):
+                        _save_component_state_dict_safetensors(component, component_export_dir)
+
+                _postprocess_safetensors(
+                    component_export_dir,
+                    pipe,
+                    hf_quant_config=hf_quant_config,
+                    **kwargs,
+                )
+
+                if hf_quant_config is not None:
+                    config_path = component_export_dir / "config.json"
+                    if config_path.exists():
+                        with open(config_path) as file:
+                            config_data = json.load(file)
+                        config_data["quantization_config"] = hf_quant_config
+                        with open(config_path, "w") as file:
+                            json.dump(config_data, file, indent=4)
+            finally:
+                _remove_promoted_quantizer_tensors(component)

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@modelopt/torch/export/unified_export_hf.py` around lines 1174 - 1221, You
promote quantizer-owned tensors with _promote_quantizer_tensors_to_module but
only call _remove_promoted_quantizer_tensors on the success path, so exceptions
during save/postprocess/config update leave the module mutated; wrap the work
that occurs after promotion (the save path using hide_quantizers_from_state_dict
+ component.save_pretrained or _save_component_state_dict_safetensors,
_postprocess_safetensors, and the config.json update that uses hf_quant_config)
in a try/finally and call _remove_promoted_quantizer_tensors(component) in the
finally so cleanup always runs; preserve and re-raise any exception after
cleanup to avoid swallowing errors.

examples/diffusers/quantization/quantize.py (1)

111-121: 🎯 Functional Correctness | 🟠 Major | ⚡ Quick win

Make the Qwen block-range mask backbone-aware and use a single source of truth.

get_quant_config() always injects MODEL_DEFAULTS[QWEN_IMAGE]["block_range"], and quantize_model() always follows with get_model_filter_func(). For Qwen-Image that creates two concrete failure modes: --backbone transformer vae will raise when the VAE path hits build_block_range_quant_cfg() with no transformer_blocks, and any local override checkpoint whose transformer depth is not exactly 60 will calibrate with one exclusion mask but be post-disabled with the hard-coded 60-block mask from examples/diffusers/quantization/utils.py. Please gate the recipe/filter to the transformer backbone and derive both from the loaded backbone instead of keeping two independent masks.

Also applies to: 171-191, 223-233, 696-709
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@examples/diffusers/quantization/quantize.py` around lines 111 - 121,
get_quant_config currently always injects
MODEL_DEFAULTS[QWEN_IMAGE]["block_range"] and
quantize_model/get_model_filter_func apply a separate hard-coded 60-block mask,
causing mismatch when backbone != transformer or transformer depth != 60; fix by
making the Qwen block-range mask backbone-aware and deriving it from the loaded
backbone (e.g., transformer_blocks, num_layers or backbone.config.*) as the
single source of truth: update get_quant_config to consult the actual backbone
type and depth and compute block_range via
build_block_range_quant_cfg(backbone_depth) instead of using
MODEL_DEFAULTS[QWEN_IMAGE]["block_range"], and update quantize_model and
get_model_filter_func to use that same computed mask (remove hard-coded masks in
examples/diffusers/quantization/utils.py) so all three locations
(get_quant_config, quantize_model, get_model_filter_func /
build_block_range_quant_cfg) reference the same backbone-derived value.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@examples/diffusers/quantization/quantize.py`:
- Around line 581-589: The CLI currently accepts --sanity-image-path
unconditionally and assumes generated outputs have images, causing late failures
for video/non-image pipelines; update argument validation in quantize.py to
reject --sanity-image-path early when the selected pipeline type is not an image
pipeline: after parsing args (or inside the existing validation function / main
pipeline selection flow), detect the pipeline kind via the pipeline ID or class
name used for inference (the same symbol(s) that decide which pipeline to
instantiate) and raise an error or exit if --sanity-image-path is set but the
pipeline is not one of the known image pipelines (e.g., StableDiffusion/Any
Image* pipelines); apply the same guard for the second occurrence of this block
noted around the other lines so non-image pipelines fail at argument validation
time rather than after a full run.

In
`@examples/diffusers/quantization/qwen_image_svdquant/run_qwen_image_quantization.sh`:
- Around line 39-40: The script claims DRY_RUN previews commands but still
performs side effects and hard-fails on missing tokens; update logic so when
DRY_RUN is set (check DRY_RUN or use a helper dry_run() wrapper) you skip/avoid
any real file checks and mutations: only echo planned actions instead of
performing them, skip the HF_TOKEN_FILE existence/readability checks (do not
exit with error) and skip creating OUTPUT_DIR (do not run mkdir -p) and any file
writes; specifically wrap or conditionalize the HF_TOKEN_FILE checks (the
HF_TOKEN_FILE variable handling) and the mkdir -p or other filesystem operations
that create ${OUTPUT_DIR} so they only execute when DRY_RUN is not set, and
ensure any commands that would modify disk are printed when DRY_RUN=1 rather
than executed.

In `@modelopt/torch/export/unified_export_hf.py`:
- Around line 1024-1035: The code currently returns the first observed SVDQuant
rank via _detect_svdquant_rank by inspecting weight_quantizer.svdquant_lora_a,
which can hide inconsistencies across modules; update the logic to scan all
modules' weight_quantizer.svdquant_lora_a values, collect all unique ranks, and:
if none found return None, if exactly one unique rank use that, otherwise raise
or log an explicit error and refuse to write a single lora_rank metadata value.
Apply this validation before serializing the lora_rank metadata (where lora_rank
is written) so you never serialize an incorrect single rank when multiple
different ranks exist. Ensure you reference the same attributes
(weight_quantizer, svdquant_lora_a) and the _detect_svdquant_rank helper (or
replace it with a function that returns the set/validates) so callers can act on
the validation result.

In `@tests/_test_utils/torch/diffusers_models.py`:
- Line 296: Move the deferred "import inspect" (and any other imports added
inside tests) to the module/top-level in
tests/_test_utils/torch/diffusers_models.py (and the other referenced test
files: tests/examples/diffusers/test_qwen_block_range_recipe.py,
tests/examples/diffusers/test_export_diffusers_hf_ckpt.py,
tests/unit/torch/export/test_diffusers_qwen_export.py) so imports are
module-level by default; if an import truly must be deferred (circular or
optional dependency), keep it but add a one-line comment above the deferred
import explaining the specific reason and link to the offending symbol (e.g.,
the "import inspect" line) so reviewers can verify the justification.

---

Outside diff comments:
In `@examples/diffusers/quantization/quantize.py`:
- Around line 111-121: get_quant_config currently always injects
MODEL_DEFAULTS[QWEN_IMAGE]["block_range"] and
quantize_model/get_model_filter_func apply a separate hard-coded 60-block mask,
causing mismatch when backbone != transformer or transformer depth != 60; fix by
making the Qwen block-range mask backbone-aware and deriving it from the loaded
backbone (e.g., transformer_blocks, num_layers or backbone.config.*) as the
single source of truth: update get_quant_config to consult the actual backbone
type and depth and compute block_range via
build_block_range_quant_cfg(backbone_depth) instead of using
MODEL_DEFAULTS[QWEN_IMAGE]["block_range"], and update quantize_model and
get_model_filter_func to use that same computed mask (remove hard-coded masks in
examples/diffusers/quantization/utils.py) so all three locations
(get_quant_config, quantize_model, get_model_filter_func /
build_block_range_quant_cfg) reference the same backbone-derived value.

In `@modelopt/torch/export/unified_export_hf.py`:
- Around line 1174-1221: You promote quantizer-owned tensors with
_promote_quantizer_tensors_to_module but only call
_remove_promoted_quantizer_tensors on the success path, so exceptions during
save/postprocess/config update leave the module mutated; wrap the work that
occurs after promotion (the save path using hide_quantizers_from_state_dict +
component.save_pretrained or _save_component_state_dict_safetensors,
_postprocess_safetensors, and the config.json update that uses hf_quant_config)
in a try/finally and call _remove_promoted_quantizer_tensors(component) in the
finally so cleanup always runs; preserve and re-raise any exception after
cleanup to avoid swallowing errors.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: c42d794b-9dd7-41d7-ad4c-25c69901c226

📥 Commits

Reviewing files that changed from the base of the PR and between d26c8af and c2250cb.

📒 Files selected for processing (18)

examples/diffusers/quantization/models_utils.py
examples/diffusers/quantization/pipeline_manager.py
examples/diffusers/quantization/quantize.py
examples/diffusers/quantization/qwen_image_svdquant/README.md
examples/diffusers/quantization/qwen_image_svdquant/run_qwen_image_quantization.sh
examples/diffusers/quantization/utils.py
modelopt/torch/export/convert_hf_config.py
modelopt/torch/export/diffusers_utils.py
modelopt/torch/export/quant_utils.py
modelopt/torch/export/unified_export_hf.py
tests/_test_utils/torch/diffusers_models.py
tests/examples/diffusers/conftest.py
tests/examples/diffusers/test_export_diffusers_hf_ckpt.py
tests/examples/diffusers/test_qwen_block_range_recipe.py
tests/examples/diffusers/test_qwen_pipeline_loading.py
tests/unit/torch/export/test_convert_hf_config_svdquant.py
tests/unit/torch/export/test_diffusers_qwen_export.py
tests/unit/torch/quantization/test_svdquant_forward_fold.py

coderabbitai · 2026-06-12T23:28:54Z

+    export_group.add_argument(
+        "--sanity-image-path",
+        type=str,
+        default=None,
+        help="If set, generate one image from the in-memory quantized pipeline (after "
+        "quantization, before the weights are packed for export) and save it here. This is "
+        "a quick functional sanity check of quantized inference; it does NOT reload the "
+        "exported checkpoint.",
+    )


🎯 Functional Correctness | 🟡 Minor | ⚡ Quick win

Reject --sanity-image-path for non-image pipelines at argument validation time.

This block assumes every supported model returns result.images[0], but the same CLI also supports video pipelines (LTX_*, WAN*). Today those runs will burn a full inference pass and then fail late on the save step instead of being rejected at the interface boundary.

Suggested guard

pipeline_manager.print_quant_summary() + if args.sanity_image_path and model_type in { + ModelType.LTX_VIDEO_DEV, + ModelType.LTX2, + ModelType.WAN22_T2V_14b, + ModelType.WAN22_T2V_5b, + }: + parser.error("--sanity-image-path is only supported for image pipelines.") + # Optional functional sanity check: generate one image from the in-memory # quantized pipeline. This runs BEFORE export (while weights are still # fake-quantized and runnable, not yet packed) and does not reload the

Also applies to: 729-750

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@examples/diffusers/quantization/quantize.py` around lines 581 - 589, The CLI currently accepts --sanity-image-path unconditionally and assumes generated outputs have images, causing late failures for video/non-image pipelines; update argument validation in quantize.py to reject --sanity-image-path early when the selected pipeline type is not an image pipeline: after parsing args (or inside the existing validation function / main pipeline selection flow), detect the pipeline kind via the pipeline ID or class name used for inference (the same symbol(s) that decide which pipeline to instantiate) and raise an error or exit if --sanity-image-path is set but the pipeline is not one of the known image pipelines (e.g., StableDiffusion/Any Image* pipelines); apply the same guard for the second occurrence of this block noted around the other lines so non-image pipelines fail at argument validation time rather than after a full run.

coderabbitai · 2026-06-12T23:28:54Z

+def _detect_svdquant_rank(component: nn.Module) -> int | None:
+    """Return the SVDQuant low-rank dimension from the first SVDQuant linear, if any.
+
+    ``svdquant_lora_a`` has shape ``(rank, in_features)``, so its first dimension
+    is the low-rank size.
+    """
+    for _, sub_module in component.named_modules():
+        weight_quantizer = getattr(sub_module, "weight_quantizer", None)
+        lora_a = getattr(weight_quantizer, "svdquant_lora_a", None)
+        if lora_a is not None:
+            return int(lora_a.shape[0])
+    return None


🗄️ Data Integrity & Integration | 🟠 Major | ⚡ Quick win

Validate SVDQuant rank consistency before writing lora_rank metadata.

On Line 1024, _detect_svdquant_rank() returns the first observed rank. If different quantized modules carry different svdquant_lora_a ranks, Line 1185-1188 will serialize a single incorrect lora_rank, which can misrepresent the exported checkpoint contract.

Proposed fix

def _detect_svdquant_rank(component: nn.Module) -> int | None: @@ - for _, sub_module in component.named_modules(): + ranks: set[int] = set() + for _, sub_module in component.named_modules(): weight_quantizer = getattr(sub_module, "weight_quantizer", None) lora_a = getattr(weight_quantizer, "svdquant_lora_a", None) if lora_a is not None: - return int(lora_a.shape[0]) - return None + ranks.add(int(lora_a.shape[0])) + if not ranks: + return None + if len(ranks) != 1: + raise ValueError(f"Inconsistent SVDQuant ranks detected across modules: {sorted(ranks)}") + return next(iter(ranks))

Also applies to: 1185-1188

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@modelopt/torch/export/unified_export_hf.py` around lines 1024 - 1035, The code currently returns the first observed SVDQuant rank via _detect_svdquant_rank by inspecting weight_quantizer.svdquant_lora_a, which can hide inconsistencies across modules; update the logic to scan all modules' weight_quantizer.svdquant_lora_a values, collect all unique ranks, and: if none found return None, if exactly one unique rank use that, otherwise raise or log an explicit error and refuse to write a single lora_rank metadata value. Apply this validation before serializing the lora_rank metadata (where lora_rank is written) so you never serialize an incorrect single rank when multiple different ranks exist. Ensure you reference the same attributes (weight_quantizer, svdquant_lora_a) and the _detect_svdquant_rank helper (or replace it with a function that returns the set/validates) so callers can act on the validation result.

Remove the standalone Qwen test files. The fp8/nvfp4/svdquant cases in test_export_diffusers_hf_ckpt.py already cover the block-range recipe (only transformer_blocks 2..57 quantized), the promoted SVDQuant keys + pre_quant_scale, the NVFP4_SVD quantization_config, and the no-leak check -- matching how SDXL/Flux/Wan are tested in the same file. Core SVDQuant forward/fold is unchanged and remains covered by existing upstream tests. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Signed-off-by: Jingyu Xin <jingyux@nvidia.com>

…motion Covers svdquant calibration -> _promote_quantizer_tensors_to_module -> clean module-level keys (svdquant_lora_a/b, pre_quant_scale) with the quantizers hidden, plus the post-export cleanup. Runs on CPU in <1s (INT8_SMOOTHQUANT + svdquant on a tiny linear stack). The full NVFP4 end-to-end check remains test_qwen_image_hf_ckpt_export[qwen_nvfp4_svdquant]; svdquant calibration is already covered by test_calib.py. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Signed-off-by: Jingyu Xin <jingyux@nvidia.com>

coderabbitai

Warning

CodeRabbit couldn't request changes on this pull request because it doesn't have sufficient GitHub permissions.

Please grant CodeRabbit Pull requests: Read and write permission and re-run the review.

👉 Steps to fix this

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@tests/unit/torch/export/test_export_diffusers.py`:
- Around line 132-137: Move the local imports into the module-level import
section: take the symbols "copy", "torch.nn as nn", "modelopt.torch.quantization
as mtq", and "hide_quantizers_from_state_dict" and add them with the other
top-of-file imports (after the existing imports around line ~32), then remove
the in-function imports currently present in the test body; this ensures the
imports are executed at collection time and preserves the same symbol names used
in the test.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 0a915fc4-6111-479e-a3c9-18d6c9db6bd4

📥 Commits

Reviewing files that changed from the base of the PR and between c2250cb and 9b472b2.

📒 Files selected for processing (1)

tests/unit/torch/export/test_export_diffusers.py

coderabbitai · 2026-06-12T23:44:41Z

+    import copy
+
+    import torch.nn as nn
+
+    import modelopt.torch.quantization as mtq
+    from modelopt.torch.export.diffusers_utils import hide_quantizers_from_state_dict


📐 Maintainability & Code Quality | 🟠 Major | ⚡ Quick win

Move imports to the top of the file.

Per test coding guidelines, imports inside functions or test methods require explicit justification (e.g., circular imports or optional dependencies like TensorRT-LLM/Megatron-Core). None of these imports (copy, torch.nn, modelopt.torch.quantization, hide_quantizers_from_state_dict) are optional dependencies or resolve circular imports. Moving them to the top ensures import errors surface at collection time instead of mid-test.

📦 Suggested fix

Move these imports to the top of the file with the other imports (after line 32):

from modelopt.torch.export.convert_hf_config import convert_hf_quant_config_format from modelopt.torch.export.diffusers_utils import generate_diffusion_dummy_inputs from modelopt.torch.export.unified_export_hf import export_hf_checkpoint +import copy +import torch.nn as nn +import modelopt.torch.quantization as mtq +from modelopt.torch.export.diffusers_utils import hide_quantizers_from_state_dict

Then remove the in-function imports (lines 132-137).

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/unit/torch/export/test_export_diffusers.py` around lines 132 - 137, Move the local imports into the module-level import section: take the symbols "copy", "torch.nn as nn", "modelopt.torch.quantization as mtq", and "hide_quantizers_from_state_dict" and add them with the other top-of-file imports (after the existing imports around line ~32), then remove the in-function imports currently present in the test body; this ensures the imports are executed at collection time and preserves the same symbol names used in the test.

Source: Coding guidelines

codecov · 2026-06-12T23:49:27Z

Codecov Report

❌ Patch coverage is 51.35135% with 36 lines in your changes missing coverage. Please review.
✅ Project coverage is 67.73%. Comparing base (9f37fe1) to head (4dff6d5).

Files with missing lines	Patch %	Lines
modelopt/torch/export/diffusers_utils.py	40.74%	16 Missing ⚠️
modelopt/torch/export/convert_hf_config.py	0.00%	10 Missing ⚠️
modelopt/torch/export/unified_export_hf.py	72.97%	10 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #1706      +/-   ##
==========================================
- Coverage   77.12%   67.73%   -9.40%     
==========================================
  Files         511      511              
  Lines       56236    56300      +64     
==========================================
- Hits        43370    38132    -5238     
- Misses      12866    18168    +5302

Flag	Coverage Δ
unit	`54.39% <51.35%> (-0.01%)`	⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

jingyu-ml and others added 12 commits June 11, 2026 16:50

jingyu-ml requested review from a team as code owners June 12, 2026 23:18

jingyu-ml requested review from kevalmorabia97 and shengliangxu June 12, 2026 23:18

jingyu-ml marked this pull request as draft June 12, 2026 23:19

jingyu-ml changed the title ~~Feature/qwen image svdquant nvfp4~~ Qwen-Image diffusers PTQ: FP8 / NVFP4 / NVFP4-SVDQuant HF checkpoints Jun 12, 2026

coderabbitai Bot reviewed Jun 12, 2026

View reviewed changes

jingyu-ml marked this pull request as ready for review June 12, 2026 23:30

jingyu-ml and others added 2 commits June 12, 2026 16:32

coderabbitai Bot reviewed Jun 12, 2026

View reviewed changes

Merge branch 'main' into feature/qwen-image-svdquant-nvfp4

4dff6d5

Conversation

jingyu-ml commented Jun 12, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

What does this PR do?

Usage

Testing

Before your PR is "Ready for review"

Additional Information

Follow-up (next step): fused-QKV SVDQuant for sglang / Nunchaku

Uh oh!

copy-pr-bot Bot commented Jun 12, 2026

Uh oh!

coderabbitai Bot commented Jun 12, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Suggested labels

Suggested reviewers

❌ Failed checks (1 warning)

Uh oh!

github-actions Bot commented Jun 12, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Built to branch gh-pages at 2026-06-14 00:11 UTC. Preview will be ready when the GitHub Pages deployment is complete.

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 12, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai Bot Jun 12, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 12, 2026

Choose a reason for hiding this comment

Uh oh!

codecov Bot commented Jun 12, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

jingyu-ml commented Jun 12, 2026 •

edited

Loading

coderabbitai Bot commented Jun 12, 2026 •

edited

Loading

github-actions Bot commented Jun 12, 2026 •

edited

Loading

Built to branch `gh-pages` at 2026-06-14 00:11 UTC.
Preview will be ready when the GitHub Pages deployment is complete.

codecov Bot commented Jun 12, 2026 •

edited

Loading