57 changes: 2 additions & 55 deletions examples/diffusers/README.md
@@ -1,6 +1,6 @@
# Diffusers Model Optimizations

Model Optimizer supports techniques like Cache Diffusion and Quantization for Diffusion models, along with scripts to evaluate models using popular evaluation metrics.
Model Optimizer supports techniques like Cache Diffusion and Quantization for Diffusion models.

Post-training quantization (PTQ) is an effective model optimization technique that compresses your models to lower precision like INT8, FP8, NVFP4, etc. Quantization with Model Optimizer can compress model size by 2x-4x, speeding up inference while preserving model quality. Quantization-Aware Training (QAT) is a powerful technique for optimizing your models, particularly when PTQ methods fail to meet the requirements for your tasks.

@@ -20,7 +20,6 @@ Cache Diffusion is a technique that reuses cached outputs from previous diffusio
| Quantization Aware Distillation (QAD) | Example scripts on how to run QAD on diffusion models | \[[Link](#quantization-aware-distillation-qad)\] | \[[docs](https://nvidia.github.io/Model-Optimizer/guides/1_quantization.html)\] |
| Build and Run with TensorRT | How to build and run your quantized model with TensorRT | \[[Link](#build-and-run-with-tensorrt-compiler-framework)\] | |
| LoRA | Fuse your LoRA weights prior to quantization | \[[Link](#lora)\] | |
| Evaluate Accuracy | Evaluate your model's accuracy! | \[[Link](#evaluate-accuracy)\] | |
| Pre-Quantized Checkpoints | Ready to deploy Hugging Face pre-quantized checkpoints | \[[Link](#pre-quantized-checkpoints)\] | |
| Resources | Extra links to relevant resources | \[[Link](#resources)\] | |

@@ -43,7 +42,7 @@ pip install nvidia-modelopt[onnx,hf]
pip install -r requirements.txt
```

Each subsection (eval, etc.) may have their own `requirements.txt` file that needs to be installed separately.
Each subsection (fastgen, distillation, etc.) may have their own `requirements.txt` file that needs to be installed separately.

You can find the latest TensorRT [here](https://developer.nvidia.com/tensorrt/download).

@@ -472,58 +471,6 @@ Comparing with naively reducing the generation steps, cache diffusion can achiev

Stable Diffusion pipelines rely heavily on random sampling operations, which include creating Gaussian noise tensors to denoise and adding noise in the scheduling step. In the quantization recipe, we don't fix the random seed. As a result, every time you run the calibration pipeline, you could get different quantizer amax values. This may lead to the generated images being different from the ones generated with the original model. We suggest running calibration a few more times and choosing the best result.

## Evaluate Accuracy

This simple code demonstrates how to evaluate images generated by diffusion (or other generative) models using popular metrics such as [imagereward](https://arxiv.org/abs/2304.05977), [clip-iqa](https://arxiv.org/abs/2207.12396), and [clip](https://arxiv.org/abs/2104.08718).

### Install Requirements

```bash
pip install -r eval/requirements.txt
```

### Data Format

Prepare a JSON file with your prompts and corresponding images in the structure below:

```json
[
    {
        "prompt": "YOUR_PROMPT",
        "images": {
            "MODEL_NAME": "PATH_TO_THE_IMAGE",
            "MODEL_NAME": "PATH_TO_THE_IMAGE",
            ...
        }
    },
    ...
]
```

- `prompt`: The text prompt used to generate the images.
- `images`: Key-value pairs of model names and image file paths.

### Evaluate

Run the evaluation script with your JSON file:

```bash
python eval/main.py --data-path {PATH_TO_THE_IMAGE_JSON_PATH} --metrics imagereward
```

- `--data-path`: Path to your JSON file.
- `--metrics`: One or more metrics to compute (e.g. imagereward, clip-iqa, clip).

### Sample results

Example metrics obtained with 30 sampling steps on a set of 1K prompts (values will vary based on data and model configurations):

| Model | Precision | ImageReward | CLIP-IQA | CLIP |
|:------------:|:------------:|:------------:|:------------:|:------------:|
| FLUX 1 Dev | BF16 | 1.118 | 0.927 | 30.15 |
| | FP4 PTQ | 1.096 | 0.923 | 29.86 |
| | FP4 QAT | 1.119 | 0.928 | 29.919 |

## Pre-Quantized Checkpoints

- Ready-to-deploy checkpoints \[[🤗 Hugging Face - Black Forest Labs](https://huggingface.co/black-forest-labs)\]
59 changes: 0 additions & 59 deletions examples/diffusers/eval/main.py

This file was deleted.

36 changes: 0 additions & 36 deletions examples/diffusers/eval/metrics/imagereward.py

This file was deleted.

50 changes: 0 additions & 50 deletions examples/diffusers/eval/metrics/multimodal.py

This file was deleted.

2 changes: 0 additions & 2 deletions examples/diffusers/eval/requirements.txt

This file was deleted.

57 changes: 0 additions & 57 deletions examples/diffusers/eval/utils.py

This file was deleted.
