
build: bump gptqmodel from 4.0.0.dev0+cu126torch2.7 to 6.0.3 #615

Open
dependabot[bot] wants to merge 1 commit into main from
dependabot/pip/gptqmodel-6.0.3

Conversation


@dependabot dependabot bot commented on behalf of github Apr 3, 2026

Bumps gptqmodel from 4.0.0.dev0+cu126torch2.7 to 6.0.3.
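For anyone applying the same bump by hand rather than merging this PR, the change amounts to rewriting the version pin and reinstalling. A minimal sketch, assuming the dependency is pinned with `==` in a `requirements.txt` (the file name and layout are assumptions about the project; shown here against a scratch copy):

```shell
# Sketch: rewrite the gptqmodel pin in a scratch requirements file, then
# reinstall. requirements.txt is an assumed file name, not confirmed by the PR.
printf 'gptqmodel==4.0.0.dev0+cu126torch2.7\n' > requirements.txt
sed -i 's/^gptqmodel==.*/gptqmodel==6.0.3/' requirements.txt
cat requirements.txt
# pip install -r requirements.txt   # then reinstall against the new pin
```

Note this is a semver-major jump (4.x to 6.x), so the breaking changes listed below apply.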

Release notes

Sourced from gptqmodel's releases.

GPT-QModel v6.0.3

Notable Changes:

Quantization and inference

  • Major ParoQuant improvements across speed, inference, and accuracy.
  • Added Paro inference support and a new layer optimizer.
  • Auto-enables AMP for the fast Paro implementation to better match reference behavior.
  • Added Paro rotation autotuning and fixed BF16 rotation support for the fused CUDA kernel.
  • Improved Paro stability with seeding fixes, cleanup, learned channel scale clamping, and contiguous tensor handling fixes.
  • Fixed a layer output replay/re-capture regression.
  • Added FOEM (First-Order Error Matters) for more accurate quantized LLM compensation, plus follow-up fixes to its data processing pipeline.
  • Replaced the old marlin_fp16 backend behavior with environment-flag control for FP32 reduction.

Model and backend support

  • Added support for Gemma4, MiniCPMO, MiniCPMV, and GLM4-MoE-Lite.
  • Added PrismML/Bonsai model support for inference.
  • Fixed Qwen3_5QModel definition issues.
  • Fixed Qwen 3.5 rotary embedding behavior.
  • Fixed AWQ layer grouping for qwen3_5_moe, llama4, qwen2_moe, and qwen3_next.
  • Fixed awq_processor.dynamic so skipped layers are handled correctly.
  • Improved dtype compatibility.
  • Hugging Face kernels are now gated off on Python no-GIL builds until upstream wheel support is fixed.

Evaluation, calibration, and usability

  • Integrated evaluation into the workflow.
  • Added evaluation backends for vLLM and SGLang.
  • Fixed SGLang evaluation engine initialization.
  • Automatically determines MODEL_COMPAT_FAST_LAYER_COUNT.
  • Improved calibration data device handling.
  • Updated tokenizer handling; collation now respects the tokenizer's padding_side.
  • Improved import performance by lazy-loading _DEVICE_THREAD_POOL.
  • Cleaned up warning behavior and added an option to suppress warnings.
  • Removed forced random seed overrides.

Dependency and compatibility updates

  • Updated pypcre to 0.2.14.
  • Pinned logbar to >=0.4.1.
  • Updated transformers and defuser package versions.
  • Fixed SAVE_PATH handling and import path resolution issues.

Breaking and removed

  • Removed GPTQModel.upload_to_hub().
  • Removed MLX export support.
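Since `GPTQModel.upload_to_hub()` was removed, pushing a quantized checkpoint now has to go through the Hugging Face Hub client directly. A hedged migration sketch, not the project's documented path: it only assembles the arguments for `huggingface_hub.upload_folder`, and the local directory and repo id are hypothetical placeholders.

```python
# Sketch: replace the removed GPTQModel.upload_to_hub() with a direct
# huggingface_hub upload. Names below are placeholders, not gptqmodel API.

def build_upload_call(local_dir: str, repo_id: str) -> dict:
    """Collect the arguments previously handled by upload_to_hub()."""
    return {
        "folder_path": local_dir,   # directory written when saving the model
        "repo_id": repo_id,         # e.g. "your-org/model-gptq" (placeholder)
        "repo_type": "model",
    }

kwargs = build_upload_call("quantized-model", "your-org/model-gptq")
# Actual upload (requires huggingface_hub and an auth token):
#   from huggingface_hub import upload_folder
#   upload_folder(**kwargs)
print(kwargs["repo_id"])
```

Keeping the argument assembly separate from the network call makes the migration easy to test without Hub credentials.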

What's Changed

... (truncated)

Commits


codacy-production bot commented Apr 3, 2026

Up to standards ✅

🟢 Issues: 0 issues

Results: 0 new issues

View in Codacy


Bumps [gptqmodel](https://github.com/ModelCloud/GPTQModel) from 4.0.0.dev0+cu126torch2.7 to 6.0.3.
- [Release notes](https://github.com/ModelCloud/GPTQModel/releases)
- [Commits](https://github.com/ModelCloud/GPTQModel/commits/v6.0.3)

---
updated-dependencies:
- dependency-name: gptqmodel
  dependency-version: 6.0.3
  dependency-type: direct:development
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
@dependabot dependabot bot force-pushed the dependabot/pip/gptqmodel-6.0.3 branch from 4d7cac3 to fdd6b62 on April 8, 2026 14:37
