GPTQModel v2.2.0

What's Changed

✨ New Qwen 2.5 VL model support. Preliminary Qwen 3 model support.
✨ New samples log column during quantization to track module activation in MoE models.
✨ Loss log column now color-coded to highlight modules that are friendly/resistant to quantization.
✨ Progress (per-step) stats during quantization now streamed to log file.
✨ Auto bfloat16 dtype loading for models based on model config.
✨ Fixed kernel compilation for PyTorch/ROCm.
✨ Slightly faster quantization, plus automatic resolution of some low-level OOM issues on GPUs with less VRAM.
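
As a rough illustration of the config-based bfloat16 loading noted above, the sketch below picks a load dtype from a model-config-like dictionary. It is not GPTQModel's actual code: the helper name `resolve_load_dtype`, the use of a plain dict, and the `float16` fallback are all assumptions made for this example. The only grounded detail is that Hugging Face model configs commonly carry a `torch_dtype` field.

```python
def resolve_load_dtype(config: dict, default: str = "float16") -> str:
    """Hypothetical helper: choose a load dtype name from a config-like dict.

    Assumption: the config may declare its native dtype under "torch_dtype",
    as Hugging Face config.json files commonly do. The float16 fallback is
    also an assumption for illustration, not documented library behavior.
    """
    declared = config.get("torch_dtype")
    # Honor bfloat16 when the model config declares it.
    if declared == "bfloat16":
        return "bfloat16"
    # Otherwise fall back to the caller-supplied default.
    return default


if __name__ == "__main__":
    print(resolve_load_dtype({"torch_dtype": "bfloat16"}))
    print(resolve_load_dtype({"torch_dtype": "float32"}))
```

The point of the design is that the dtype follows what the model declares rather than a hard-coded value, so bfloat16-native checkpoints are not silently loaded in another dtype.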

Enable ipex tests for CPU/XPU by @jiqing-feng in #1460
test kernel accuracies with more shapes on cuda by @Qubitium in #1461
Fix rocm flags by @Qubitium in #1467
use table like logging format by @Qubitium in #1471
stream process log entries to persistent file by @Qubitium in #1472
fix some models need trust-remote-code arg by @Qubitium in #1474
Fix wq dtype by @Qubitium in #1475
add colors to quant loss column by @Qubitium in #1477
add prelim qwen3 support by @Qubitium in #1478
Update eora.py for further optimization by @nbasyl in #1488
faster cholesky inverse and avoid oom when possible by @Qubitium in #1494
[MODEL] supports qwen2_5_vl by @ZX-ModelCloud in #1493

Full Changelog: v2.1.0...v2.2.0
