Comparing changes
base repository: ModelCloud/GPTQModel
base: v0.9.4
head repository: ModelCloud/GPTQModel
compare: v0.9.5
- 13 commits
- 64 files changed
- 5 contributors
Commits on Jul 4, 2024
- fb388f3
- d5c1024 [CI] FIX test perplexity fail (#160)
  - fix an undefined-name error
  - fix the test_perplexity failure
  - filter the dataset by text length
  - assert on the difference in perplexity
- 6f1eb58 [REFRACTOR] Remove Backend.CUDA and Backend.CUDA_OLD (#165)
  - remove Backend.CUDA and Backend.CUDA_OLD
  - fix the unit test
  - remove the cuda_64/ and cuda_256/ directories
Commits on Jul 5, 2024
- b250a76
- b39fa13 [BACKEND] Add QBits support (#137)
  - support the QBits kernel for the CPU device
  - add a warning when falling back to CPU; remove unneeded vars
  - fix a wrong torch_dtype that prevented QBits from working; check module type as on main
  - check bit support against BITS_DTYPE_MAPPING; QBits supports 2, 3, 4, and 8 bits
  - rename asym back to sym; load QBits only as needed
  - QBits must be explicit: device=cpu does not auto-switch the backend to QBits; instead, selecting the QBits backend forces device=cpu
  - CI: add QBits tests, move test_qbits to test_cpu, set Python to 3.10, update runners, set --durations=0; delete test_qbits_kernel.py (could not pass all 4-bit tests)
  - add protobuf to requirements; remove buggy artifact download by run id (actions/download-artifact#295)
  Signed-off-by: Cheng Penghui <penghui.cheng@intel.com>
  Co-authored-by: Cheng Penghui <penghui.cheng@intel.com>, Qubitium-ModelCloud <qubitium@modelcloud.ai>
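The backend/device rule described in the #137 commit message (QBits is never auto-selected for device=cpu; choosing QBits explicitly forces the device to cpu) can be sketched roughly as below. This is a minimal standalone illustration, not GPTQModel's actual code: the `Backend` members other than `QBITS` and the `resolve_device` helper are hypothetical names.

```python
from enum import Enum

class Backend(Enum):
    AUTO = "auto"            # hypothetical default member
    QBITS = "qbits"          # Intel QBits CPU kernel, added in #137
    EXLLAMA_V2 = "exllama_v2"  # hypothetical GPU backend member

def resolve_device(backend: Backend, device: str) -> str:
    """Sketch of the rule from #137: selecting the QBits backend
    forces device=cpu, but device=cpu alone never switches the
    backend to QBits."""
    if backend is Backend.QBITS:
        return "cpu"  # QBits only runs on CPU
    return device     # all other backends keep the requested device
```

The asymmetry is deliberate per the commit message: an explicit opt-in avoids silently picking a slower CPU kernel for users who merely loaded a model on CPU.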
- efb77a2 [FIX] Delete 8 bits test (#169)
  - revert a comment
  - remove the 8-bit test
- 03bd744 [MODEL] Add 2 & 3 bits support for QBits (#170)
  - add 2- and 3-bit support
  - update SUPPORT_BITS
- 87ef93f
- 61191d5 [CI] [FIX] used wrong tokenizer get dataset (#171)
  - fix use of the wrong tokenizer when building the dataset
  - simplify the code
- 6c35fd8 [FEATURE] BaseQuantLinear add SUPPORTED_DEVICES (#174)
  - check the QuantLinear device
  - refactor check_cuda by introducing SUPPORTED_DEVICES into BaseQuantLinear
  - make the device type (cuda/cpu) an enum
  - cleanup
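The SUPPORTED_DEVICES feature in #174 (each kernel class declares where it can run, with the device type as an enum) can be sketched as follows. `BaseQuantLinear`, `SUPPORTED_DEVICES`, and the cpu/cuda enum come from the commit message; the `DEVICE` member spellings, `validate_device`, and `QBitsQuantLinear` are illustrative names, not the library's exact API.

```python
from enum import Enum

class DEVICE(Enum):
    CPU = "cpu"
    CUDA = "cuda"

class BaseQuantLinear:
    # Each kernel subclass overrides this with the devices it supports.
    SUPPORTED_DEVICES = [DEVICE.CUDA]

    @classmethod
    def validate_device(cls, device: DEVICE) -> None:
        """Replace scattered check_cuda calls with one declarative check."""
        if device not in cls.SUPPORTED_DEVICES:
            raise NotImplementedError(
                f"{cls.__name__} does not support device '{device.value}'; "
                f"supported: {[d.value for d in cls.SUPPORTED_DEVICES]}"
            )

class QBitsQuantLinear(BaseQuantLinear):
    SUPPORTED_DEVICES = [DEVICE.CPU]  # QBits is CPU-only (#137)
```

Declaring supported devices as class data lets the loader pick a kernel by filtering on the requested device instead of hard-coding per-backend CUDA checks.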
- 8b3c1d3 [MODEL] Add quant support for Qbits (#173)
  - add quantization support for QBits
  - test quantization with QBits
  - set the real sym value back on quantize_config
  Co-authored-by: Qubitium-ModelCloud <qubitium@modelcloud.ai>
Configuration menu - View commit details
-
Copy full SHA for 50aa90a - Browse repository at this point
Copy the full SHA 50aa90aView commit details -
Configuration menu - View commit details
-
Copy full SHA for f0a1ee8 - Browse repository at this point
Copy the full SHA f0a1ee8View commit details
To view the full comparison locally:
git diff v0.9.4...v0.9.5