imatrix : use GGUF to store importance matrices #9400


Open. Wants to merge 27 commits into base: master
Commits:
bce5464 imatrix : allow processing multiple chunks per batch (compilade, Aug 20, 2024)
347247a imatrix : fix segfault when using a single chunk per batch (compilade, Aug 20, 2024)
3de9300 imatrix : use GGUF to store imatrix data (compilade, Sep 6, 2024)
c8ab6a3 imatrix : fix conversion problems (compilade, Sep 8, 2024)
3ad0603 Merge branch 'master' into compilade/imatrix-batched-chunks (compilade, Sep 8, 2024)
d19101c imatrix : use FMA and sort tensor names (compilade, Sep 8, 2024)
503630e py : add requirements for legacy imatrix convert script (compilade, Sep 10, 2024)
9e6b0e9 perplexity : revert changes (compilade, Sep 10, 2024)
894ed8d py : include imatrix converter requirements in toplevel requirements (compilade, Sep 10, 2024)
efa9186 imatrix : avoid using designated initializers in C++ (compilade, Sep 10, 2024)
2217247 imatrix : remove unused n_entries (compilade, Sep 10, 2024)
8c13e16 imatrix : allow loading mis-ordered tensors (compilade, Sep 10, 2024)
2d79a70 quantize : use unused imatrix chunk_size with LLAMA_TRACE (compilade, Sep 10, 2024)
c7a32e7 common : use GGUF for imatrix output by default (compilade, Jan 31, 2025)
db502dd Merge branch 'master' into compilade/imatrix-batched-chunks (compilade, Feb 9, 2025)
1be357d Merge branch 'master' into compilade/imatrix-batched-chunks (compilade, Feb 9, 2025)
16202d6 Merge branch 'master' into compilade/imatrix-batched-chunks (compilade, Apr 13, 2025)
a5165a6 imatrix : two-way conversion between old format and GGUF (compilade, Apr 15, 2025)
635f945 convert : remove imatrix to gguf python script (compilade, Apr 15, 2025)
1d19025 imatrix : use the function name in more error messages (compilade, Apr 15, 2025)
2c09450 Merge branch 'master' into compilade/imatrix-batched-chunks (compilade, Jun 18, 2025)
ba6f6be imatrix : don't use FMA explicitly (compilade, Jun 18, 2025)
1a9454a imatrix : avoid returning from void function save_imatrix (compilade, Jun 18, 2025)
43cd2b3 imatrix : support 3d tensors with MUL_MAT (compilade, Jun 23, 2025)
0e79355 quantize : fix dataset name loading from gguf imatrix (compilade, Jun 23, 2025)
118d52f Merge branch 'master' into compilade/imatrix-batched-chunks (compilade, Jun 23, 2025)
e33de12 common : move string_remove_suffix from quantize and imatrix (compilade, Jun 23, 2025)
common/common.cpp (9 changes: 9 additions & 0 deletions)
@@ -448,6 +448,15 @@ void string_replace_all(std::string & s, const std::string & search, const std::
bool string_ends_with(const std::string_view & str, const std::string_view & suffix) {
    return str.size() >= suffix.size() && str.compare(str.size()-suffix.size(), suffix.size(), suffix) == 0;
}

bool string_remove_suffix(std::string & str, const std::string_view & suffix) {
    bool has_suffix = string_ends_with(str, suffix);
    if (has_suffix) {
        str = str.substr(0, str.size() - suffix.size());
    }
    return has_suffix;
}

size_t string_find_partial_stop(const std::string_view & str, const std::string_view & stop) {
    if (!str.empty() && !stop.empty()) {
        const char text_last_char = str.back();
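
A minimal usage sketch for the `string_remove_suffix` helper added in the hunk above. The two helper bodies are copied from the diff so the block compiles on its own; `tensor_base_name`, the `.stats` suffix, and the tensor name are hypothetical and chosen only for illustration. They are not taken from the PR.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Copied from the common/common.cpp hunk so this sketch is self-contained.
bool string_ends_with(const std::string_view & str, const std::string_view & suffix) {
    return str.size() >= suffix.size() && str.compare(str.size()-suffix.size(), suffix.size(), suffix) == 0;
}

// Strips `suffix` from `str` in place; returns whether the suffix was present.
bool string_remove_suffix(std::string & str, const std::string_view & suffix) {
    bool has_suffix = string_ends_with(str, suffix);
    if (has_suffix) {
        str = str.substr(0, str.size() - suffix.size());
    }
    return has_suffix;
}

// Hypothetical caller (not part of the PR): recover a base name from a
// name that may carry an illustrative ".stats" suffix.
std::string tensor_base_name(std::string name) {
    // The return value is ignored here: names without the suffix pass through unchanged.
    string_remove_suffix(name, ".stats");
    return name;
}

// Exercises both the "suffix present" and "suffix absent" paths.
void example_usage() {
    std::string with_suffix = "blk.0.attn_q.weight.stats";
    assert(string_remove_suffix(with_suffix, ".stats"));
    assert(with_suffix == "blk.0.attn_q.weight");

    std::string without_suffix = "blk.0.attn_q.weight";
    assert(!string_remove_suffix(without_suffix, ".stats"));
    assert(without_suffix == "blk.0.attn_q.weight");
}
```

Returning a `bool` lets a caller test for the suffix and strip it in one call, instead of calling `string_ends_with` and then trimming by hand.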
common/common.h (1 change: 1 addition & 0 deletions)
@@ -518,6 +518,7 @@ static bool string_starts_with(const std::string & str,

// While we wait for C++20's std::string::ends_with...
bool string_ends_with(const std::string_view & str, const std::string_view & suffix);
bool string_remove_suffix(std::string & str, const std::string_view & suffix);
size_t string_find_partial_stop(const std::string_view & str, const std::string_view & stop);

bool string_parse_kv_override(const char * data, std::vector<llama_model_kv_override> & overrides);
gguf-py/gguf/constants.py (6 changes: 6 additions & 0 deletions)
@@ -223,6 +223,11 @@ class Adapter:
        TYPE       = "adapter.type"
        LORA_ALPHA = "adapter.lora.alpha"

    class IMatrix:
        CHUNK_COUNT = "imatrix.chunk_count"
        CHUNK_SIZE  = "imatrix.chunk_size"
        DATASETS    = "imatrix.datasets"

    class Clip:
        PROJECTOR_TYPE     = "clip.projector_type"
        HAS_VISION_ENCODER = "clip.has_vision_encoder"
@@ -272,6 +277,7 @@ class Projector:
class GGUFType:
    MODEL   = "model"
    ADAPTER = "adapter"
    IMATRIX = "imatrix"
    MMPROJ  = "mmproj" # dummy, unused for now

