Commit 8800226

Fix --split-max-size (ggml-org#6655)
* Fix --split-max-size

  Byte size calculation was done on int and overflowed.

* add tests.sh
* add examples test scripts to ci run

  Will autodiscover examples/*/tests.sh scripts and run them.

* move WORK_PATH to a subdirectory
* clean up before and after test
* explicitly define which scripts to run
* add --split-max-size to readme

1 parent e689fc4 commit 8800226

4 files changed: +141 -2 lines changed

‎ci/run.sh‎

+49 lines changed: 49 additions & 0 deletions
@@ -153,6 +153,52 @@ function gg_sum_ctest_release {
     gg_printf '```\n'
 }

+# test_scripts_debug
+
+function gg_run_test_scripts_debug {
+    cd ${SRC}
+
+    set -e
+
+    (cd ./examples/gguf-split && time bash tests.sh "$SRC/build-ci-debug/bin" "$MNT/models") 2>&1 | tee -a $OUT/${ci}-scripts.log
+
+    set +e
+}
+
+function gg_sum_test_scripts_debug {
+    gg_printf '### %s\n\n' "${ci}"
+
+    gg_printf 'Runs test scripts in debug mode\n'
+    gg_printf '- status: %s\n' "$(cat $OUT/${ci}.exit)"
+    gg_printf '```\n'
+    gg_printf '%s\n' "$(cat $OUT/${ci}-scripts.log)"
+    gg_printf '```\n'
+    gg_printf '\n'
+}
+
+# test_scripts_release
+
+function gg_run_test_scripts_release {
+    cd ${SRC}
+
+    set -e
+
+    (cd ./examples/gguf-split && time bash tests.sh "$SRC/build-ci-release/bin" "$MNT/models") 2>&1 | tee -a $OUT/${ci}-scripts.log
+
+    set +e
+}
+
+function gg_sum_test_scripts_release {
+    gg_printf '### %s\n\n' "${ci}"
+
+    gg_printf 'Runs test scripts in release mode\n'
+    gg_printf '- status: %s\n' "$(cat $OUT/${ci}.exit)"
+    gg_printf '```\n'
+    gg_printf '%s\n' "$(cat $OUT/${ci}-scripts.log)"
+    gg_printf '```\n'
+    gg_printf '\n'
+}
+
 function gg_get_model {
     local gguf_3b="$MNT/models/open-llama/3B-v2/ggml-model-f16.gguf"
     local gguf_7b="$MNT/models/open-llama/7B-v2/ggml-model-f16.gguf"
@@ -642,6 +688,9 @@ test $ret -eq 0 && gg_run ctest_release
 if [ -z ${GG_BUILD_LOW_PERF} ]; then
     test $ret -eq 0 && gg_run embd_bge_small

+    test $ret -eq 0 && gg_run test_scripts_debug
+    test $ret -eq 0 && gg_run test_scripts_release
+
     if [ -z ${GG_BUILD_VRAM_GB} ] || [ ${GG_BUILD_VRAM_GB} -ge 8 ]; then
         if [ -z ${GG_BUILD_CUDA} ]; then
             test $ret -eq 0 && gg_run open_llama_3b_v2

‎examples/gguf-split/README.md‎

+1 lines changed: 1 addition & 0 deletions
@@ -5,5 +5,6 @@ CLI to split / merge GGUF files.
 **Command line options:**

 - `--split`: split GGUF to multiple GGUF, default operation.
+- `--split-max-size`: max size per split in `M` or `G`, f.ex. `500M` or `2G`.
 - `--split-max-tensors`: maximum tensors in each split: default(128)
 - `--merge`: merge multiple GGUF to a single GGUF.

‎examples/gguf-split/gguf-split.cpp‎

+2 -2 lines changed: 2 additions & 2 deletions
@@ -59,10 +59,10 @@ static size_t split_str_to_n_bytes(std::string str) {
     int n;
     if (str.back() == 'M') {
         sscanf(str.c_str(), "%d", &n);
-        n_bytes = n * 1024 * 1024; // megabytes
+        n_bytes = (size_t)n * 1024 * 1024; // megabytes
     } else if (str.back() == 'G') {
         sscanf(str.c_str(), "%d", &n);
-        n_bytes = n * 1024 * 1024 * 1024; // gigabytes
+        n_bytes = (size_t)n * 1024 * 1024 * 1024; // gigabytes
     } else {
         throw std::invalid_argument("error: supported units are M (megabytes) or G (gigabytes), but got: " + std::string(1, str.back()));
     }
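
The hunk above casts `n` to `size_t` before multiplying, so the product is no longer computed in `int`. The following is a minimal, standalone sketch of the same idea, not the project's code: `size_str_to_bytes` and `fits_in_int` are hypothetical names, the parsing uses `std::stoull` instead of `sscanf`, and the arithmetic is done in `uint64_t` rather than `size_t`. It assumes a platform where `int` is 32 bits wide.

```cpp
#include <climits>
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical helper (not the project's function): convert "<number>M" or
// "<number>G" to a byte count, doing all arithmetic in 64 bits.
static uint64_t size_str_to_bytes(const std::string & str) {
    if (str.size() < 2) {
        throw std::invalid_argument("expected <number><M|G>, got: " + str);
    }
    // std::stoull throws std::invalid_argument if no digits are found.
    const uint64_t n = std::stoull(str.substr(0, str.size() - 1));
    switch (str.back()) {
        case 'M': return n * 1024 * 1024;        // factor of 2^20
        case 'G': return n * 1024 * 1024 * 1024; // factor of 2^30
        default:
            throw std::invalid_argument(
                "supported units are M or G, got: " + std::string(1, str.back()));
    }
}

// True if the byte count is representable in a plain int, i.e. the case in
// which an int-only multiplication would not have overflowed.
static bool fits_in_int(uint64_t n_bytes) {
    return n_bytes <= static_cast<uint64_t>(INT_MAX);
}
```

Under the 32-bit `int` assumption, `500M` (524288000 bytes) still fits in an `int`, while `2G` (2147483648 bytes) is one more than `INT_MAX`, which is why the multiplication has to happen in a wider type.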

‎examples/gguf-split/tests.sh‎

+89 lines changed: 89 additions & 0 deletions
@@ -0,0 +1,89 @@
+#!/bin/bash
+
+set -eu
+
+if [ $# -lt 1 ]
+then
+    echo "usage: $0 path_to_build_binary [path_to_temp_folder]"
+    echo "example: $0 ../../build/bin ../../tmp"
+    exit 1
+fi
+
+if [ $# -gt 1 ]
+then
+    TMP_DIR=$2
+else
+    TMP_DIR=/tmp
+fi
+
+set -x
+
+SPLIT=$1/gguf-split
+MAIN=$1/main
+WORK_PATH=$TMP_DIR/gguf-split
+CUR_DIR=$(pwd)
+
+mkdir -p "$WORK_PATH"
+
+# Clean up in case of previously failed test
+rm -f $WORK_PATH/ggml-model-split*.gguf $WORK_PATH/ggml-model-merge*.gguf
+
+# 1. Get a model
+(
+    cd $WORK_PATH
+    "$CUR_DIR"/../../scripts/hf.sh --repo ggml-org/gemma-1.1-2b-it-Q8_0-GGUF --file gemma-1.1-2b-it.Q8_0.gguf
+)
+echo PASS
+
+# 2. Split with max tensors strategy
+$SPLIT --split-max-tensors 28 $WORK_PATH/gemma-1.1-2b-it.Q8_0.gguf $WORK_PATH/ggml-model-split
+echo PASS
+echo
+
+# 2b. Test the sharded model is loading properly
+$MAIN --model $WORK_PATH/ggml-model-split-00001-of-00006.gguf --random-prompt --n-predict 32
+echo PASS
+echo
+
+# 3. Merge
+$SPLIT --merge $WORK_PATH/ggml-model-split-00001-of-00006.gguf $WORK_PATH/ggml-model-merge.gguf
+echo PASS
+echo
+
+# 3b. Test the merged model is loading properly
+$MAIN --model $WORK_PATH/ggml-model-merge.gguf --random-prompt --n-predict 32
+echo PASS
+echo
+
+# 4. Split with no tensor in metadata
+#$SPLIT --split-max-tensors 32 --no-tensor-in-metadata $WORK_PATH/ggml-model-merge.gguf $WORK_PATH/ggml-model-split-32-tensors
+#echo PASS
+#echo
+
+# 4b. Test the sharded model is loading properly
+#$MAIN --model $WORK_PATH/ggml-model-split-32-tensors-00001-of-00006.gguf --random-prompt --n-predict 32
+#echo PASS
+#echo
+
+# 5. Merge
+#$SPLIT --merge $WORK_PATH/ggml-model-split-32-tensors-00001-of-00006.gguf $WORK_PATH/ggml-model-merge-2.gguf
+#echo PASS
+#echo
+
+# 5b. Test the merged model is loading properly
+#$MAIN --model $WORK_PATH/ggml-model-merge-2.gguf --random-prompt --n-predict 32
+#echo PASS
+#echo
+
+# 6. Split with size strategy
+$SPLIT --split-max-size 2G $WORK_PATH/ggml-model-merge.gguf $WORK_PATH/ggml-model-split-2G
+echo PASS
+echo
+
+# 6b. Test the sharded model is loading properly
+$MAIN --model $WORK_PATH/ggml-model-split-2G-00001-of-00002.gguf --random-prompt --n-predict 32
+echo PASS
+echo
+
+# Clean up
+rm -f $WORK_PATH/ggml-model-split*.gguf $WORK_PATH/ggml-model-merge*.gguf
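
The test script refers to shards by names such as `ggml-model-split-00001-of-00006.gguf` and `ggml-model-split-2G-00001-of-00002.gguf`. The sketch below only illustrates the naming pattern visible in those file names: an output prefix, then a zero-padded shard index and shard count. `shard_name` is a hypothetical helper and the five-digit padding is inferred from the names in the script, not taken from the tool's source.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: build "<prefix>-<index>-of-<count>.gguf" with
// five-digit zero padding, mirroring the file names used in tests.sh.
static std::string shard_name(const std::string & prefix, int index, int count) {
    char suffix[48];
    // snprintf never writes past the buffer; it truncates if needed.
    std::snprintf(suffix, sizeof(suffix), "-%05d-of-%05d.gguf", index, count);
    return prefix + suffix;
}
```

For example, `shard_name("ggml-model-split", 1, 6)` yields the first shard name passed to `--merge` in step 3 of the script.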
