Commit cf4931a: Working Open Llama 3B in a box
Parent: 217d783
6 files changed (+64, -14)

docker/README.md: 3 additions, 2 deletions

````diff
@@ -24,7 +24,7 @@
 - `Dockerfile` - a single OpenBLAS and CuBLAS combined Dockerfile that automatically installs a previously downloaded model `model.bin`
 
 ## Download a Llama Model from Hugging Face
-- To download a MIT licensed Llama model run: `python3 ./hug_model.py -a vihangd -s open_llama_7b_700bt_ggml`
+- To download a MIT licensed Llama model you can run: `python3 ./hug_model.py -a vihangd -s open_llama_7b_700bt_ggml -f ggml-model-q5_1.bin`
 - To select and install a restricted license Llama model run: `python3 ./hug_model.py -a TheBloke -t llama`
 - You should now have a model in the current directory and `model.bin` symlinked to it for the subsequent Docker build and copy step. e.g.
 ```
@@ -37,9 +37,10 @@ lrwxrwxrwx 1 user user 24 May 23 18:30 model.bin -> <downloaded-model-file>q5_
 
 | Model | Quantized size |
 |------:|----------------:|
+|    3B |            3 GB |
 |    7B |            5 GB |
 |   13B |           10 GB |
-|   30B |           25 GB |
+|   33B |           25 GB |
 |   65B |           50 GB |
 
 **Note #2:** If you want to pass or tune additional parameters, customise `./start_server.sh` before running `docker build ...`
````
One further file was renamed without changes.
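The `model.bin` symlink convention the README diff describes can be sketched as follows. This is a minimal illustration only: the downloaded filename below is hypothetical, and in practice `hug_model.py` and the documented commands handle this step.

```python
import os

downloaded = "SlyEcho_open_llama_3b_ggml_q5_1.bin"  # hypothetical download name
open(downloaded, "a").close()  # stand-in for the real download

# Point model.bin at the downloaded file so the Docker build's copy step finds it
if os.path.lexists("model.bin"):
    os.remove("model.bin")
os.symlink(downloaded, "model.bin")
print(os.readlink("model.bin"))
```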

docker/open_llama/build.sh (new file): 14 additions

```diff
@@ -0,0 +1,14 @@
+#!/bin/sh
+
+MODEL="open_llama_3b"
+# Get open_llama_3b_ggml q5_1 quantization
+python3 ./hug_model.py -a SlyEcho -s ${MODEL} -f "q5_1"
+ls -lh *.bin
+
+# Build the default OpenBLAS image
+docker build -t $MODEL .
+docker images | egrep "^(REPOSITORY|$MODEL)"
+
+echo
+echo "To start the docker container run:"
+echo "docker run -t -p 8000:8000 $MODEL"
```
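The `egrep "^(REPOSITORY|$MODEL)"` pipe at the end of `build.sh` keeps only the header row and the freshly built image. The same filter can be sketched in Python; the sample `docker images` output below is made up for illustration:

```python
import re

MODEL = "open_llama_3b"
# Made-up sample of `docker images` output
output = """\
REPOSITORY      TAG     IMAGE ID      CREATED         SIZE
open_llama_3b   latest  0123456789ab  1 minute ago    3.2GB
python          3-slim  ba9876543210  2 weeks ago     125MB"""

# Keep lines starting with either REPOSITORY (the header) or the image name
pattern = re.compile(rf"^(REPOSITORY|{MODEL})")
kept = [line for line in output.splitlines() if pattern.match(line)]
print("\n".join(kept))
```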

docker/auto_docker/hug_model.py renamed to docker/open_llama/hug_model.py: 18 additions, 11 deletions

```diff
@@ -76,21 +76,23 @@ def main():
 
     # Arguments
     parser.add_argument('-v', '--version', type=int, default=0x0003,
-                        help='an integer for the version to be used')
+                        help='hexadecimal version number of ggml file')
     parser.add_argument('-a', '--author', type=str, default='TheBloke',
-                        help='an author to be filtered')
-    parser.add_argument('-t', '--tags', type=str, default='llama',
-                        help='tags for the content')
+                        help='HuggingFace author filter')
+    parser.add_argument('-t', '--tag', type=str, default='llama',
+                        help='HuggingFace tag filter')
     parser.add_argument('-s', '--search', type=str, default='',
-                        help='search term')
+                        help='HuggingFace search filter')
+    parser.add_argument('-f', '--filename', type=str, default='q5_1',
+                        help='HuggingFace model repository filename substring match')
 
     # Parse the arguments
     args = parser.parse_args()
 
     # Define the parameters
     params = {
         "author": args.author,
-        "tags": args.tags,
+        "tags": args.tag,
         "search": args.search
     }
 
@@ -108,25 +110,30 @@ def main():
 
         for sibling in model_info.get('siblings', []):
             rfilename = sibling.get('rfilename')
-            if rfilename and 'q5_1' in rfilename:
+            if rfilename and args.filename in rfilename:
                 model_list.append((model_id, rfilename))
 
     # Choose the model
-    if len(model_list) == 1:
+    model_list.sort(key=lambda x: x[0])
+    if len(model_list) == 0:
+        print("No models found")
+        exit(1)
+    elif len(model_list) == 1:
         model_choice = model_list[0]
     else:
         model_choice = get_user_choice(model_list)
 
     if model_choice is not None:
         model_id, rfilename = model_choice
         url = f"https://huggingface.co/{model_id}/resolve/main/{rfilename}"
-        download_file(url, rfilename)
-        _, version = check_magic_and_version(rfilename)
+        dest = f"{model_id.replace('/', '_')}_{rfilename}"
+        download_file(url, dest)
+        _, version = check_magic_and_version(dest)
         if version != args.version:
             print(f"Warning: Expected version {args.version}, but found different version in the file.")
     else:
         print("Error - model choice was None")
-        exit(1)
+        exit(2)
 
 if __name__ == '__main__':
     main()
```
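The two behavioural changes in `hug_model.py`, the new `--filename` substring match and the collision-proof destination name, can be sketched in isolation. The repository id and sibling filenames below are made up for illustration:

```python
# Sketch of the new filename filtering and destination naming in hug_model.py
# (the repository id and file names below are made up for illustration)
filename_filter = "q5_1"  # default of the new -f/--filename option
model_id = "SlyEcho/open_llama_3b_ggml"
siblings = [
    {"rfilename": "open-llama-3b-q4_0.bin"},
    {"rfilename": "open-llama-3b-q5_1.bin"},
]

model_list = []
for sibling in siblings:
    rfilename = sibling.get("rfilename")
    if rfilename and filename_filter in rfilename:
        model_list.append((model_id, rfilename))

# Prefix the download with the model id so files from different repos don't clash
model_id, rfilename = model_list[0]
dest = f"{model_id.replace('/', '_')}_{rfilename}"
print(dest)  # SlyEcho_open_llama_3b_ggml_open-llama-3b-q5_1.bin
```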

docker/open_llama/start.sh (new file): 28 additions

```diff
@@ -0,0 +1,28 @@
+#!/bin/sh
+
+MODEL="open_llama_3b"
+
+# Start Docker container
+docker run --cap-add SYS_RESOURCE -p 8000:8000 -t $MODEL &
+sleep 10
+echo
+docker ps | egrep "(^CONTAINER|$MODEL)"
+
+# Test the model works
+echo
+curl -X 'POST' 'http://localhost:8000/v1/completions' -H 'accept: application/json' -H 'Content-Type: application/json' -d '{
+    "prompt": "\n\n### Instructions:\nWhat is the capital of France?\n\n### Response:\n",
+    "stop": [
+        "\n",
+        "###"
+    ]
+}' | grep Paris
+if [ $? -eq 0 ]
+then
+    echo
+    echo "$MODEL is working!!"
+else
+    echo
+    echo "ERROR: $MODEL not replying."
+    exit 1
+fi
```
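The curl smoke test in `start.sh` can equally be issued from Python. A minimal sketch building the same `/v1/completions` request follows; actually sending it requires the container started by `start.sh` to be listening on port 8000, so the network call is left commented out:

```python
import json
import urllib.request

URL = "http://localhost:8000/v1/completions"  # server started by start.sh

# Same body as the curl smoke test above
payload = {
    "prompt": "\n\n### Instructions:\nWhat is the capital of France?\n\n### Response:\n",
    "stop": ["\n", "###"],
}
req = urllib.request.Request(
    URL,
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json", "accept": "application/json"},
)

# Requires a running container; the smoke test expects "Paris" in the reply
# with urllib.request.urlopen(req, timeout=30) as resp:
#     print(json.load(resp))
```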

docker/auto_docker/start_server.sh renamed to docker/open_llama/start_server.sh: 1 addition, 1 deletion

```diff
@@ -1,6 +1,6 @@
 #!/bin/sh
 
-# For mmap support
+# For mlock support
 ulimit -l unlimited
 
 if [ "$IMAGE" = "python:3-slim-bullseye" ]; then
```
