Commit 217d783

Added parameterised search and download for Hugging Face. Updated README.md

1 parent 483b6ba commit 217d783
3 files changed: +47 −27 lines
`.gitignore` — 3 additions, 0 deletions

```diff
@@ -164,3 +164,6 @@ cython_debug/
 # and can be added to the global gitignore or merged into this file. For a more nuclear
 # option (not recommended) you can uncomment the following to ignore the entire idea folder.
 .idea/
+
+# model .bin files
+docker/auto_docker/*.bin
```
`docker/README.md` — 21 additions, 20 deletions

````diff
@@ -1,3 +1,11 @@
+# Install Docker Server
+
+**Note #1:** This was tested with Docker running on Linux. If you can get it working on Windows or MacOS, please update this `README.md` with a PR!
+
+[Install Docker Engine](https://docs.docker.com/engine/install)
+
+**Note #2:** NVidia GPU CuBLAS support requires a NVidia GPU with sufficient VRAM (approximately as much as the size above) and Docker NVidia support (see [container-toolkit/install-guide](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html))
+
 # Simple Dockerfiles for building the llama-cpp-python server with external model bin files
 - `./openblas_simple/Dockerfile` - a simple Dockerfile for non-GPU OpenBLAS, where the model is located outside the Docker image
   - `cd ./openblas_simple`
@@ -15,14 +23,14 @@
 - `hug_model.py` - a Python utility for interactively choosing and downloading the latest `5_1` quantized models from [huggingface.co/TheBloke](https://huggingface.co/TheBloke)
 - `Dockerfile` - a single OpenBLAS and CuBLAS combined Dockerfile that automatically installs a previously downloaded model `model.bin`
 
-## Get model from Hugging Face
-`python3 ./hug_model.py`
-
-You should now have a model in the current directory and `model.bin` symlinked to it for the subsequent Docker build and copy step. e.g.
+## Download a Llama Model from Hugging Face
+- To download a MIT licensed Llama model run: `python3 ./hug_model.py -a vihangd -s open_llama_7b_700bt_ggml`
+- To select and install a restricted license Llama model run: `python3 ./hug_model.py -a TheBloke -t llama`
+- You should now have a model in the current directory and `model.bin` symlinked to it for the subsequent Docker build and copy step. e.g.
 ```
 docker $ ls -lh *.bin
--rw-rw-r-- 1 user user 4.8G May 23 18:30 <downloaded-model-file>.q5_1.bin
-lrwxrwxrwx 1 user user   24 May 23 18:30 model.bin -> <downloaded-model-file>.q5_1.bin
+-rw-rw-r-- 1 user user 4.8G May 23 18:30 <downloaded-model-file>q5_1.bin
+lrwxrwxrwx 1 user user   24 May 23 18:30 model.bin -> <downloaded-model-file>q5_1.bin
 ```
 **Note #1:** Make sure you have enough disk space to download the model. As the model is then copied into the image you will need at least
 **TWICE** as much disk space as the size of the model:
@@ -36,22 +44,15 @@ lrwxrwxrwx 1 user user 24 May 23 18:30 model.bin -> <downloaded-model-file>.q5
 
 **Note #2:** If you want to pass or tune additional parameters, customise `./start_server.sh` before running `docker build ...`
 
-# Install Docker Server
-
-**Note #3:** This was tested with Docker running on Linux. If you can get it working on Windows or MacOS, please update this `README.md` with a PR!
-
-[Install Docker Engine](https://docs.docker.com/engine/install)
-
-# Use OpenBLAS
+## Use OpenBLAS
 Use if you don't have a NVidia GPU. Defaults to `python:3-slim-bullseye` Docker base image and OpenBLAS:
-## Build:
-`docker build --build-arg -t openblas .`
-## Run:
+### Build:
+`docker build -t openblas .`
+### Run:
 `docker run --cap-add SYS_RESOURCE -t openblas`
 
-# Use CuBLAS
-Requires a NVidia GPU with sufficient VRAM (approximately as much as the size above) and Docker NVidia support (see [container-toolkit/install-guide](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html))
-## Build:
+## Use CuBLAS
+### Build:
 `docker build --build-arg IMAGE=nvidia/cuda:12.1.1-devel-ubuntu22.04 -t cublas .`
-## Run:
+### Run:
 `docker run --cap-add SYS_RESOURCE -t cublas`
````
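The README's two new download commands map onto command-line flags added to `hug_model.py` in this commit. A minimal sketch, assuming only the `-v`/`-a`/`-t`/`-s` flags visible in the diff, of how such an invocation turns into a query string for the Hugging Face `/api/models` endpoint:

```python
import argparse
from urllib.parse import urlencode

# Flags as added in this commit (defaults taken from the diff).
parser = argparse.ArgumentParser(description='Process some parameters.')
parser.add_argument('-v', '--version', type=int, default=0x0003,
                    help='an integer for the version to be used')
parser.add_argument('-a', '--author', type=str, default='TheBloke',
                    help='an author to be filtered')
parser.add_argument('-t', '--tags', type=str, default='llama',
                    help='tags for the content')
parser.add_argument('-s', '--search', type=str, default='',
                    help='search term')

# Simulate the README's first example invocation.
args = parser.parse_args(['-a', 'vihangd', '-s', 'open_llama_7b_700bt_ggml'])
params = {"author": args.author, "tags": args.tags, "search": args.search}

# The query string that would be sent to the model-listing API:
print(f"https://huggingface.co/api/models?{urlencode(params)}")
```

Note how `-t`/`--tags` keeps its `llama` default even when not passed, which is why the first command only needs the author and search term.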

`docker/auto_docker/hug_model.py` — 23 additions, 7 deletions

```diff
@@ -2,6 +2,7 @@
 import json
 import os
 import struct
+import argparse
 
 def make_request(url, params=None):
     print(f"Making request to {url}...")
@@ -69,21 +70,28 @@ def get_user_choice(model_list):
 
     return None
 
-import argparse
-
 def main():
     # Create an argument parser
-    parser = argparse.ArgumentParser(description='Process the model version.')
+    parser = argparse.ArgumentParser(description='Process some parameters.')
+
+    # Arguments
     parser.add_argument('-v', '--version', type=int, default=0x0003,
                         help='an integer for the version to be used')
+    parser.add_argument('-a', '--author', type=str, default='TheBloke',
+                        help='an author to be filtered')
+    parser.add_argument('-t', '--tags', type=str, default='llama',
+                        help='tags for the content')
+    parser.add_argument('-s', '--search', type=str, default='',
+                        help='search term')
 
     # Parse the arguments
     args = parser.parse_args()
 
     # Define the parameters
     params = {
-        "author": "TheBloke", # Filter by author
-        "tags": "llama"
+        "author": args.author,
+        "tags": args.tags,
+        "search": args.search
     }
 
     models = make_request('https://huggingface.co/api/models', params=params)
@@ -103,14 +111,22 @@ def main():
         if rfilename and 'q5_1' in rfilename:
             model_list.append((model_id, rfilename))
 
-    model_choice = get_user_choice(model_list)
+    # Choose the model
+    if len(model_list) == 1:
+        model_choice = model_list[0]
+    else:
+        model_choice = get_user_choice(model_list)
+
     if model_choice is not None:
         model_id, rfilename = model_choice
         url = f"https://huggingface.co/{model_id}/resolve/main/{rfilename}"
         download_file(url, rfilename)
         _, version = check_magic_and_version(rfilename)
         if version != args.version:
-            print(f"Warning: Expected version {args.version}, but found different version in the file.")
+            print(f"Warning: Expected version {args.version}, but found different version in the file.")
+    else:
+        print("Error - model choice was None")
+        exit(1)
 
 if __name__ == '__main__':
     main()
```
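The selection change in the last hunk means a parameterised search that narrows the list to exactly one model downloads it without prompting; only ambiguous results fall back to the interactive chooser. A small sketch of that fallback in isolation, where `chooser` is a stand-in for the script's `get_user_choice` prompt (the filenames below are hypothetical):

```python
# Auto-select when the filtered list has exactly one entry; otherwise
# defer to an interactive chooser, mirroring the logic in this commit.
def choose_model(model_list, chooser=None):
    if len(model_list) == 1:
        return model_list[0]
    if chooser is not None:
        return chooser(model_list)
    return None

single = [("vihangd/open_llama_7b_700bt_ggml", "model.q5_1.bin")]
print(choose_model(single))  # the lone match is returned without prompting
```

An empty list yields `None`, which the commit now treats as an error and exits with status 1 instead of failing later in the download step.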
