docker/README.md
Lines changed: 11 additions & 5 deletions
@@ -1,3 +1,9 @@
+# Dockerfiles for building the llama-cpp-python server
+- `Dockerfile.openblas_simple` - a simple Dockerfile for non-GPU OpenBLAS
+- `Dockerfile.cuda_simple` - a simple Dockerfile for CUDA accelerated CuBLAS
+- `hug_model.py` - a Python utility for interactively choosing and downloading the latest `5_1` quantized models from [huggingface.co/TheBloke](https://huggingface.co/TheBloke)
+- `Dockerfile` - a single OpenBLAS and CuBLAS combined Dockerfile that automatically installs a previously downloaded model `model.bin`
+
 # Get model from Hugging Face
 `python3 ./hug_model.py`
 
@@ -7,7 +13,7 @@ docker $ ls -lh *.bin
 -rw-rw-r-- 1 user user 4.8G May 23 18:30 <downloaded-model-file>.q5_1.bin
 lrwxrwxrwx 1 user user 24 May 23 18:30 model.bin -> <downloaded-model-file>.q5_1.bin
 ```
-**Note #1:** Make sure you have enough disk space to d/l the model. As the model is then copied into the image you will need at least
+**Note #1:** Make sure you have enough disk space to download the model. As the model is then copied into the image you will need at least
 **TWICE** as much disk space as the size of the model:
 
 | Model | Quantized size |
@@ -21,20 +27,20 @@ lrwxrwxrwx 1 user user 24 May 23 18:30 model.bin -> <downloaded-model-file>.q5
 
 # Install Docker Server
 
-**Note #3:** This was tested with Docker running on Linux. If you can get it working on Windows or MacOS, please update this README with a PR!
+**Note #3:** This was tested with Docker running on Linux. If you can get it working on Windows or MacOS, please update this `README.md` with a PR!
-No NVidia GPU, defaults to `python:3-slim-bullseye` Docker base image and OpenBlAS:
+Use if you don't have a NVidia GPU. Defaults to `python:3-slim-bullseye` Docker base image and OpenBLAS:
 ## Build:
 `docker build -t openblas .`
 ## Run:
 `docker run --cap-add SYS_RESOURCE -t openblas`
 
 # Use CuBLAS
-Requires NVidia GPU and Docker NVidia support (see [container-toolkit/install-guide](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html))
+Requires a NVidia GPU with sufficient VRAM (approximately as much as the size above) and Docker NVidia support (see [container-toolkit/install-guide](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html))
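
Note #1 in this README asks for at least twice the model size in free disk space, because the downloaded file is later copied into the image. A minimal Python sketch of that check might look like the following. The helper names `has_enough_space` and `check_path` are hypothetical and are not part of `hug_model.py`. The sketch also assumes the download directory and Docker's image storage live on the same filesystem, which may not hold on every machine.

```python
import shutil

GIB = 1024 ** 3  # bytes per GiB


def has_enough_space(free_bytes: int, model_bytes: int, copies: int = 2) -> bool:
    """Return True if free_bytes can hold `copies` copies of the model.

    copies=2 mirrors Note #1: one copy on disk plus one inside the image.
    """
    return free_bytes >= copies * model_bytes


def check_path(path: str, model_bytes: int) -> bool:
    """Apply the rule to the filesystem that contains `path`."""
    free_bytes = shutil.disk_usage(path).free
    return has_enough_space(free_bytes, model_bytes)


if __name__ == "__main__":
    # 4.8G is the example file size from the `ls -lh` listing in the README.
    model_bytes = int(4.8 * GIB)
    if check_path(".", model_bytes):
        print("Enough free space for the download and the image copy.")
    else:
        print("Not enough free space: need at least twice the model size.")
```

Running the script in the `docker` directory before calling `hug_model.py` gives an early warning. If Docker stores images on a different volume, each location would need its own check with `copies=1`.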