Commit 247a16d

docs: Update README

1 parent 13b7ced · commit 247a16d
1 file changed: +28 −19 lines changed

README.md (+28 −19: 28 additions & 19 deletions)
````diff
@@ -12,20 +12,17 @@ This package provides:
 
 - Low-level access to C API via `ctypes` interface.
 - High-level Python API for text completion
-  - OpenAI-like API
-  - [LangChain compatibility](https://python.langchain.com/docs/integrations/llms/llamacpp)
-  - [LlamaIndex compatibility](https://docs.llamaindex.ai/en/stable/examples/llm/llama_2_llama_cpp.html)
+  - OpenAI-like API
+  - [LangChain compatibility](https://python.langchain.com/docs/integrations/llms/llamacpp)
+  - [LlamaIndex compatibility](https://docs.llamaindex.ai/en/stable/examples/llm/llama_2_llama_cpp.html)
 - OpenAI compatible web server
-  - [Local Copilot replacement](https://llama-cpp-python.readthedocs.io/en/latest/server/#code-completion)
-  - [Function Calling support](https://llama-cpp-python.readthedocs.io/en/latest/server/#function-calling)
-  - [Vision API support](https://llama-cpp-python.readthedocs.io/en/latest/server/#multimodal-models)
-  - [Multiple Models](https://llama-cpp-python.readthedocs.io/en/latest/server/#configuration-and-multi-model-support)
+  - [Local Copilot replacement](https://llama-cpp-python.readthedocs.io/en/latest/server/#code-completion)
+  - [Function Calling support](https://llama-cpp-python.readthedocs.io/en/latest/server/#function-calling)
+  - [Vision API support](https://llama-cpp-python.readthedocs.io/en/latest/server/#multimodal-models)
+  - [Multiple Models](https://llama-cpp-python.readthedocs.io/en/latest/server/#configuration-and-multi-model-support)
 
 Documentation is available at [https://llama-cpp-python.readthedocs.io/en/latest](https://llama-cpp-python.readthedocs.io/en/latest).
 
-
-
-
 ## Installation
 
 `llama-cpp-python` can be installed directly from PyPI as a source distribution by running:
````

*(Where a hunk removes and re-adds a line with identical visible text, here and in the hunks below, the two versions appear to differ only in whitespace, which this view does not preserve.)*
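The hunk ends mid-thought because the install command itself is unchanged context outside the hunk; for reference, the command the README gives is simply:

```bash
pip install llama-cpp-python
```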
````diff
@@ -38,7 +35,6 @@ This will build `llama.cpp` from source using cmake and your system's c compiler
 
 If you run into issues during installation add the `--verbose` flag to the `pip install` command to see the full cmake build log.
 
-
 ### Installation with Specific Hardware Acceleration (BLAS, CUDA, Metal, etc)
 
 The default pip install behaviour is to build `llama.cpp` for CPU only on Linux and Windows and use Metal on MacOS.
````
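For readers following along, the `--verbose` tip in this hunk amounts to running:

```bash
pip install llama-cpp-python --verbose
```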
````diff
@@ -109,13 +105,29 @@ To install with Vulkan support, set the `LLAMA_VULKAN=on` environment variable before installing:
 CMAKE_ARGS="-DLLAMA_VULKAN=on" pip install llama-cpp-python
 ```
 
+#### Kompute
+
+To install with Kompute support, set the `LLAMA_KOMPUTE=on` environment variable before installing:
+
+```bash
+CMAKE_ARGS="-DLLAMA_KOMPUTE=on" pip install llama-cpp-python
+```
+
+#### SYCL
+
+To install with SYCL support, set the `LLAMA_SYCL=on` environment variable before installing:
+
+```bash
+CMAKE_ARGS="-DLLAMA_SYCL=on" pip install llama-cpp-python
+```
+
 ### Windows Notes
 
 If you run into issues where it complains it can't find `'nmake'` `'?'` or CMAKE_C_COMPILER, you can extract w64devkit as [mentioned in llama.cpp repo](https://github.com/ggerganov/llama.cpp#openblas) and add those manually to CMAKE_ARGS before running `pip` install:
 
 ```ps
 $env:CMAKE_GENERATOR = "MinGW Makefiles"
-$env:CMAKE_ARGS = "-DLLAMA_OPENBLAS=on -DCMAKE_C_COMPILER=C:/w64devkit/bin/gcc.exe -DCMAKE_CXX_COMPILER=C:/w64devkit/bin/g++.exe"
+$env:CMAKE_ARGS = "-DLLAMA_OPENBLAS=on -DCMAKE_C_COMPILER=C:/w64devkit/bin/gcc.exe -DCMAKE_CXX_COMPILER=C:/w64devkit/bin/g++.exe"
 ```
 
 See the above instructions and set `CMAKE_ARGS` to the BLAS backend you want to use.
````
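One caveat when trying the new Kompute or SYCL flags on an existing install: pip may reuse a previously built wheel. The README elsewhere recommends adding `--upgrade --force-reinstall --no-cache-dir` to force a rebuild, e.g. with the new SYCL flag:

```bash
CMAKE_ARGS="-DLLAMA_SYCL=on" pip install llama-cpp-python --upgrade --force-reinstall --no-cache-dir
```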
````diff
@@ -165,7 +177,7 @@ Below is a short example demonstrating how to use the high-level API for basic text completion:
 >>> from llama_cpp import Llama
 >>> llm = Llama(
       model_path="./models/7B/llama-model.gguf",
-      # n_gpu_layers=-1, # Uncomment to use GPU acceleration
+      # n_gpu_layers=-1, # Uncomment to use GPU acceleration
       # seed=1337, # Uncomment to set a specific seed
       # n_ctx=2048, # Uncomment to increase the context window
 )
````
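The completion call that follows this constructor is unchanged and therefore not part of the hunk; it looks roughly like this (prompt and parameters are illustrative):

```python
>>> output = llm(
...     "Q: Name the planets in the solar system? A: ",  # prompt
...     max_tokens=32,      # cap the number of tokens generated
...     stop=["Q:", "\n"],  # stop at a new question or a newline
...     echo=True,          # include the prompt in the returned text
... )
>>> print(output["choices"][0]["text"])
```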
````diff
@@ -284,7 +296,6 @@ The high-level API also provides a simple interface for function calling.
 Note that the only model that supports full function calling at this time is "functionary".
 The gguf-converted files for this model can be found here: [functionary-7b-v1](https://huggingface.co/abetlen/functionary-7b-v1-GGUF)
 
-
 ```python
 >>> from llama_cpp import Llama
 >>> llm = Llama(model_path="path/to/functionary/llama-model.gguf", chat_format="functionary")
````
````diff
@@ -293,7 +304,7 @@ The gguf-converted files for this model can be found here: [functionary-7b-v1](https://huggingface.co/abetlen/functionary-7b-v1-GGUF)
       {
         "role": "system",
         "content": "A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions. The assistant calls functions with appropriate input when necessary"
-
+
       },
       {
         "role": "user",
````
````diff
@@ -332,7 +343,6 @@ The gguf-converted files for this model can be found here: [functionary-7b-v1](https://huggingface.co/abetlen/functionary-7b-v1-GGUF)
 
 ### Multi-modal Models
 
-
 `llama-cpp-python` supports the llava1.5 family of multi-modal models which allow the language model to
 read information from both text and images.
 
````
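Loading a llava1.5 model pairs the `Llama` constructor with a CLIP chat handler from `llama_cpp.llama_chat_format`; a sketch, with placeholder paths:

```python
>>> from llama_cpp import Llama
>>> from llama_cpp.llama_chat_format import Llava15ChatHandler
>>> chat_handler = Llava15ChatHandler(clip_model_path="path/to/llava/mmproj.bin")
>>> llm = Llama(
...     model_path="path/to/llava/llama-model.gguf",
...     chat_handler=chat_handler,
...     n_ctx=2048,       # larger context window to make room for the image embedding
...     logits_all=True,  # needed for llava at the time of this commit
... )
```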

````diff
@@ -378,7 +388,6 @@ For instance, if you want to work with larger contexts, you can expand the context window:
 llm = Llama(model_path="./models/7B/llama-model.gguf", n_ctx=2048)
 ```
 
-
 ## OpenAI Compatible Web Server
 
 `llama-cpp-python` offers a web server which aims to act as a drop-in replacement for the OpenAI API.
````
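The server itself ships as an optional extra; starting it looks like this (model path is a placeholder):

```bash
pip install 'llama-cpp-python[server]'
python3 -m llama_cpp.server --model models/7B/llama-model.gguf
```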
````diff
@@ -426,7 +435,8 @@ A Docker image is available on [GHCR](https://ghcr.io/abetlen/llama-cpp-python).
 ```bash
 docker run --rm -it -p 8000:8000 -v /path/to/models:/models -e MODEL=/models/llama-model.gguf ghcr.io/abetlen/llama-cpp-python:latest
 ```
-[Docker on termux (requires root)](https://gist.github.com/FreddieOliveira/efe850df7ff3951cb62d74bd770dce27) is currently the only known way to run this on phones; see the [termux support issue](https://github.com/abetlen/llama-cpp-python/issues/389).
+
+[Docker on termux (requires root)](https://gist.github.com/FreddieOliveira/efe850df7ff3951cb62d74bd770dce27) is currently the only known way to run this on phones; see the [termux support issue](https://github.com/abetlen/llama-cpp-python/issues/389).
 
 ## Low-level API
 
````
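The tokenization example this section refers to is unchanged by the commit and thus elided from the diff. As a rough sketch of the ctypes-style surface (function names mirror llama.cpp's C API and shift between releases, so treat this as indicative rather than exact):

```python
>>> import llama_cpp
>>> llama_cpp.llama_backend_init(False)  # must be called once at the start of the program
>>> params = llama_cpp.llama_context_default_params()
>>> model = llama_cpp.llama_load_model_from_file(b"./models/7B/llama-model.gguf", params)  # char* params take bytes
>>> ctx = llama_cpp.llama_new_context_with_model(model, params)
>>> tokens = (llama_cpp.llama_token * int(params.n_ctx))()  # array params are ctypes arrays
>>> n_tokens = llama_cpp.llama_tokenize(ctx, b"Q: Name the planets in the solar system? A: ", tokens, params.n_ctx, llama_cpp.c_bool(True))
>>> llama_cpp.llama_free(ctx)
```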

````diff
@@ -454,7 +464,6 @@ Below is a short example demonstrating how to use the low-level API to tokenize a prompt:
 
 Check out the [examples folder](examples/low_level_api) for more examples of using the low-level API.
 
-
 ## Documentation
 
 Documentation is available via [https://llama-cpp-python.readthedocs.io/](https://llama-cpp-python.readthedocs.io/).
````
