Commit ab2cab5: Update README.md
1 parent: bb3b70b

1 file changed: README.md (+21 −94 lines)

````diff
@@ -1,87 +1,26 @@
-# 🦙 Python Bindings for `llama.cpp`
+# Python Bindings for `ggllm.cpp`
 
-[![Documentation Status](https://readthedocs.org/projects/llama-cpp-python/badge/?version=latest)](https://llama-cpp-python.readthedocs.io/en/latest/?badge=latest)
-[![Tests](https://github.com/abetlen/llama-cpp-python/actions/workflows/test.yaml/badge.svg?branch=main)](https://github.com/abetlen/llama-cpp-python/actions/workflows/test.yaml)
-[![PyPI](https://img.shields.io/pypi/v/llama-cpp-python)](https://pypi.org/project/llama-cpp-python/)
-[![PyPI - Python Version](https://img.shields.io/pypi/pyversions/llama-cpp-python)](https://pypi.org/project/llama-cpp-python/)
-[![PyPI - License](https://img.shields.io/pypi/l/llama-cpp-python)](https://pypi.org/project/llama-cpp-python/)
-[![PyPI - Downloads](https://img.shields.io/pypi/dm/llama-cpp-python)](https://pypi.org/project/llama-cpp-python/)
 
-Simple Python bindings for **@ggerganov's** [`llama.cpp`](https://github.com/ggerganov/llama.cpp) library.
+Simple Python bindings for the [`ggllm.cpp`](https://github.com/cmp-nct/ggllm.cpp) library.
 This package provides:
 
 - Low-level access to C API via `ctypes` interface.
 - High-level Python API for text completion
 - OpenAI-like API
 - LangChain compatibility
 
-Documentation is available at [https://llama-cpp-python.readthedocs.io/en/latest](https://llama-cpp-python.readthedocs.io/en/latest).
+This project is currently in alpha development and is not yet completely functional. Any contributions are warmly welcomed.
 
 
-## Installation from PyPI (recommended)
-
-Install from PyPI (requires a c compiler):
-
-```bash
-pip install llama-cpp-python
-```
-
-The above command will attempt to install the package and build `llama.cpp` from source.
-This is the recommended installation method as it ensures that `llama.cpp` is built with the available optimizations for your system.
-
-If you have previously installed `llama-cpp-python` through pip and want to upgrade your version or rebuild the package with different compiler options, please add the following flags to ensure that the package is rebuilt correctly:
-
-```bash
-pip install llama-cpp-python --force-reinstall --upgrade --no-cache-dir
-```
-
-Note: If you are using Apple Silicon (M1) Mac, make sure you have installed a version of Python that supports arm64 architecture. For example:
-```
-wget https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-MacOSX-arm64.sh
-bash Miniforge3-MacOSX-arm64.sh
-```
-Otherwise, while installing it will build the llama.cpp x86 version which will be 10x slower on Apple Silicon (M1) Mac.
-
-### Installation with OpenBLAS / cuBLAS / CLBlast / Metal
-
-`llama.cpp` supports multiple BLAS backends for faster processing.
-Use the `FORCE_CMAKE=1` environment variable to force the use of `cmake` and install the pip package for the desired BLAS backend.
-
-To install with OpenBLAS, set the `LLAMA_OPENBLAS=1` environment variable before installing:
-
-```bash
-CMAKE_ARGS="-DLLAMA_OPENBLAS=on" FORCE_CMAKE=1 pip install llama-cpp-python
-```
-
-To install with cuBLAS, set the `LLAMA_CUBLAS=1` environment variable before installing:
-
-```bash
-CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 pip install llama-cpp-python
-```
-
-To install with CLBlast, set the `LLAMA_CLBLAST=1` environment variable before installing:
-
-```bash
-CMAKE_ARGS="-DLLAMA_CLBLAST=on" FORCE_CMAKE=1 pip install llama-cpp-python
-```
-
-To install with Metal (MPS), set the `LLAMA_METAL=on` environment variable before installing:
-
-```bash
-CMAKE_ARGS="-DLLAMA_METAL=on" FORCE_CMAKE=1 pip install llama-cpp-python
-```
-
-Detailed MacOS Metal GPU install documentation is available at [docs/install/macos.md](docs/install/macos.md)
-
 ## High-level API
 
 The high-level API provides a simple managed interface through the `Llama` class.
 
 Below is a short example demonstrating how to use the high-level API to generate text:
 
 ```python
->>> from llama_cpp import Llama
->>> llm = Llama(model_path="./models/7B/ggml-model.bin")
+>>> from falcon_cpp import Falcon
+>>> llm = Falcon(model_path="./models/7B/ggml-model.bin")
 >>> output = llm("Q: Name the planets in the solar system? A: ", max_tokens=32, stop=["Q:", "\n"], echo=True)
 >>> print(output)
 {
````
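The hunk above is cut off before the completion dict that `print(output)` returns, so nothing below is taken from this commit. As a minimal sketch of how the fork's high-level API is presumably used, assuming the `Falcon` class keeps the upstream llama-cpp-python interface (OpenAI-style response dicts and a `create_chat_completion` method, neither of which is confirmed by this diff):

```python
# Sketch only: Falcon and its methods are assumed to mirror the upstream
# llama-cpp-python `Llama` class; this commit does not confirm them.
from falcon_cpp import Falcon

llm = Falcon(model_path="./models/7B/ggml-model.bin")

# Plain completion, as in the diff above; the result is assumed to be an
# OpenAI-style dict with a "choices" list.
output = llm(
    "Q: Name the planets in the solar system? A: ",
    max_tokens=32,
    stop=["Q:", "\n"],
    echo=True,
)
print(output["choices"][0]["text"])

# Chat-style call, assuming create_chat_completion survives the fork unchanged.
chat = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Name the planets in the solar system."}],
    max_tokens=64,
)
print(chat["choices"][0]["message"]["content"])
```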
````diff
@@ -107,57 +46,45 @@ Below is a short example demonstrating how to use the high-level API to generate
 
 ## Web Server
 
-`llama-cpp-python` offers a web server which aims to act as a drop-in replacement for the OpenAI API.
-This allows you to use llama.cpp compatible models with any OpenAI compatible client (language libraries, services, etc).
+`falcon-cpp-python` offers a web server which aims to act as a drop-in replacement for the OpenAI API.
+This allows you to use ggllm.cpp to run inference on Falcon models with any OpenAI compatible client (language libraries, services, etc).
 
 To install the server package and get started:
 
 ```bash
-pip install llama-cpp-python[server]
 python3 -m llama_cpp.server --model models/7B/ggml-model.bin
 ```
 
 Navigate to [http://localhost:8000/docs](http://localhost:8000/docs) to see the OpenAPI documentation.
 
````
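Because the server presents itself as a drop-in replacement for the OpenAI API, a client call along the following lines should work once it is running. This is a sketch: the `/v1/completions` route and payload shape are taken from the upstream llama-cpp-python server convention and are assumptions for this fork.

```python
# Sketch of querying the OpenAI-compatible server with requests.
# Route and field names follow the upstream llama-cpp-python server and the
# OpenAI completions API; they are not confirmed by this commit.
import requests

resp = requests.post(
    "http://localhost:8000/v1/completions",
    json={
        "prompt": "Q: Name the planets in the solar system? A: ",
        "max_tokens": 32,
        "stop": ["Q:", "\n"],
    },
)
print(resp.json()["choices"][0]["text"])
```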

````diff
-## Docker image
-
-A Docker image is available on [GHCR](https://ghcr.io/abetlen/llama-cpp-python). To run the server:
-
-```bash
-docker run --rm -it -p 8000:8000 -v /path/to/models:/models -e MODEL=/models/ggml-model-name.bin ghcr.io/abetlen/llama-cpp-python:latest
-```
-
 ## Low-level API
 
 The low-level API is a direct [`ctypes`](https://docs.python.org/3/library/ctypes.html) binding to the C API provided by `llama.cpp`.
-The entire low-level API can be found in [llama_cpp/llama_cpp.py](https://github.com/abetlen/llama-cpp-python/blob/master/llama_cpp/llama_cpp.py) and directly mirrors the C API in [llama.h](https://github.com/ggerganov/llama.cpp/blob/master/llama.h).
+The entire low-level API can be found in [falcon_cpp/falcon_cpp.py](https://github.com/sirajperson/falcon-cpp-python/blob/master/falcon_cpp/falcon_cpp.py) and directly mirrors the C API in [libfalcon.h](https://github.com/cmp-nct/ggllm.cpp/blob/master/libfalcon.h).
 
 Below is a short example demonstrating how to use the low-level API to tokenize a prompt:
 
 ```python
->>> import llama_cpp
+>>> import falcon_cpp
 >>> import ctypes
->>> params = llama_cpp.llama_context_default_params()
+>>> params = falcon_cpp.falcon_context_default_params()
 # use bytes for char * params
->>> ctx = llama_cpp.llama_init_from_file(b"./models/7b/ggml-model.bin", params)
+>>> ctx = falcon_cpp.falcon_init_backend("./models/7b/ggml-model.bin", params)
 >>> max_tokens = params.n_ctx
 # use ctypes arrays for array params
->>> tokens = (llama_cpp.llama_token * int(max_tokens))()
->>> n_tokens = llama_cpp.llama_tokenize(ctx, b"Q: Name the planets in the solar system? A: ", tokens, max_tokens, add_bos=llama_cpp.c_bool(True))
->>> llama_cpp.llama_free(ctx)
+>>> tokens = (falcon_cpp.falcon_token * int(max_tokens))()
+>>> n_tokens = falcon_cpp.falcon_tokenize(ctx, b"Q: Name the planets in the solar system? A: ", tokens, max_tokens, add_bos=falcon_cpp.c_bool(True))
+>>> falcon_cpp.falcon_free(ctx)
 ```
 
 Check out the [examples folder](examples/low_level_api) for more examples of using the low-level API.
````
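The tokenization example frees the context without inspecting the result. A small continuation sketch, reusing only the names defined in the diff above (`tokens`, `n_tokens`, `ctx`) plus standard `ctypes` slicing, untested against this fork:

```python
# Continuation of the low-level example above: copy the token ids out of the
# ctypes array into a plain Python list before calling falcon_cpp.falcon_free(ctx).
token_ids = list(tokens[:n_tokens])
print(f"{n_tokens} tokens: {token_ids}")
```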

````diff
-
 # Documentation
-
-Documentation is available at [https://abetlen.github.io/llama-cpp-python](https://abetlen.github.io/llama-cpp-python).
-If you find any issues with the documentation, please open an issue or submit a PR.
+Coming soon...
 
 # Development
 
-This package is under active development and I welcome any contributions.
+Again, this package is under active development and I welcome any contributions.
 
 To get started, clone the repository and install the package in development mode:
 
````
````diff
@@ -179,12 +106,12 @@ poetry install --all-extras
 python3 setup.py develop
 ```
 
-# How does this compare to other Python bindings of `llama.cpp`?
-
-I originally wrote this package for my own use with two goals in mind:
+# This project is a fork of llama-cpp-python
 
-- Provide a simple process to install `llama.cpp` and access the full C API in `llama.h` from Python
-- Provide a high-level Python API that can be used as a drop-in replacement for the OpenAI API so existing apps can be easily ported to use `llama.cpp`
+This project was originally llama-cpp-python and owes an immense thanks to @abetlen.
+This project's goals are to:
+- Provide a simple process to install `ggllm.cpp` and access the full C API in `libfalcon.h` from Python
+- Provide a high-level Python API that can be used as a drop-in replacement for the OpenAI API so existing apps can be easily ported to use `ggllm.cpp`
 
 Any contributions and changes to this package will be made with these goals in mind.
````