Commit 66bcb8d

Merge branch 'main' into add-numpy-support

2 parents: 7fc7bc3 + 8f35bdd

7 files changed: +448 −154 lines

CHANGELOG.md (+5 −1)
```diff
@@ -9,4 +9,8 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.
 
 ### Added
 
-- Added first version of the changelog
+- Added first version of the changelog
+
+### Fixed
+
+- Performance bug in stop sequence check slowing down streaming.
```
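The new `### Fixed` entry corresponds to the `llama_cpp/llama.py` change shown below, which rescopes the streaming stop-sequence check.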

Makefile (new file, +49)
```makefile
update:
	poetry install
	git submodule update --init --recursive

update.vendor:
	cd vendor/llama.cpp && git pull origin master

build:
	python3 setup.py develop

build.cuda:
	CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 python3 setup.py develop

build.opencl:
	CMAKE_ARGS="-DLLAMA_CLBLAST=on" FORCE_CMAKE=1 python3 setup.py develop

build.openblas:
	CMAKE_ARGS="-DLLAMA_OPENBLAS=on" FORCE_CMAKE=1 python3 setup.py develop

build.blis:
	CMAKE_ARGS="-DLLAMA_OPENBLAS=on -DLLAMA_OPENBLAS_VENDOR=blis" FORCE_CMAKE=1 python3 setup.py develop

build.sdist:
	python3 setup.py sdist

deploy.pypi:
	python3 -m twine upload dist/*

deploy.gh-docs:
	mkdocs build
	mkdocs gh-deploy

clean:
	- cd vendor/llama.cpp && make clean
	- cd vendor/llama.cpp && rm libllama.so
	- rm -rf _skbuild
	- rm llama_cpp/libllama.so

.PHONY: \
	update \
	update.vendor \
	build \
	build.cuda \
	build.opencl \
	build.openblas \
	build.sdist \
	deploy.pypi \
	deploy.gh-docs \
	clean
```
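These targets are thin wrappers around the existing `setup.py` build that select a backend via `CMAKE_ARGS`; for example, `make build.cuda` rebuilds the extension with cuBLAS enabled. The leading `-` on each `clean` recipe line tells Make to ignore a failing command, so cleanup continues even when an artifact is already gone.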

README.md (+11)
````diff
@@ -155,6 +155,17 @@ To get started, clone the repository and install the package in development mode
 
 ```bash
 git clone --recurse-submodules git@github.com:abetlen/llama-cpp-python.git
+
+# Install with pip
+pip install -e .
+
+# if you want to use the fastapi / openapi server
+pip install -e .[server]
+
+# If you're a poetry user, installing will also include a virtual environment
+poetry install --all-extras
+. .venv/bin/activate
+
 # Will need to be re-run any time vendor/llama.cpp is updated
 python3 setup.py develop
 ```
````
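One caveat worth knowing: in shells such as zsh, square brackets are glob characters, so the extras form may need quoting, e.g. `pip install -e '.[server]'`.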

llama_cpp/llama.py (+6 −4)
```diff
@@ -795,20 +795,22 @@ def _create_completion(
                 break
 
             if stream:
+                remaining_tokens = completion_tokens[returned_tokens:]
+                remaining_text = self.detokenize(remaining_tokens)
+                remaining_length = len(remaining_text)
+
                 # We want to avoid yielding any characters from
                 # the generated text if they are part of a stop
                 # sequence.
                 first_stop_position = 0
                 for s in stop_sequences:
-                    for i in range(len(s), 0, -1):
-                        if all_text.endswith(s[:i]):
+                    for i in range(min(len(s), remaining_length), 0, -1):
+                        if remaining_text.endswith(s[:i]):
                             if i > first_stop_position:
                                 first_stop_position = i
                             break
 
                 token_end_position = 0
-                remaining_tokens = completion_tokens[returned_tokens:]
-                remaining_length = len(self.detokenize(remaining_tokens))
                 for token in remaining_tokens:
                     token_end_position += len(self.detokenize([token]))
                     # Check if stop sequence is in the token
```
poetry.lock (+367 −148)

Generated lockfile; diff not rendered.

poetry.toml (new file, +3)
```toml
[virtualenvs]
in-project = true
prefer-active-python = true
```
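With `in-project = true`, Poetry creates the virtual environment in a `.venv/` directory inside the repository, which is what the README's `. .venv/bin/activate` step assumes; `prefer-active-python` makes Poetry build that environment from the Python interpreter already active in the shell.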

pyproject.toml (+7 −1)
```diff
@@ -15,7 +15,9 @@ include = [
 [tool.poetry.dependencies]
 python = "^3.8.1"
 typing-extensions = "^4.5.0"
-
+uvicorn = { version = "^0.21.1", optional = true }
+fastapi = { version = "^0.95.0", optional = true }
+sse-starlette = { version = "^1.3.3", optional = true }
 
 [tool.poetry.group.dev.dependencies]
 black = "^23.3.0"
@@ -25,6 +27,10 @@ mkdocstrings = {extras = ["python"], version = "^0.21.2"}
 mkdocs-material = "^9.1.14"
 pytest = "^7.3.1"
 httpx = "^0.24.1"
+scikit-build = "0.13"
+
+[tool.poetry.extras]
+server = ["uvicorn", "fastapi", "sse-starlette"]
 
 [build-system]
 requires = [
```
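Declaring the server dependencies as optional and grouping them under the `server` extra keeps a plain install lightweight, while `pip install -e .[server]` (or `poetry install --all-extras`) pulls in uvicorn, fastapi, and sse-starlette for the FastAPI server.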
