Commit 8f35bdd

Fix stop sequence performance bug.
1 parent 00ea3af

2 files changed: +11 −5

CHANGELOG.md

+5 −1: 5 additions, 1 deletion
@@ -9,4 +9,8 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 
 ### Added
 
-- Added first version of the changelog
+- Added first version of the changelog
+
+### Fixed
+
+- Performance bug in stop sequence check slowing down streaming.

llama_cpp/llama.py

+6 −4: 6 additions, 4 deletions
@@ -775,20 +775,22 @@ def _create_completion(
                 break
 
             if stream:
+                remaining_tokens = completion_tokens[returned_tokens:]
+                remaining_text = self.detokenize(remaining_tokens)
+                remaining_length = len(remaining_text)
+
                 # We want to avoid yielding any characters from
                 # the generated text if they are part of a stop
                 # sequence.
                 first_stop_position = 0
                 for s in stop_sequences:
-                    for i in range(len(s), 0, -1):
-                        if all_text.endswith(s[:i]):
+                    for i in range(min(len(s), remaining_length), 0, -1):
+                        if remaining_text.endswith(s[:i]):
                             if i > first_stop_position:
                                 first_stop_position = i
                             break
 
                 token_end_position = 0
-                remaining_tokens = completion_tokens[returned_tokens:]
-                remaining_length = len(self.detokenize(remaining_tokens))
                 for token in remaining_tokens:
                     token_end_position += len(self.detokenize([token]))
                     # Check if stop sequence is in the token
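The change scopes the stop-sequence prefix check to the text that has not yet been returned to the caller and caps the prefix lengths it tries at that text's length. Below is a minimal standalone sketch of the check as it reads after this commit; the helper name and its arguments are hypothetical, since in llama.py the same loop runs inline inside _create_completion during streaming.

# Minimal sketch of the post-commit stop-sequence prefix check. The helper
# name and its inputs are hypothetical; in llama.py the loop runs inline
# inside _create_completion while streaming.
def longest_stop_prefix(remaining_text: str, stop_sequences: list[str]) -> int:
    """Length of the longest stop-sequence prefix that the not-yet-returned
    text ends with; that many trailing characters are held back from the
    stream in case later tokens complete the stop sequence."""
    remaining_length = len(remaining_text)
    first_stop_position = 0
    for s in stop_sequences:
        # Only prefixes that fit inside the remaining text can match, so the
        # inner loop is bounded by min(len(s), remaining_length).
        for i in range(min(len(s), remaining_length), 0, -1):
            if remaining_text.endswith(s[:i]):
                if i > first_stop_position:
                    first_stop_position = i
                break
    return first_stop_position

# With stop sequence "###", the trailing "##" is held back because the next
# token might complete the stop sequence.
print(longest_stop_prefix("Hello world ##", ["###"]))  # prints 2

Compared with the pre-commit loop, which called endswith on all_text (everything generated so far), this version cannot match a prefix that overlaps text already streamed out, and the per-step work depends only on the short stretch of unreturned text, which fits the changelog entry about the stop sequence check slowing down streaming.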
