Commit 5a9770a

Improve documentation for server chat formats (abetlen#934)
1 parent b8f29f4 commit 5a9770a
1 file changed: +9 -0 lines changed

README.md

Lines changed: 9 additions & 0 deletions
@@ -177,6 +177,15 @@ Navigate to [http://localhost:8000/docs](http://localhost:8000/docs) to see the
 To bind to `0.0.0.0` to enable remote connections, use `python3 -m llama_cpp.server --host 0.0.0.0`.
 Similarly, to change the port (default is 8000), use `--port`.
 
+You probably also want to set the prompt format. For chatml, use
+
+```bash
+python3 -m llama_cpp.server --model models/7B/llama-model.gguf --chat_format chatml
+```
+
+That will format the prompt according to how the model expects it. You can find the prompt format in the model card.
+For possible options, see [llama_cpp/llama_chat_format.py](llama_cpp/llama_chat_format.py) and look for lines starting with "@register_chat_format".
+
 ## Docker image
 
 A Docker image is available on [GHCR](https://ghcr.io/abetlen/llama-cpp-python). To run the server:
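As a usage sketch (not part of the commit itself): the `--chat_format` flag added here composes with the `--host` and `--port` options from the surrounding section. The model path and request payload below are placeholders, and the `/v1/chat/completions` route assumes the server's OpenAI-compatible API (browsable at the `/docs` URL mentioned in the hunk header).

```bash
# Start the server with an explicit host, port, and chat format
# (placeholder model path; substitute your own GGUF file).
python3 -m llama_cpp.server --model models/7B/llama-model.gguf \
  --host 0.0.0.0 --port 8000 --chat_format chatml

# Send a chat request; the server applies the chatml template to the
# messages before inference (assumes the OpenAI-compatible endpoint).
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "Hello!"}]}'
```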
