Commit 399fa1e

docs: Add JSON and JSON schema mode examples to README
1 parent c1d0fff commit 399fa1e

1 file changed: +53 -0 lines changed

README.md

53 additions & 0 deletions
@@ -216,6 +216,59 @@ Note that `chat_format` option must be set for the particular model you are using

Chat completion is available through the [`create_chat_completion`](https://llama-cpp-python.readthedocs.io/en/latest/api-reference/#llama_cpp.Llama.create_chat_completion) method of the [`Llama`](https://llama-cpp-python.readthedocs.io/en/latest/api-reference/#llama_cpp.Llama) class.

### JSON and JSON Schema Mode

If you want to constrain chat responses to only valid JSON or a specific JSON Schema, you can use the `response_format` argument to the `create_chat_completion` method.

#### JSON Mode

The following example will constrain the response to be valid JSON.

```python
>>> from llama_cpp import Llama
>>> llm = Llama(model_path="path/to/model.gguf", chat_format="chatml")
>>> llm.create_chat_completion(
      messages=[
          {
              "role": "system",
              "content": "You are a helpful assistant that outputs in JSON.",
          },
          {"role": "user", "content": "Who won the world series in 2020"},
      ],
      response_format={
          "type": "json_object",
      },
      temperature=0.7,
)
```

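As a usage sketch that is not part of this commit, and assuming the OpenAI-style completion layout (the generated text under `choices[0]["message"]["content"]`), the constrained output can then be parsed with the standard library:

```python
import json

from llama_cpp import Llama

# Usage sketch, not part of this diff: parse the JSON-constrained output.
# Assumes the completion dict follows the OpenAI-style layout, with the
# generated text under completion["choices"][0]["message"]["content"].
llm = Llama(model_path="path/to/model.gguf", chat_format="chatml")
completion = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Who won the world series in 2020"}],
    response_format={"type": "json_object"},
)
data = json.loads(completion["choices"][0]["message"]["content"])
print(data)  # a Python dict, since the response is constrained to valid JSON
```
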
#### JSON Schema Mode

To constrain the response to a specific JSON Schema, you can use the `schema` property of the `response_format` argument.

```python
>>> from llama_cpp import Llama
>>> llm = Llama(model_path="path/to/model.gguf", chat_format="chatml")
>>> llm.create_chat_completion(
      messages=[
          {
              "role": "system",
              "content": "You are a helpful assistant that outputs in JSON.",
          },
          {"role": "user", "content": "Who won the world series in 2020"},
      ],
      response_format={
          "type": "json_object",
          "schema": {
              "type": "object",
              "properties": {"team_name": {"type": "string"}},
              "required": ["team_name"],
          },
      },
      temperature=0.7,
)
```

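A parallel sketch, also not part of this commit and under the same OpenAI-style layout assumption: because the schema marks `team_name` as required, the parsed result can be indexed directly.

```python
import json

from llama_cpp import Llama

# Usage sketch, not part of this diff: with the schema above, the output is
# constrained to an object with a required "team_name" string, so the field
# can be read directly after parsing. Assumes the OpenAI-style completion layout.
llm = Llama(model_path="path/to/model.gguf", chat_format="chatml")
completion = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Who won the world series in 2020"}],
    response_format={
        "type": "json_object",
        "schema": {
            "type": "object",
            "properties": {"team_name": {"type": "string"}},
            "required": ["team_name"],
        },
    },
)
print(json.loads(completion["choices"][0]["message"]["content"])["team_name"])
```
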
### Function Calling

The high-level API also provides a simple interface for function calling.
