
Commit ef898f1

Tool call support (generic + native for Llama, Functionary, Hermes, Mistral, Firefunction, DeepSeek) w/ lazy grammars (ggml-org#9639)

Authored and committed by ochafik, ngxson, and ggerganov.

Co-authored-by: Xuan Son Nguyen <thichthat@gmail.com>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
Co-authored-by: Xuan Son Nguyen <son@huggingface.co>

Parent commit: d9fb05c


48 files changed: +3861 / −156 lines
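The headline feature of this commit is OpenAI-compatible tool calling in `llama-server`. As a minimal sketch of what a client request looks like, here is an OpenAI-style chat-completion body with a `tools` array, of the kind this commit teaches the server's `/v1/chat/completions` endpoint to handle. The tool name `get_weather` and its parameter schema are invented for illustration, not taken from the commit:

```python
import json

def build_chat_request(user_message, tools):
    """Build an OpenAI-style chat completion request body with tools."""
    return {
        "model": "any",  # llama-server serves whichever model it loaded
        "messages": [{"role": "user", "content": user_message}],
        "tools": tools,
        "tool_choice": "auto",  # let the model decide whether to call a tool
    }

# Hypothetical tool definition, following the OpenAI function-calling schema.
weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}

body = build_chat_request("What's the weather in Paris?", [weather_tool])
print(json.dumps(body, indent=2))
```

When the model decides to call a tool, the server constrains decoding with a "lazy" grammar (per the commit title) so the emitted tool-call arguments are valid JSON for the declared schema, with model-specific (native) formats for Llama, Functionary, Hermes, Mistral, Firefunction, and DeepSeek, plus a generic fallback.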

.editorconfig (+8)

@@ -40,3 +40,11 @@ indent_style = tab
 [examples/cvector-generator/*.txt]
 trim_trailing_whitespace = unset
 insert_final_newline = unset
+
+[models/templates/*.jinja]
+indent_style = unset
+indent_size = unset
+end_of_line = unset
+charset = unset
+trim_trailing_whitespace = unset
+insert_final_newline = unset

.github/workflows/server.yml (+1, −1)

@@ -205,7 +205,7 @@ jobs:
         run: |
           cd examples/server/tests
           $env:PYTHONIOENCODING = ":replace"
-          pytest -v -x
+          pytest -v -x -m "not slow"

       - name: Slow tests
         id: server_integration_tests_slow
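The CI change above makes the fast Windows job skip any test marked slow, via pytest's `-m` marker expression. A minimal sketch of how that filtering works (the test names here are invented, not from the llama.cpp suite):

```python
import pytest

@pytest.mark.slow
def test_full_integration():
    # Deselected when CI runs: pytest -v -x -m "not slow"
    assert True

def test_fast_unit():
    # No marker, so it still runs under -m "not slow"
    assert True
```

Custom marks like `slow` are normally registered in the project's pytest configuration so that `pytest` does not warn about unknown markers.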

Makefile (+9)

@@ -52,6 +52,7 @@ TEST_TARGETS = \
 	tests/test-arg-parser \
 	tests/test-autorelease \
 	tests/test-backend-ops \
+	tests/test-chat \
 	tests/test-chat-template \
 	tests/test-double-float \
 	tests/test-grammar-integration \
@@ -983,6 +984,7 @@ OBJ_COMMON = \
 	$(DIR_COMMON)/ngram-cache.o \
 	$(DIR_COMMON)/sampling.o \
 	$(DIR_COMMON)/speculative.o \
+	$(DIR_COMMON)/chat.o \
 	$(DIR_COMMON)/build-info.o \
 	$(DIR_COMMON)/json-schema-to-grammar.o

@@ -1361,6 +1363,8 @@ llama-server: \
 	examples/server/httplib.h \
 	examples/server/index.html.hpp \
 	examples/server/loading.html.hpp \
+	common/chat.cpp \
+	common/chat.hpp \
 	common/chat-template.hpp \
 	common/json.hpp \
 	common/minja.hpp \
@@ -1471,6 +1475,11 @@ tests/test-json-schema-to-grammar: tests/test-json-schema-to-grammar.cpp \
 	$(CXX) $(CXXFLAGS) -Iexamples/server -c $< -o $(call GET_OBJ_FILE, $<)
 	$(CXX) $(CXXFLAGS) $(filter-out %.h $<,$^) $(call GET_OBJ_FILE, $<) -o $@ $(LDFLAGS)

+tests/test-chat: tests/test-chat.cpp \
+	$(OBJ_ALL)
+	$(CXX) $(CXXFLAGS) -Iexamples/server -c $< -o $(call GET_OBJ_FILE, $<)
+	$(CXX) $(CXXFLAGS) $(filter-out %.h $<,$^) $(call GET_OBJ_FILE, $<) -o $@ $(LDFLAGS)
+
 tests/test-opt: tests/test-opt.cpp \
 	$(OBJ_GGML)
 	$(CXX) $(CXXFLAGS) -c $< -o $(call GET_OBJ_FILE, $<)

README.md (+1)

@@ -18,6 +18,7 @@ Inference of Meta's [LLaMA](https://arxiv.org/abs/2302.13971) model (and others)

 - **How to use [MTLResidencySet](https://developer.apple.com/documentation/metal/mtlresidencyset?language=objc) to keep the GPU memory active?** https://github.com/ggerganov/llama.cpp/pull/11427
 - **VS Code extension for FIM completions:** https://github.com/ggml-org/llama.vscode
+- Universal tool call support in `llama-server`: https://github.com/ggerganov/llama.cpp/pull/9639
 - Vim/Neovim plugin for FIM completions: https://github.com/ggml-org/llama.vim
 - Introducing GGUF-my-LoRA https://github.com/ggerganov/llama.cpp/discussions/10123
 - Hugging Face Inference Endpoints now support GGUF out of the box! https://github.com/ggerganov/llama.cpp/discussions/9669

common/CMakeLists.txt (+2)

@@ -56,6 +56,8 @@ add_library(${TARGET} STATIC
     arg.cpp
    arg.h
     base64.hpp
+    chat.cpp
+    chat.hpp
     chat-template.hpp
     common.cpp
     common.h
