Commit eb3a2d5

ochafik authored and tybalex committed
json-schema-to-grammar improvements (+ added to server) (ggml-org#5978)
* json: fix arrays (disallow `[,1]`)
* json: support tuple types (`[number, string]`)
* json: support additionalProperties (`{[k: string]: [string,number][]}`)
* json: support required / optional properties
* json: add support for pattern
* json: resolve $ref (and support https schema urls)
* json: fix $ref resolution
* join: support union types (mostly for nullable types I think)
* json: support allOf + nested anyOf
* json: support any (`{}` or `{type: object}`)
* json: fix merge
* json: temp fix for escapes
* json: spaces in output and unrestricted output spaces
* json: add typings
* json:fix typo
* Create ts-type-to-grammar.sh
* json: fix _format_literal (json.dumps already escapes quotes)
* json: merge lit sequences and handle negatives {"type": "string", "pattern": "^({\"question\": \"[^\"]+\", \"response\": \"[^\"]+\"}\\n)+$"}
* json: handle pattern repetitions
* Update json-schema-to-grammar.mjs
* Create regex-to-grammar.py
* json: extract repeated regexp patterns to subrule
* Update json-schema-to-grammar.py
* Update json-schema-to-grammar.py
* Update json-schema-to-grammar.py
* json: handle schema from pydantic Optional fields
* Update json-schema-to-grammar.py
* Update json-schema-to-grammar.py
* Update ts-type-to-grammar.sh
* Update ts-type-to-grammar.sh
* json: simplify nullable fields handling
* json: accept duplicate identical rules
* json: revert space to 1 at most
* json: reuse regexp pattern subrules
* json: handle uuid string format
* json: fix literal escapes
* json: add --allow-fetch
* json: simplify range escapes
* json: support negative ranges in patterns
* Delete commit.txt
* json: custom regex parser, adds dot support & JS-portable
* json: rm trailing spaces
* Update json-schema-to-grammar.mjs
* json: updated server & chat `( cd examples/server && ./deps.sh )`
* json: port fixes from mjs to python
* Update ts-type-to-grammar.sh
* json: support prefixItems alongside array items
* json: add date format + fix uuid
* json: add date, time, date-time formats
* json: preserve order of props from TS defs
* json: port schema converter to C++, wire in ./server
* json: nits
* Update json-schema-to-grammar.cpp
* Update json-schema-to-grammar.cpp
* Update json-schema-to-grammar.cpp
* json: fix mjs implementation + align outputs
* Update json-schema-to-grammar.mjs.hpp
* json: test C++, JS & Python versions
* json: nits + regen deps
* json: cleanup test
* json: revert from c++17 to 11
* json: nit fixes
* json: dirty include for test
* json: fix zig build
* json: pass static command to std::system in tests (fixed temp files)
* json: fix top-level $refs
* json: don't use c++20 designated initializers
* nit
* json: basic support for reserved names `{number:{number:{root:number}}}`
* Revamp test cmake to allow args (WORKING_DIRECTORY needed for JSON test)
* json: re-ran server deps.sh
* json: simplify test
* json: support mix of additional props & required/optional
* json: add tests for some expected failures
* json: fix type=const in c++, add failure expectations for non-str const&enum
* json: test (& simplify output of) empty schema
* json: check parsing in test + fix value & string refs
* json: add server tests for OAI JSON response_format
* json: test/fix top-level anyOf
* json: improve grammar parsing failures
* json: test/fix additional props corner cases
* json: fix string patterns (was missing quotes)
* json: ws nit
* json: fix json handling in server when there's no response_format
* json: catch schema conversion errors in server
* json: don't complain about unknown format type in server if unset
* json: cleaner build of test
* json: create examples/json-schema-pydantic-example.py
* json: fix date pattern
* json: move json.hpp & json-schema-to-grammar.{cpp,h} to common
* json: indent 4 spaces
* json: fix naming of top-level c++ function (+ drop unused one)
* json: avoid using namespace std
* json: fix zig build
* Update server.feature
* json: iostream -> fprintf
* json: space before & refs for consistency
* json: nits
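The first item in the list, disallowing `[,1]`, comes down to where the comma sits in the array rule: if the separator is attached only to items after the first, a leading comma cannot be derived. The sketch below illustrates that idea and is not the converter from this commit. The helper `array_rule`, the rule names `space` and `integer`, and the exact shape of the emitted rule are assumptions made for illustration. The regular expression is only a stand-in to check the shape of the rule against a few inputs.

```python
import re


def array_rule(item: str) -> str:
    """Build a GBNF-style rule body for a JSON array of `item` (hypothetical helper).

    The comma is attached to every item after the first, so an array
    cannot start with a separator.
    """
    return f'"[" space ( {item} ( "," space {item} )* )? "]" space'


# Python regex with the same shape, restricted to non-negative integers.
# It exists only to exercise the rule shape; it is not part of any converter.
INT_ARRAY = re.compile(r"\[\s*(?:\d+(?:\s*,\s*\d+)*)?\s*\]")


def accepts(text: str) -> bool:
    """Return True if `text` is an integer array allowed by the rule shape."""
    return INT_ARRAY.fullmatch(text) is not None


if __name__ == "__main__":
    print(array_rule("integer"))
    for sample in ("[1, 2]", "[]", "[,1]", "[1,]"):
        print(sample, accepts(sample))
```

With this shape, `[1, 2]` and `[]` are accepted while `[,1]` and `[1,]` are rejected, because every comma must be followed by another item.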
1 parent 0fcf304 commit eb3a2d5

28 files changed, +7572 -3536 lines changed

‎.gitignore

Lines changed: 1 addition & 0 deletions
@@ -11,6 +11,7 @@
 *.gcda
 *.dot
 *.bat
+*.tmp
 *.metallib
 *.etag
 *.lastModified

‎Makefile

Lines changed: 9 additions & 1 deletion
@@ -9,7 +9,8 @@ TEST_TARGETS = \
 	tests/test-llama-grammar tests/test-grammar-parser tests/test-double-float tests/test-grad0 tests/test-opt \
 	tests/test-quantize-fns tests/test-quantize-perf tests/test-sampling tests/test-tokenizer-0-llama \
 	tests/test-tokenizer-0-falcon tests/test-tokenizer-1-llama tests/test-tokenizer-1-bpe tests/test-rope \
-	tests/test-backend-ops tests/test-model-load-cancel tests/test-autorelease
+	tests/test-backend-ops tests/test-model-load-cancel tests/test-autorelease \
+	tests/test-json-schema-to-grammar
 
 # Code coverage output files
 COV_TARGETS = *.gcno tests/*.gcno *.gcda tests/*.gcda *.gcov tests/*.gcov lcov-report gcovr-report
@@ -666,6 +667,9 @@ console.o: common/console.cpp common/console.h
 grammar-parser.o: common/grammar-parser.cpp common/grammar-parser.h
 	$(CXX) $(CXXFLAGS) -c $< -o $@
 
+json-schema-to-grammar.o: common/json-schema-to-grammar.cpp common/json-schema-to-grammar.h
+	$(CXX) $(CXXFLAGS) -c $< -o $@
+
 train.o: common/train.cpp common/train.h
 	$(CXX) $(CXXFLAGS) -c $< -o $@
 
@@ -871,6 +875,10 @@ tests/test-double-float: tests/test-double-float.cpp ggml.o $(OBJS)
 	$(CXX) $(CXXFLAGS) -c $< -o $(call GET_OBJ_FILE, $<)
 	$(CXX) $(CXXFLAGS) $(filter-out %.h $<,$^) $(call GET_OBJ_FILE, $<) -o $@ $(LDFLAGS)
 
+tests/test-json-schema-to-grammar: tests/test-json-schema-to-grammar.cpp json-schema-to-grammar.o ggml.o llama.o grammar-parser.o $(OBJS)
+	$(CXX) $(CXXFLAGS) -Iexamples/server -c $< -o $(call GET_OBJ_FILE, $<)
+	$(CXX) $(CXXFLAGS) $(filter-out %.h $<,$^) $(call GET_OBJ_FILE, $<) -o $@ $(LDFLAGS)
+
 tests/test-grad0: tests/test-grad0.cpp ggml.o $(OBJS)
 	$(CXX) $(CXXFLAGS) -c $< -o $(call GET_OBJ_FILE, $<)
 	$(CXX) $(CXXFLAGS) $(filter-out %.h $<,$^) $(call GET_OBJ_FILE, $<) -o $@ $(LDFLAGS)

‎build.zig

Lines changed: 2 additions & 1 deletion
@@ -122,6 +122,7 @@ pub fn build(b: *std.build.Builder) !void {
     const console = make.obj("console", "common/console.cpp");
     const sampling = make.obj("sampling", "common/sampling.cpp");
     const grammar_parser = make.obj("grammar-parser", "common/grammar-parser.cpp");
+    const json_schema_to_grammar = make.obj("json-schema-to-grammar", "common/json-schema-to-grammar.cpp");
     const train = make.obj("train", "common/train.cpp");
     const clip = make.obj("clip", "examples/llava/clip.cpp");
     const llava = make.obj("llava", "examples/llava/llava.cpp");
@@ -133,7 +134,7 @@ pub fn build(b: *std.build.Builder) !void {
     _ = make.exe("finetune", "examples/finetune/finetune.cpp", &.{ ggml, ggml_alloc, ggml_backend, ggml_quants, llama, unicode, common, buildinfo, train });
     _ = make.exe("train-text-from-scratch", "examples/train-text-from-scratch/train-text-from-scratch.cpp", &.{ ggml, ggml_alloc, ggml_backend, ggml_quants, llama, unicode, common, buildinfo, train });
 
-    const server = make.exe("server", "examples/server/server.cpp", &.{ ggml, ggml_alloc, ggml_backend, ggml_quants, llama, unicode, common, buildinfo, sampling, grammar_parser, clip, llava });
+    const server = make.exe("server", "examples/server/server.cpp", &.{ ggml, ggml_alloc, ggml_backend, ggml_quants, llama, unicode, common, buildinfo, sampling, grammar_parser, json_schema_to_grammar, clip, llava });
     if (server.target.isWindows()) {
         server.linkSystemLibrary("ws2_32");
     }

‎common/CMakeLists.txt

Lines changed: 3 additions & 0 deletions
@@ -47,6 +47,8 @@ if (BUILD_SHARED_LIBS)
     set_target_properties(${TARGET} PROPERTIES POSITION_INDEPENDENT_CODE ON)
 endif()
 
+set(TARGET json-schema-to-grammar)
+add_library(${TARGET} OBJECT json-schema-to-grammar.cpp json-schema-to-grammar.h)
 
 set(TARGET common)
 
@@ -60,6 +62,7 @@ add_library(${TARGET} STATIC
     console.cpp
     grammar-parser.h
     grammar-parser.cpp
+    json.hpp
     train.h
     train.cpp
     )
