llama : add llama_batch_ext #11875

Status: Open — wants to merge 61 commits into master from xsn/private_batch_api

Commits (61)
- 4ed4fe7 first proposal for private llama_batch (ngxson, Feb 13, 2025)
- f2e59a8 rework, targeting llama-server (ngxson, Feb 14, 2025)
- 17d3658 move to llama_batch_ext (ngxson, Feb 15, 2025)
- 85ef80c server : use llama_batch_ext (ngxson, Feb 15, 2025)
- aed4a8e fix server (ngxson, Feb 16, 2025)
- 4bf7ca3 llama_decode_ext (ngxson, Feb 24, 2025)
- a1b1dea Merge branch 'master' into xsn/private_batch_api (ngxson, Feb 24, 2025)
- f0ffd81 adapt common (ngxson, Mar 1, 2025)
- 9e75c49 Merge branch 'master' into xsn/private_batch_api (ngxson, Mar 1, 2025)
- 40989f4 correct llama_decode_ext (ngxson, Mar 1, 2025)
- 1170135 llama_batch_ext_add_text (ngxson, Mar 1, 2025)
- 1d6ba97 remove token_info API (ngxson, Mar 1, 2025)
- 46596ca apply various in places (ngxson, Mar 1, 2025)
- 17f954c Merge branch 'master' into xsn/private_batch_api (ngxson, Mar 13, 2025)
- 86973cb fix merge errors (ngxson, Mar 13, 2025)
- 4aabf4e return output ID from llama_batch_ext_add/set (ngxson, Mar 13, 2025)
- 47086fa apply to the rest (ngxson, Mar 13, 2025)
- 9fb2d81 fix common_batch missing seq_id (ngxson, Mar 13, 2025)
- 65f0184 compile ok (ngxson, Mar 13, 2025)
- c3dd790 fix llama_batch_ext_init_from_text (ngxson, Mar 13, 2025)
- 04f8641 rm redundant llama_batch_ext_set_output_last (ngxson, Mar 13, 2025)
- 54566ad correct comment (ngxson, Mar 13, 2025)
- bfdddbc bring back mistakenly deleted llama_batch_init/free (ngxson, Mar 13, 2025)
- 5e6a6d4 fix llama-run n_past (ngxson, Mar 14, 2025)
- 3294036 fix gemma3-cli (ngxson, Mar 14, 2025)
- 07d84fa fix missing n_past in various places (ngxson, Mar 14, 2025)
- ba79369 fix llama_batch_ext_init_from_embd (ngxson, Mar 14, 2025)
- a363251 qwen2vl: use llama_batch_ext_set_pos (ngxson, Mar 14, 2025)
- 8e7714f fix compile (ngxson, Mar 14, 2025)
- eaffba0 llama_batch_ext_ptr::from_text/embd (ngxson, Mar 14, 2025)
- 116b9a1 rename to init_from_text (ngxson, Mar 14, 2025)
- 624a683 fix compile (ngxson, Mar 14, 2025)
- de788e0 Update examples/tts/tts.cpp (ngxson, Mar 17, 2025)
- eab5606 Apply suggestions from code review (ngxson, Mar 17, 2025)
- dc4bb64 Merge branch 'master' into xsn/private_batch_api (ngxson, Mar 18, 2025)
- 7a3c178 speculative : adapt to new llama API (ggerganov, Mar 18, 2025)
- 23d7407 Merge pull request #15 from ggml-org/xsn/private_batch_api (ngxson, Mar 19, 2025)
- b0db7fc android : adapt to new API (ggerganov, Mar 19, 2025)
- 96ca6e8 swift : adapt to new API (ggerganov, Mar 19, 2025)
- 32c2c41 android : fix permission (ngxson, Mar 19, 2025)
- 6f54ee6 retrieval : avoid common_batch (ggerganov, Mar 19, 2025)
- 8b80d68 embedding : avoid common_batch (ggerganov, Mar 19, 2025)
- 76fd7d6 perplexity : avoid common_batch (ggerganov, Mar 20, 2025)
- 8a23b4a server : avoid common_batch (ggerganov, Mar 20, 2025)
- b8b1732 server : remove old commented code [no ci] (ggerganov, Mar 20, 2025)
- bd51d63 Merge pull request #16 from ggml-org/xsn/private_batch_api_pooling_none (ngxson, Mar 20, 2025)
- 30f1db9 remove C API llama_batch_ext_init_from_text (ngxson, Mar 20, 2025)
- c5a0176 Merge branch 'master' into xsn/private_batch_api (ngxson, Mar 21, 2025)
- 2134cab add cpp batch.add_text wrapper (ngxson, Mar 21, 2025)
- 2cec1cf move various places to batch.add_text (ngxson, Mar 21, 2025)
- 3802ff2 add batch.clear() and batch.n_tokens() (ngxson, Mar 21, 2025)
- e8827a6 Merge branch 'master' into xsn/private_batch_api (ngxson, Mar 23, 2025)
- a9efdbb qwen2vl: fix mrope position (ngxson, Mar 23, 2025)
- 1434c2c Merge branch 'master' into xsn/private_batch_api (ngxson, Mar 25, 2025)
- d18a79e llama_batch_ext_init with ctx (ngxson, Mar 25, 2025)
- c4fea7f fix qwzn2vl mrope position input (ngxson, Mar 25, 2025)
- 42062cc fix build (ngxson, Mar 25, 2025)
- 56e82d0 fix server (ngxson, Mar 25, 2025)
- 50fb396 server: fix batch_spec (ngxson, Mar 25, 2025)
- 8ec0ff9 fix embeddings and retrieval (ngxson, Mar 27, 2025)
- c1f4a78 correct output_id for llama-cpp header (ngxson, Mar 27, 2025)
Viewing commit 17f954c8e284b8a76b584b56e426bb76cd9e0079 — Merge branch 'master' into xsn/private_batch_api (ngxson, committed Mar 13, 2025)
examples/cvector-generator/cvector-generator.cpp (1 addition, 1 deletion):

```diff
@@ -342,7 +342,7 @@ static bool cb_eval(struct ggml_tensor * t, bool ask, void * user_data) {
 }

 static bool get_hidden_layers(llama_context * ctx, std::vector<llama_token> & tokens) {
-    llama_kv_cache_clear(ctx);
+    llama_kv_self_clear(ctx);
     llama_batch_ext_ptr batch(llama_batch_ext_init_from_text(tokens.data(), tokens.size(), 0, 0));
     if (llama_decode_ext(ctx, batch.get())) {
         fprintf(stderr, "%s : failed to eval\n", __func__);
```