Issues: ggml-org/llama.cpp
server : separate the notion of position and KV tokens, remove prompt truncation
breaking change
Changes that break ABIs, APIs, file formats, or other forms of backwards compatibility.
examples
python
python script changes
server
#13576
opened May 15, 2025 by
ngxson
sycl : reviewing the backend documentation
documentation
Improvements or additions to documentation
examples
SYCL
https://en.wikipedia.org/wiki/SYCL - GPU programming language
#13544
opened May 14, 2025 by
Alcpz
webui: Add editing assistant messages (#11849)
examples
server
#13522
opened May 14, 2025 by
lr1729
feat(server): Add tool call support to WebUI (LLama Server)
examples
server
#13501
opened May 13, 2025 by
samolego
Break down main function in llama-server
examples
server
#13425
opened May 10, 2025 by
ericcurtin
Add mistral-chat-7b preset for llama-server
examples
#13348
opened May 7, 2025 by
vahedshaik
mtmd : add vision support for llama 4
documentation
Improvements or additions to documentation
examples
help wanted
Extra attention is needed
python
python script changes
#13282
opened May 3, 2025 by
ngxson
Support start strings, the opposite of stop tokens.
examples
python
python script changes
server
#13214
opened Apr 30, 2025 by
matteoserva
Draft
Support jinja extra template kwargs (Qwen3 enable_thinking feature), from command line and from client
examples
server
#13196
opened Apr 29, 2025 by
matteoserva
kv-cache : add SWA support
examples
server
#13194
opened Apr 29, 2025 by
ggerganov
15 of 22 tasks
llama : try loading tensors with pre-computed hashes
Apple Metal
https://en.wikipedia.org/wiki/Metal_(API)
examples
ggml
changes relating to the ggml tensor library for machine learning
Kompute
https://github.com/KomputeProject/kompute/
Nvidia GPU
Issues specific to Nvidia GPUs
SYCL
https://en.wikipedia.org/wiki/SYCL - GPU programming language
Vulkan
Issues specific to the Vulkan backend
#13106
opened Apr 25, 2025 by
rgerganov
Update README.md for tts example to use afplay on MacOS
examples
#13056
opened Apr 22, 2025 by
maxxam1221
threading: support for GGML_SCHED_PRIO_LOW, update thread info on Windows to avoid throttling
examples
ggml
changes relating to the ggml tensor library for machine learning
#12995
opened Apr 17, 2025 by
max-krasnyansky
set b = ub when b > ub with embedding
examples
server
#12940
opened Apr 14, 2025 by
ahmedshakill
imatrix: add option to display importance score statistics for a given imatrix file
examples
#12718
opened Apr 2, 2025 by
EAddario