Commit History
Commits on Apr 1, 2024
feat: add support for KV cache quantization options (#1307)
Commit f165048
Commits on Mar 31, 2024
feat: Add logprobs support to chat completions (#1311)
Commit aa9f1ae
Commits on Mar 23, 2024
Commits on Mar 19, 2024
Commits on Mar 9, 2024
feat: Add endpoints for tokenize, detokenize and count tokens (#1136)
Commit c139f8b
Commits on Feb 28, 2024
Commits on Feb 26, 2024
feat(server): Add support for pulling models from Huggingface Hub (#1222)
Commit 4d574bd
Commits on Feb 25, 2024
Commits on Feb 17, 2024
Commits on Feb 15, 2024
fix: Use '\n' seperator for EventSourceResponse (#1188)
Commit ea1f88d
Commits on Feb 8, 2024
feat: Integrate functionary v1.4 and v2 models + add custom tokenizer support to Llama class (#1078)
Commit 9018270
Commits on Jan 31, 2024
Add speculative decoding (#1120)
Commit fb762a6
Commits on Jan 29, 2024
Automatically set chat format from gguf (#1110)
Commit da003d8
Commits on Jan 25, 2024
Commits on Jan 21, 2024
Commits on Jan 19, 2024
Commits on Jan 18, 2024
Commits on Jan 16, 2024
Commits on Jan 15, 2024
Implement GGUF metadata KV overrides (#1011)
Commit 76aafa6
Commits on Dec 22, 2023
server: Support none defaulting to infinity for completions (#111)
Commit 4b01a87
[Feat] Multi model support (#931)
Commit 12b7f2f
Commits on Dec 21, 2023
Commits on Dec 18, 2023
Bugfix: Remove f16_kv, add offload_kqv field (#1019)
Commit 62944df