[pull] main from abetlen:main #1
Open
pull[bot] wants to merge 1180 commits into Freed-Wu/llama-cpp-python:main
Commits
This pull request is big! We're only showing the most recent 250 commits
Commits on May 13, 2024
Commits on May 14, 2024
chore(deps): bump pypa/cibuildwheel from 2.17.0 to 2.18.0 (#1453) [4b54f79]
misc: Remove unnecessary metadata lookups (#1448) [389e09c]
feat: add MinTokensLogitProcessor and min_tokens argument to server (#1333) [5212fb0]
Commits on May 16, 2024
Commits on May 17, 2024
Commits on May 18, 2024
Commits on May 22, 2024
Commits on May 24, 2024
feat: Improve Llama.eval performance by avoiding list conversion (#1476) [5cae104]
Commits on May 27, 2024
chore(deps): bump pypa/cibuildwheel from 2.18.0 to 2.18.1 (#1472) [c564007]
Commits on May 29, 2024
Commits on Jun 1, 2024
Commits on Jun 3, 2024
Commits on Jun 4, 2024
fix: Disable Windows+CUDA workaround when compiling for HIPBLAS (#1493) [ae5682f]
fix: Avoid duplicate special tokens in chat formats (#1439) [027f7bc]
fix: fix logprobs when BOS is not present (#1471) [6e0642c]
feat: adding `rpc_servers` parameter to `Llama` class (#1477) [d634efc]
Commits on Jun 7, 2024
Commits on Jun 9, 2024
Commits on Jun 10, 2024
Commits on Jun 13, 2024
feat: Support SPM infill (#1492) [dbcf64c]
feat: Add `.close()` method to `Llama` class to explicitly free model from memory (#1513) [320a5d7]
chore(deps): bump pypa/cibuildwheel from 2.18.1 to 2.19.0 (#1522) [5af8163]
feat: Update workflows and pre-built wheels (#1416) [9e396b3]
Commits on Jun 17, 2024
Commits on Jun 19, 2024
Commits on Jun 20, 2024
Commits on Jun 21, 2024
chore(deps): bump docker/build-push-action from 5 to 6 (#1539) [398fe81]
chore(deps): bump pypa/cibuildwheel from 2.18.1 to 2.19.1 (#1527) [35c980e]
Commits on Jul 2, 2024
fix: Copy dependencies for windows [dc20e8c]
Commits on Jul 4, 2024
Commits on Jul 6, 2024
Commits on Jul 9, 2024
chore(deps): bump pypa/cibuildwheel from 2.19.1 to 2.19.2 (#1568) [14760c6]
chore(deps): bump microsoft/setup-msbuild from 1.1 to 1.3 (#1569) [e31f096]
feat(ci): Dockerfile update base images and post-install cleanup (#1530) [b77e507]
feat(ci): Update simple Dockerfile (#1459) [f7f4fa8]
Commits on Jul 17, 2024
fix(server): Use split_mode from model settings (#1594) [66d5cdd]
fix(docs): Update README.md typo (#1589) [797f54c]
Commits on Jul 18, 2024
fix: Change repeat_penalty to 1.0 to match llama.cpp defaults (#1590) [0700476]
Commits on Jul 20, 2024
chore(deps): bump microsoft/setup-msbuild from 1.3 to 2 (#1585) [f95057a]
Commits on Jul 22, 2024
Commits on Jul 24, 2024
Commits on Jul 28, 2024
Commits on Jul 31, 2024
Commits on Aug 4, 2024
fix: llama_grammar_accept_token arg order (#1649) [5575fed]
Commits on Aug 7, 2024
feat: Ported back new grammar changes from C++ to Python implementation (#1637) [dff186c]
feat: Add more detailed log for prefix-match (#1659) [e966f3b]
chore(deps): bump pypa/cibuildwheel from 2.19.2 to 2.20.0 (#1657) [131db40]
feat: Enable recursive search of HFFS.ls when using `from_pretrained` (#1656) [5e39a85]
Commits on Aug 8, 2024
Commits on Aug 10, 2024
Commits on Aug 12, 2024
fix: only print 'cache saved' in verbose mode (#1668) [9bab46f]
Commits on Aug 13, 2024
Commits on Aug 15, 2024
Commits on Aug 16, 2024
Commits on Aug 19, 2024
Commits on Aug 21, 2024
Commits on Aug 22, 2024
Commits on Aug 28, 2024
Commits on Aug 29, 2024
feat: Enable detokenizing special tokens with `special=True` (#1596) [d981d32]
Commits on Aug 30, 2024
Commits on Aug 31, 2024
Commits on Sep 2, 2024
Commits on Sep 5, 2024
Commits on Sep 6, 2024
Commits on Sep 18, 2024
feat(ci): Speed up CI workflows using `uv`, add support for CUDA 12.5 wheels [e529940]
chore(deps): bump pypa/cibuildwheel from 2.20.0 to 2.21.1 (#1743) [a4e1451]
Commits on Sep 19, 2024
feat: Update sampling API for llama.cpp (#1742) [f8fcb3e]
fix: Fix memory allocation of ndarray (#1704) [22cedad]
feat: Add loading sharded GGUF files from HuggingFace with Llama.from_pretrained(additional_files=[...]). Closes #1341 [84c0920]
Commits on Sep 20, 2024
docs: Add cuda 12.5 to README.md (#1750) [49b1e73]
chore(deps): bump actions/cache from 3 to 4 (#1751) [1324c0c]
Commits on Sep 22, 2024
Commits on Sep 25, 2024
Commits on Sep 26, 2024
misc: Rename all_text to remaining_text (#1658) [11d9562]
feat: Expose libggml in internal APIs (#1761) [01c7607]
Commits on Sep 29, 2024
Commits on Oct 22, 2024
Commits on Oct 31, 2024
Commits on Nov 15, 2024
Commits on Nov 16, 2024
Commits on Nov 28, 2024
Commits on Dec 6, 2024
fix(ci): Remove cuda version 12.5.0 incompatibility with VS (#1838) [4f17ae5]
chore(deps): bump pypa/cibuildwheel from 2.21.1 to 2.22.0 (#1844) [2795303]
chore(deps): bump conda-incubator/setup-miniconda from 3.0.4 to 3.1.0 (#1821) [ddac04c]
fix logit-bias type hint (#1802) [fa04cdc]
docs: Remove ref to llama_eval in llama_cpp.py docs (#1819) [38fbd29]
fix: Re-add suport for CUDA 12.5, add CUDA 12.6 (#1775) [77a12a3]
fix: added missing exit_stack.close() to /v1/chat/completions (#1796) [073b7e4]
fix: Avoid thread starvation on many concurrent requests by making use of asyncio to lock llama_proxy context (#1798) [9bd0c95]
fix(examples): Refactor Batching notebook to use new sampler chain API (#1793) [d610477]
fix: chat API logprobs format (#1788) [4f0ec65]
Commits on Dec 9, 2024
Commits on Dec 19, 2024
Commits on Dec 30, 2024
Commits on Jan 8, 2025
fix: streaming resource lock (#1879) [e8f14ce]
Commits on Jan 29, 2025
fix: error showing time spent in llama perf context print (#1898) [4442ff8]
Commits on Mar 12, 2025
Commits on Apr 11, 2025
Commits on May 8, 2025