Bump llama.cpp #4

Merged · 113 commits · Mar 31, 2025
Commits
a20f13f
feat: Update llama.cpp
abetlen Aug 21, 2024
259ee15
feat: Update llama.cpp
abetlen Aug 22, 2024
82ae7f9
feat: Update llama.cpp
abetlen Aug 28, 2024
f70df82
feat: Add MiniCPMv26 chat handler.
abetlen Aug 29, 2024
e251a0b
fix: Update name to MiniCPMv26ChatHandler
abetlen Aug 29, 2024
c68e7fb
fix: pull all gh releases for self-hosted python index
abetlen Aug 29, 2024
97d527e
feat: Add server chat_format minicpm-v-2.6 for MiniCPMv26ChatHandler
abetlen Aug 29, 2024
b570fd3
docs: Add project icon courtesy of 🤗
abetlen Aug 29, 2024
cbbfad4
docs: center icon and resize
abetlen Aug 29, 2024
ad2deaf
docs: Add MiniCPM-V-2.6 to multi-modal model list
abetlen Aug 29, 2024
332720d
feat: Update llama.cpp
abetlen Aug 29, 2024
077ecb6
chore: Bump version
abetlen Aug 29, 2024
45001ac
misc(fix): Update CHANGELOG
abetlen Aug 29, 2024
4b1e364
docs: Update README
abetlen Aug 29, 2024
8b853c0
docs: Update README
abetlen Aug 29, 2024
9cba3b8
docs: Update README
abetlen Aug 29, 2024
d981d32
feat: Enable detokenizing special tokens with `special=True` (#1596)
benniekiss Aug 29, 2024
98eb092
fix: Use system message in og qwen format. Closes #1697
abetlen Aug 30, 2024
dcb0d0c
feat: Update llama.cpp
abetlen Aug 30, 2024
9769e57
feat: Update llama.cpp
abetlen Aug 31, 2024
c3fc80a
feat: Update llama.cpp
abetlen Sep 2, 2024
9497bcd
feat: Update llama.cpp
abetlen Sep 5, 2024
c032fc6
feat: Update llama.cpp
abetlen Sep 6, 2024
e529940
feat(ci): Speed up CI workflows using `uv`, add support for CUDA 12.5…
Smartappli Sep 18, 2024
a4e1451
chore(deps): bump pypa/cibuildwheel from 2.20.0 to 2.21.1 (#1743)
dependabot[bot] Sep 18, 2024
f8fcb3e
feat: Update sampling API for llama.cpp (#1742)
abetlen Sep 19, 2024
1e64664
feat: Update llama.cpp
abetlen Sep 19, 2024
9b64bb5
misc: Format
abetlen Sep 19, 2024
22cedad
fix: Fix memory allocation of ndarray (#1704)
xu-song Sep 19, 2024
29afcfd
fix: Don't store scores internally unless logits_all=True. Reduces me…
abetlen Sep 19, 2024
84c0920
feat: Add loading sharded GGUF files from HuggingFace with Llama.from…
Gnurro Sep 19, 2024
47d7a62
feat: Update llama.cpp
abetlen Sep 20, 2024
6c44a3f
feat: Add option to configure n_ubatch
abetlen Sep 20, 2024
49b1e73
docs: Add cuda 12.5 to README.md (#1750)
Smartappli Sep 20, 2024
1324c0c
chore(deps): bump actions/cache from 3 to 4 (#1751)
dependabot[bot] Sep 20, 2024
4744551
feat: Update llama.cpp
abetlen Sep 22, 2024
926b414
feat: Update llama.cpp
abetlen Sep 25, 2024
b3dfb42
chore: Bump version
abetlen Sep 25, 2024
8e07db0
fix: install build dependency
abetlen Sep 25, 2024
65222bc
fix: install build dependency
abetlen Sep 25, 2024
9992c50
fix: Fix speculative decoding
abetlen Sep 26, 2024
11d9562
misc: Rename all_text to remaining_text (#1658)
xu-song Sep 26, 2024
e975dab
fix: Additional fixes for speculative decoding
abetlen Sep 26, 2024
dca0c9a
feat: Update llama.cpp
abetlen Sep 26, 2024
01c7607
feat: Expose libggml in internal APIs (#1761)
abetlen Sep 26, 2024
57e70bb
feat: Update llama.cpp
abetlen Sep 29, 2024
7c4aead
chore: Bump version
abetlen Sep 29, 2024
7403e00
feat: Update llama.cpp
abetlen Oct 22, 2024
e712cff
feat: Update llama.cpp
abetlen Oct 31, 2024
cafa33e
feat: Update llama.cpp
abetlen Nov 15, 2024
d1cb50b
Add missing ggml dependency
abetlen Nov 16, 2024
2796f4e
Add all missing ggml dependencies
abetlen Nov 16, 2024
7ecdd94
chore: Bump version
abetlen Nov 16, 2024
f3fb90b
feat: Update llama.cpp
abetlen Nov 28, 2024
7ba257e
feat: Update llama.cpp
abetlen Dec 6, 2024
9d06e36
fix(ci): Explicitly install arm64 python version
abetlen Dec 6, 2024
fb0b8fe
fix(ci): Explicitly set cmake osx architecture
abetlen Dec 6, 2024
72ed7b8
fix(ci): Explicitly test on arm64 macos runner
abetlen Dec 6, 2024
8988aaf
fix(ci): Use macos-14 runner
abetlen Dec 6, 2024
f11a781
fix(ci): Use macos-13 runner
abetlen Dec 6, 2024
9a09fc7
fix(ci): Debug print python system architecture
abetlen Dec 6, 2024
a412ba5
fix(ci): Update config
abetlen Dec 6, 2024
df05096
fix(ci): Install with regular pip
abetlen Dec 6, 2024
1cd3f2c
fix(ci): gg
abetlen Dec 6, 2024
b34f200
fix(ci): Use python3
abetlen Dec 6, 2024
d8cc231
fix(ci): Use default architecture chosen by action
abetlen Dec 6, 2024
d5d5099
fix(ci): Update CMakeLists.txt for macos
abetlen Dec 6, 2024
4f17ae5
fix(ci): Remove cuda version 12.5.0 incompatibility with VS (#1838)
pabl-o-ce Dec 6, 2024
991d9cd
fix(ci): Remove CUDA 12.5 from index
abetlen Dec 6, 2024
2795303
chore(deps): bump pypa/cibuildwheel from 2.21.1 to 2.22.0 (#1844)
dependabot[bot] Dec 6, 2024
2523472
fix: Fix pickling of Llama class by setting seed from _seed member. C…
abetlen Dec 6, 2024
d553a54
Merge branch 'main' of github.com:abetlen/llama-cpp-python into main
abetlen Dec 6, 2024
ddac04c
chore(deps): bump conda-incubator/setup-miniconda from 3.0.4 to 3.1.0…
dependabot[bot] Dec 6, 2024
fa04cdc
fix logit-bias type hint (#1802)
ddh0 Dec 6, 2024
38fbd29
docs: Remove ref to llama_eval in llama_cpp.py docs (#1819)
richdougherty Dec 6, 2024
4192210
fix: make content not required in ChatCompletionRequestAssistantMessa…
feloy Dec 6, 2024
77a12a3
fix: Re-add suport for CUDA 12.5, add CUDA 12.6 (#1775)
Smartappli Dec 6, 2024
073b7e4
fix: added missing exit_stack.close() to /v1/chat/completions (#1796)
Ian321 Dec 6, 2024
9bd0c95
fix: Avoid thread starvation on many concurrent requests by making us…
gjpower Dec 6, 2024
1ea6154
fix(docs): Update development instructions (#1833)
Florents-Tselai Dec 6, 2024
d610477
fix(examples): Refactor Batching notebook to use new sampler chain AP…
lukestanley Dec 6, 2024
4f0ec65
fix: chat API logprobs format (#1788)
domdomegg Dec 6, 2024
df136cb
misc: Update development Makefile
abetlen Dec 6, 2024
6889429
Merge branch 'main' of github.com:abetlen/llama-cpp-python into main
abetlen Dec 6, 2024
b9b50e5
misc: Update run server command
abetlen Dec 6, 2024
5585f8a
feat: Update llama.cpp
abetlen Dec 9, 2024
61508c2
Add CUDA 12.5 and 12.6 to generated output wheels
abetlen Dec 9, 2024
a9fe0f8
chore: Bump version
abetlen Dec 9, 2024
ca80802
fix(ci): hotfix for wheels
abetlen Dec 9, 2024
002f583
chore: Bump version
abetlen Dec 9, 2024
ea4d86a
fix(ci): update macos runner image to non-deprecated version
abetlen Dec 9, 2024
afedfc8
fix: add missing await statements for async exit_stack handling (#1858)
gjpower Dec 9, 2024
801a73a
feat: Update llama.cpp
abetlen Dec 9, 2024
803924b
chore: Bump version
abetlen Dec 9, 2024
2bc1d97
feat: Update llama.cpp
abetlen Dec 19, 2024
c9dfad4
feat: Update llama.cpp
abetlen Dec 30, 2024
1d5f534
feat: Update llama.cpp
abetlen Jan 8, 2025
e8f14ce
fix: streaming resource lock (#1879)
gjpower Jan 8, 2025
0580cf2
chore: Bump version
abetlen Jan 8, 2025
80be68a
feat: Update llama.cpp
abetlen Jan 29, 2025
0b89fe4
feat: Update llama.cpp
abetlen Jan 29, 2025
14879c7
fix(ci): Fix the CUDA workflow (#1894)
oobabooga Jan 29, 2025
4442ff8
fix: error showing time spent in llama perf context print (#1898)
shakalaca Jan 29, 2025
710e19a
chore: Bump version
abetlen Jan 29, 2025
0a8f97d
Merge branch 'main' into experiment_bump_llama_cpp
tc-wolf Mar 12, 2025
70d1048
Fix for List typehint
tc-wolf Mar 12, 2025
15bf3e8
Update state functions + formatting
tc-wolf Mar 14, 2025
e5cccf4
Fixup reloading
tc-wolf Mar 17, 2025
c9bf03a
Fix some logic
tc-wolf Mar 17, 2025
5de50b9
Update tests for cache
tc-wolf Mar 17, 2025
68d081d
Update Dockerfile
tc-wolf Mar 18, 2025
aff151d
Update Makefile
tc-wolf Mar 18, 2025
6235674
Remove unnecessary (wrong) check
tc-wolf Mar 18, 2025
52 changes: 42 additions & 10 deletions in .github/workflows/build-and-release.yaml
@@ -11,7 +11,7 @@ jobs:
runs-on: ${{ matrix.os }}
strategy:
matrix:
os: [ubuntu-20.04, windows-2019, macos-12]
os: [ubuntu-20.04, windows-2019, macos-13]

steps:
- uses: actions/checkout@v4
@@ -21,15 +21,28 @@ jobs:
# Used to host cibuildwheel
- uses: actions/setup-python@v5
with:
python-version: "3.8"
python-version: "3.9"

- name: Install dependencies
- name: Install dependencies (Linux/MacOS)
if: runner.os != 'Windows'
run: |
python -m pip install --upgrade pip
python -m pip install -e .[all]
python -m pip install uv
RUST_LOG=trace python -m uv pip install -e .[all] --verbose
shell: bash

- name: Install dependencies (Windows)
if: runner.os == 'Windows'
env:
RUST_LOG: trace
run: |
python -m pip install --upgrade pip
python -m pip install uv
python -m uv pip install -e .[all] --verbose
shell: cmd

- name: Build wheels
uses: pypa/cibuildwheel@v2.20.0
uses: pypa/cibuildwheel@v2.22.0
env:
# disable repair
CIBW_REPAIR_WHEEL_COMMAND: ""
@@ -56,7 +69,7 @@ jobs:
platforms: linux/arm64

- name: Build wheels
uses: pypa/cibuildwheel@v2.20.0
uses: pypa/cibuildwheel@v2.22.0
env:
CIBW_SKIP: "*musllinux* pp*"
CIBW_REPAIR_WHEEL_COMMAND: ""
@@ -79,16 +92,35 @@ jobs:
- uses: actions/checkout@v4
with:
submodules: "recursive"

- uses: actions/setup-python@v5
with:
python-version: "3.8"
- name: Install dependencies
python-version: "3.9"

- name: Install dependencies (Linux/MacOS)
if: runner.os != 'Windows'
run: |
python -m pip install --upgrade pip build
python -m pip install -e .[all]
python -m pip install --upgrade pip
python -m pip install uv
RUST_LOG=trace python -m uv pip install -e .[all] --verbose
python -m uv pip install build
shell: bash

- name: Install dependencies (Windows)
if: runner.os == 'Windows'
env:
RUST_LOG: trace
run: |
python -m pip install --upgrade pip
python -m pip install uv
python -m uv pip install -e .[all] --verbose
python -m uv pip install build
shell: cmd

- name: Build source distribution
run: |
python -m build --sdist

- uses: actions/upload-artifact@v4
with:
name: sdist
10 changes: 4 additions & 6 deletions in .github/workflows/build-wheels-cuda.yaml
@@ -22,7 +22,7 @@ jobs:
$matrix = @{
'os' = @('ubuntu-latest', 'windows-2019')
'pyver' = @("3.9", "3.10", "3.11", "3.12")
'cuda' = @("12.1.1", "12.2.2", "12.3.2", "12.4.1")
'cuda' = @("12.1.1", "12.2.2", "12.3.2", "12.4.1") #, "12.5.1", "12.6.1")
'releasetag' = @("basic")
}

@@ -59,20 +59,18 @@ jobs:
cache: 'pip'

- name: Setup Mamba
uses: conda-incubator/setup-miniconda@v3.0.4
uses: conda-incubator/setup-miniconda@v3.1.0
with:
activate-environment: "build"
activate-environment: "llamacpp"
python-version: ${{ matrix.pyver }}
miniforge-variant: Mambaforge
miniforge-version: latest
use-mamba: true
add-pip-as-python-dependency: true
auto-activate-base: false

- name: VS Integration Cache
id: vs-integration-cache
if: runner.os == 'Windows'
uses: actions/cache@v4.0.2
uses: actions/cache@v4
with:
path: ./MSBuildExtensions
key: cuda-${{ matrix.cuda }}-vs-integration
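The CUDA wheel matrix above multiplies out quickly, since every (os, python, cuda) combination becomes its own build job. A back-of-envelope sketch, with counts taken from the matrix as it stands in this diff:

```shell
# Back-of-envelope check of the CUDA build matrix: one wheel job per
# (os, python, cuda) combination.
os_count=2      # ubuntu-latest, windows-2019
pyver_count=4   # 3.9, 3.10, 3.11, 3.12
cuda_count=4    # 12.1.1, 12.2.2, 12.3.2, 12.4.1
job_count=$((os_count * pyver_count * cuda_count))
echo "CUDA wheel jobs per release: $job_count"   # prints 32
```

Re-enabling the commented-out 12.5.1 and 12.6.1 entries would raise this to 48 jobs per release tag.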
21 changes: 17 additions & 4 deletions in .github/workflows/build-wheels-metal.yaml
@@ -11,7 +11,7 @@ jobs:
runs-on: ${{ matrix.os }}
strategy:
matrix:
os: [macos-12, macos-13, macos-14]
os: [macos-13, macos-14, macos-15]

steps:
- uses: actions/checkout@v4
@@ -23,14 +23,27 @@
with:
python-version: "3.12"
cache: 'pip'

- name: Install dependencies (Linux/MacOS)
if: runner.os != 'Windows'
run: |
python -m pip install --upgrade pip
python -m pip install uv
RUST_LOG=trace python -m uv pip install -e .[all] --verbose
shell: bash

- name: Install dependencies
- name: Install dependencies (Windows)
if: runner.os == 'Windows'
env:
RUST_LOG: trace
run: |
python -m pip install --upgrade pip
python -m pip install -e .[all]
python -m pip install uv
python -m uv pip install -e .[all] --verbose
shell: cmd

- name: Build wheels
uses: pypa/cibuildwheel@v2.20.0
uses: pypa/cibuildwheel@v2.22.0
env:
# disable repair
CIBW_REPAIR_WHEEL_COMMAND: ""
5 changes: 5 additions & 0 deletions in .github/workflows/generate-index-from-release.yaml
@@ -35,12 +35,17 @@ jobs:
- name: Setup Pages
uses: actions/configure-pages@v5
- name: Build
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
run: |
./scripts/get-releases.sh
./scripts/releases-to-pep-503.sh index/whl/cpu '^[v]?[0-9]+\.[0-9]+\.[0-9]+$'
./scripts/releases-to-pep-503.sh index/whl/cu121 '^[v]?[0-9]+\.[0-9]+\.[0-9]+-cu121$'
./scripts/releases-to-pep-503.sh index/whl/cu122 '^[v]?[0-9]+\.[0-9]+\.[0-9]+-cu122$'
./scripts/releases-to-pep-503.sh index/whl/cu123 '^[v]?[0-9]+\.[0-9]+\.[0-9]+-cu123$'
./scripts/releases-to-pep-503.sh index/whl/cu124 '^[v]?[0-9]+\.[0-9]+\.[0-9]+-cu124$'
# ./scripts/releases-to-pep-503.sh index/whl/cu125 '^[v]?[0-9]+\.[0-9]+\.[0-9]+-cu124$'
# ./scripts/releases-to-pep-503.sh index/whl/cu126 '^[v]?[0-9]+\.[0-9]+\.[0-9]+-cu124$'
./scripts/releases-to-pep-503.sh index/whl/metal '^[v]?[0-9]+\.[0-9]+\.[0-9]+-metal$'
- name: Upload artifact
uses: actions/upload-pages-artifact@v3
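Each `releases-to-pep-503.sh` invocation above pairs an index path with a tag regex. The filtering behavior can be sketched with `grep -E` and some made-up tag names (these are illustrative, not real releases):

```shell
# Sketch: how one of the tag regexes above selects release tags for
# its per-variant wheel index. Tag names are hypothetical.
tags='v0.3.2
v0.3.2-cu121
v0.3.2-cu124
v0.3.2-metal
nightly-cu124'
matches=$(printf '%s\n' "$tags" | grep -E '^[v]?[0-9]+\.[0-9]+\.[0-9]+-cu124$')
echo "cu124 index gets: $matches"   # only v0.3.2-cu124 survives
```

Plain `vX.Y.Z` tags fall through to the CPU index pattern, and anything without a full semver prefix (like `nightly-cu124`) is ignored entirely.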
24 changes: 21 additions & 3 deletions in .github/workflows/publish-to-test.yaml
@@ -19,24 +19,42 @@ jobs:
- uses: actions/checkout@v4
with:
submodules: "recursive"

- name: Set up Python
uses: actions/setup-python@v5
with:
python-version: "3.11"
cache: 'pip'

- name: Append Dev Version to __version__
run: |
DEV_VERSION=${{ github.event.inputs.dev_version }}
CURRENT_VERSION=$(awk -F= '/__version__ =/ {print $2}' llama_cpp/__init__.py | tr -d ' "')
NEW_VERSION="${CURRENT_VERSION}.dev${DEV_VERSION}"
sed -i 's/__version__ = \".*\"/__version__ = \"'"${NEW_VERSION}"'\"/' llama_cpp/__init__.py
- name: Install dependencies

- name: Install dependencies (Linux/MacOS)
if: runner.os != 'Windows'
run: |
python -m pip install --upgrade pip build
python -m pip install -e .[all]
python -m pip install --upgrade pip
python -m pip install uv
RUST_LOG=trace python -m uv pip install -e .[all] --verbose
shell: bash

- name: Install dependencies (Windows)
if: runner.os == 'Windows'
env:
RUST_LOG: trace
run: |
python -m pip install --upgrade pip
python -m pip install uv
python -m uv pip install -e .[all] --verbose
shell: cmd

- name: Build source distribution
run: |
python -m build --sdist

- name: Publish to Test PyPI
uses: pypa/gh-action-pypi-publish@release/v1
with:
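The "Append Dev Version" step above rewrites `__version__` in place with awk and sed. A sketch of the same logic run against a throwaway file instead of the real `llama_cpp/__init__.py` (the version number shown is hypothetical):

```shell
# Sketch of the workflow's dev-version append, applied to a temp file.
tmp=$(mktemp)
echo '__version__ = "0.3.2"' > "$tmp"

DEV_VERSION=4
CURRENT_VERSION=$(awk -F= '/__version__ =/ {print $2}' "$tmp" | tr -d ' "')
NEW_VERSION="${CURRENT_VERSION}.dev${DEV_VERSION}"
RESULT=$(sed 's/__version__ = ".*"/__version__ = "'"${NEW_VERSION}"'"/' "$tmp")
rm -f "$tmp"

echo "$RESULT"   # prints: __version__ = "0.3.2.dev4"
```

The awk pass extracts the quoted version, `tr -d ' "'` strips the spaces and quotes, and sed writes back the `.devN`-suffixed string, matching the PEP 440 dev-release format Test PyPI expects.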
25 changes: 22 additions & 3 deletions in .github/workflows/publish.yaml
@@ -13,17 +13,36 @@ jobs:
- uses: actions/checkout@v4
with:
submodules: "recursive"

- name: Set up Python
uses: actions/setup-python@v5
with:
python-version: "3.9"
- name: Install dependencies

- name: Install dependencies (Linux/MacOS)
if: runner.os != 'Windows'
run: |
python -m pip install --upgrade pip
python -m pip install uv
RUST_LOG=trace python -m uv pip install -e .[all] --verbose
python -m uv pip install build
shell: bash

- name: Install dependencies (Windows)
if: runner.os == 'Windows'
env:
RUST_LOG: trace
run: |
python -m pip install --upgrade pip build
python -m pip install -e .[all]
python -m pip install --upgrade pip
python -m pip install uv
python -m uv pip install -e .[all] --verbose
python -m uv pip install build
shell: cmd

- name: Build source distribution
run: |
python -m build --sdist

- name: Publish distribution to PyPI
# TODO: move to tag based releases
# if: startsWith(github.ref, 'refs/tags')
59 changes: 52 additions & 7 deletions in .github/workflows/test-pypi.yaml
@@ -16,10 +16,25 @@ jobs:
with:
python-version: ${{ matrix.python-version }}
cache: 'pip'
- name: Install dependencies

- name: Install dependencies (Linux/MacOS)
if: runner.os != 'Windows'
run: |
python -m pip install --upgrade pip
python -m pip install uv
RUST_LOG=trace python -m uv pip install llama-cpp-python[all] --verbose
shell: bash

- name: Install dependencies (Windows)
if: runner.os == 'Windows'
env:
RUST_LOG: trace
run: |
python -m pip install --upgrade pip
python -m pip install --verbose llama-cpp-python[all]
python -m pip install uv
python -m uv pip install llama-cpp-python[all] --verbose
shell: cmd

- name: Test with pytest
run: |
python -c "import llama_cpp"
@@ -37,10 +52,25 @@ jobs:
with:
python-version: ${{ matrix.python-version }}
cache: 'pip'
- name: Install dependencies

- name: Install dependencies (Linux/MacOS)
if: runner.os != 'Windows'
run: |
python -m pip install --upgrade pip
python -m pip install uv
RUST_LOG=trace python -m uv pip install llama-cpp-python[all] --verbose
shell: bash

- name: Install dependencies (Windows)
if: runner.os == 'Windows'
env:
RUST_LOG: trace
run: |
python -m pip install --upgrade pip
python -m pip install --verbose llama-cpp-python[all]
python -m pip install uv
python -m uv pip install llama-cpp-python[all] --verbose
shell: cmd

- name: Test with pytest
run: |
python -c "import llama_cpp"
@@ -57,11 +87,26 @@ jobs:
uses: actions/setup-python@v5
with:
python-version: ${{ matrix.python-version }}
cache: 'pip'
- name: Install dependencies
cache: 'pip'

- name: Install dependencies (Linux/MacOS)
if: runner.os != 'Windows'
run: |
python -m pip install --upgrade pip
python -m pip install uv
RUST_LOG=trace python -m uv pip install llama-cpp-python[all] --verbose
shell: bash

- name: Install dependencies (Windows)
if: runner.os == 'Windows'
env:
RUST_LOG: trace
run: |
python -m pip install --upgrade pip
python -m pip install --verbose llama-cpp-python[all]
python -m pip install uv
python -m uv pip install llama-cpp-python[all] --verbose
shell: cmd

- name: Test with pytest
run: |
python -c "import llama_cpp"