Tags: duaneking/llama.cpp

master-924dd22

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature. The key has expired.
Quantized dot products for CUDA mul mat vec (ggml-org#2067)

master-051c70d

llama: Don't double count the sampling time (ggml-org#2107)

master-9e4475f

Fixed OpenCL offloading prints (ggml-org#2082)

master-f257fd2

Add an API example using server.cpp similar to OAI. (ggml-org#2009)

* add api_like_OAI.py
* add evaluated token count to server
* add /v1/ endpoints binding

master-ed9a54e

ggml : sync latest (new ops, macros, refactoring) (ggml-org#2106)

- add ggml_argmax()
- add ggml_tanh()
- add ggml_elu()
- refactor ggml_conv_1d() and variants
- refactor ggml_conv_2d() and variants
- add helper macros to reduce code duplication in ggml.c

master-acc111c

Allow old Make to build server. (ggml-org#2098)

Also make server build by default.

Tested with Make 3.82

master-23c7c6f

Update Makefile: clean simple (ggml-org#2097)

master-7f0e9a7

embd-input: Fix input embedding example unsigned int seed (ggml-org#2105)

master-7ee76e4

Simple webchat for server (ggml-org#1998)

* expose simple web interface on root domain

* embed index and add --path for choosing static dir

* allow server to multithread

Because web browsers send a lot of garbage requests, we want the server
to multithread when serving 404s for favicons etc. To avoid blowing up
llama, we just take a mutex when it's invoked.


* let's try this with the xxd tool instead and see if msvc is happier with that

* enable server in Makefiles

* add /completion.js file to make it easy to use the server from js

* slightly nicer css

* rework state management into session, expose historyTemplate to settings

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
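
The "allow server to multithread" note above describes a common pattern: let many threads handle cheap requests, but serialize access to the one resource that is not thread-safe. The following is only a minimal sketch of that idea, not the code from the commit. `FakeModel`, `Server`, `handle_static`, `handle_completion`, and `run_concurrent` are hypothetical names made up for illustration and are not part of the llama.cpp API.

```cpp
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for the model; NOT the real llama.cpp API.
// Its counter is deliberately not thread-safe on its own.
struct FakeModel {
    int calls = 0;

    std::string complete(const std::string &prompt) {
        ++calls;
        return prompt + " ...";
    }
};

// Hypothetical server: static/404 responses run without locking,
// while every model invocation is serialized behind one mutex.
struct Server {
    FakeModel model;
    std::mutex model_mutex;

    // Cheap request (e.g. a favicon lookup): never touches the model.
    int handle_static(const std::string &path) const {
        return path == "/" ? 200 : 404;
    }

    // Expensive request: take the mutex before invoking the model.
    std::string handle_completion(const std::string &prompt) {
        std::lock_guard<std::mutex> lock(model_mutex);
        return model.complete(prompt);
    }
};

// Fires n_threads concurrent requests and returns how many times
// the model was invoked once all threads have finished.
int run_concurrent(Server &server, int n_threads) {
    std::vector<std::thread> workers;
    for (int i = 0; i < n_threads; ++i) {
        workers.emplace_back([&server, i] {
            server.handle_static("/favicon.ico");
            server.handle_completion("prompt " + std::to_string(i));
        });
    }
    for (auto &t : workers) {
        t.join();
    }
    return server.model.calls;
}
```

In this sketch the 404 path stays lock-free, so stray browser requests are not queued behind a long-running completion, while the mutex ensures that only one thread is inside the model at a time.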

master-698efad

CI: make the brew update temporarily optional. (ggml-org#2092)

Until they decide to fix the brew installation in the macOS runners.
See the open issues, e.g. actions/runner-images#7710