Bump llama.cpp from 2347e45 to 254a7a7 #103


Closed
dependabot[bot] wants to merge 1 commit into master from dependabot/submodules/llama.cpp-254a7a7

Conversation

dependabot[bot] (Contributor) commented on behalf of GitHub on Jun 14, 2023

Bumps llama.cpp from 2347e45 to 254a7a7.


Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


Dependabot commands and options

You can trigger Dependabot actions by commenting on this PR:

  • @dependabot rebase will rebase this PR
  • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
  • @dependabot merge will merge this PR after your CI passes on it
  • @dependabot squash and merge will squash and merge this PR after your CI passes on it
  • @dependabot cancel merge will cancel a previously requested merge and block automerging
  • @dependabot reopen will reopen this PR if it is closed
  • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
  • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
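
For context, Dependabot opens bumps like this one because the repository tracks llama.cpp as a git submodule (the branch name dependabot/submodules/llama.cpp-254a7a7 reflects that). A minimal sketch of the kind of `.github/dependabot.yml` entry that produces such PRs; the repository's actual config may differ:

```yaml
# Sketch of a submodule-tracking Dependabot config; the real file may differ.
version: 2
updates:
  - package-ecosystem: "gitsubmodule"  # watch git submodules such as llama.cpp
    directory: "/"                     # .gitmodules lives at the repo root
    schedule:
      interval: "daily"                # check upstream for new commits daily
```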

Bumps [llama.cpp](https://github.com/ggerganov/llama.cpp) from `2347e45` to `254a7a7`.
- [Release notes](https://github.com/ggerganov/llama.cpp/releases)
- [Commits](ggml-org/llama.cpp@2347e45...254a7a7)

---
updated-dependencies:
- dependency-name: llama.cpp
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
dependabot[bot] added the dependencies label (Pull requests that update a dependency file) on Jun 14, 2023
deadprogram (Contributor) left a comment


This is the commit from llama.cpp that brings in CUDA support: ggml-org/llama.cpp#1827

Let's merge this!

mudler (Member) commented on Jun 15, 2023

Yes, definitely! The problem is, I gave it a try yesterday, but there still seem to be issues. The current master works (no full offloading), but bringing this in seems to break it entirely.

I'm testing this on a box with a GPU provided by a community member (as I don't have one to try it myself), and it seems to fail to offload to the GPU. @deadprogram, did you already try it on your GPU?

mudler (Member) commented on Jun 15, 2023

For reference, here is the error when running the example:

ggml_init_cublas: found 2 CUDA devices:
  Device 0: Tesla T4                                                                                      
  Device 1: Tesla T4     
llama.cpp: loading model from /home/ubuntu/WizardLM-7B-uncensored.ggmlv3.q4_0.bin
llama_model_load_internal: format     = ggjt v3 (latest)
llama_model_load_internal: n_vocab    = 32001
llama_model_load_internal: n_ctx      = 128 
llama_model_load_internal: n_embd     = 4096                                                              
llama_model_load_internal: n_mult     = 256
llama_model_load_internal: n_head     = 32      
llama_model_load_internal: n_layer    = 32
llama_model_load_internal: n_rot      = 128                                                               
llama_model_load_internal: ftype      = 2 (mostly Q4_0)
llama_model_load_internal: n_ff       = 11008                                                             
llama_model_load_internal: n_parts    = 1
llama_model_load_internal: model size = 7B                                                                
Model loaded successfully. 
>>> w                                                                                                     
                                                     
Sending w

LLAMA_ASSERT: /home/ubuntu/go-llama.cpp/llama.cpp/llama.cpp:1372: !!kv_self.ctx
SIGABRT: abort
PC=0x7f2d1026ea7c m=0 sigcode=18446744073709551610
signal arrived during cgo execution

goroutine 1 [syscall]:
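
For anyone trying to reproduce this, the failing example boils down to loading the model through the Go bindings with GPU offload enabled and sending a prompt. A minimal sketch, assuming go-llama.cpp option names of this era (llama.SetContext, llama.SetGPULayers, llama.SetTokens; treat them as illustrative rather than the exact API at this commit):

```go
// Minimal reproduction sketch; option names are assumptions about the
// bindings' API around this change, not a definitive implementation.
package main

import (
	"fmt"
	"log"

	llama "github.com/go-skynet/go-llama.cpp"
)

func main() {
	// Load the 7B model with a 128-token context and layers offloaded to the
	// GPU, mirroring the n_ctx = 128 seen in the log above.
	l, err := llama.New(
		"/home/ubuntu/WizardLM-7B-uncensored.ggmlv3.q4_0.bin",
		llama.SetContext(128),
		llama.SetGPULayers(32), // offload all 32 layers to the Tesla T4s
	)
	if err != nil {
		log.Fatalf("load failed: %v", err)
	}

	// With the broken bump, evaluation aborts around here: llama_eval hits
	// LLAMA_ASSERT(!!kv_self.ctx) because the KV cache context is nil.
	out, err := l.Predict("w", llama.SetTokens(16))
	if err != nil {
		log.Fatalf("predict failed: %v", err)
	}
	fmt.Println(out)
}
```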

dependabot[bot] (Contributor, Author) commented on behalf of GitHub on Jun 15, 2023

Superseded by #104.

dependabot[bot] closed this on Jun 15, 2023
dependabot[bot] deleted the dependabot/submodules/llama.cpp-254a7a7 branch on June 15, 2023 at 19:03
Labels: dependencies (Pull requests that update a dependency file)