Closed
Description
I see llama.cpp is working on multi-modal models like LLaVA:
ggml-org/llama.cpp#3436
The model files are here (SHA-256 checksums):
2ab9be51b7dc737136b38093316a4d3577d1fb96281f1589adac7841f5b81c43 ../models/ggml-model-q5_k.gguf
b7c8ff0f58fca47d28ba92c4443adf8653f3349282cb8d9e6911f22d9b3814fe ../models/mmproj-model-f16.gguf
Testing:
$ mkdir build && cd build && cmake ..
$ cmake --build .
$ ./bin/llava -m ../models/ggml-model-q5_k.gguf --mmproj ../models/mmproj-model-f16.gguf --image ~/Desktop/Papers/figure-3-1.jpg
It appears to add some new parameters:
--mmproj MMPROJ_FILE path to a multimodal projector file for LLaVA. see examples/llava/README.md
--image IMAGE_FILE path to an image file. use with multimodal models
It would be awesome if we could support this in llama-cpp-python.
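As a rough illustration only, here is a minimal sketch of how a Python-side multimodal interface might pair an image with a text prompt using an OpenAI-style message list. The helper names (`image_to_data_uri`, `build_multimodal_messages`) and the message shape are assumptions for this sketch, not an existing llama-cpp-python API:

```python
import base64


def image_to_data_uri(image_bytes: bytes, mime: str = "image/jpeg") -> str:
    """Encode raw image bytes as a data URI, a common way to pass
    images through a chat-completion style API. (Hypothetical helper.)"""
    b64 = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime};base64,{b64}"


def build_multimodal_messages(prompt: str, image_bytes: bytes) -> list:
    """Build an OpenAI-style message list pairing an image with a text
    prompt, as a multimodal binding might accept. (Hypothetical helper.)"""
    return [
        {
            "role": "user",
            "content": [
                {
                    "type": "image_url",
                    "image_url": {"url": image_to_data_uri(image_bytes)},
                },
                {"type": "text", "text": prompt},
            ],
        }
    ]


if __name__ == "__main__":
    # Placeholder bytes; in practice this would be the contents of the
    # --image file passed to the CLI.
    messages = build_multimodal_messages("Describe this figure.", b"\x89PNG")
    print(messages[0]["content"][1]["text"])
```

The data-URI encoding keeps the whole request self-contained (no separate file upload path), which is why chat-completion style APIs commonly use it for images.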