Support for multi-modal models #813

Closed
@rlancemartin

Description


I see llama.cpp is working on multi-modal models like LLaVA:
ggml-org/llama.cpp#3436

The model files are here (SHA-256 checksums):

2ab9be51b7dc737136b38093316a4d3577d1fb96281f1589adac7841f5b81c43  ../models/ggml-model-q5_k.gguf
b7c8ff0f58fca47d28ba92c4443adf8653f3349282cb8d9e6911f22d9b3814fe  ../models/mmproj-model-f16.gguf

Testing:

$ mkdir build && cd build && cmake ..
$ cmake --build .
$ ./bin/llava -m ../models/ggml-model-q5_k.gguf --mmproj ../models/mmproj-model-f16.gguf --image ~/Desktop/Papers/figure-3-1.jpg

The llava example appears to add some new params:

--mmproj MMPROJ_FILE  path to a multimodal projector file for LLaVA. see examples/llava/README.md
--image IMAGE_FILE    path to an image file. use with multimodal models

It would be awesome if we could support this in llama-cpp-python.
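
For reference, a minimal sketch of what support could look like from the Python side, assuming the binding simply mirrors the two new CLI flags. The `mmproj_path` and `image_path` parameters below are hypothetical; nothing like them exists in llama-cpp-python yet, this is just the shape an API could take:

```python
from llama_cpp import Llama

# Hypothetical sketch: `mmproj_path` would mirror the llava example's
# --mmproj flag by loading the multimodal projector alongside the model.
llm = Llama(
    model_path="../models/ggml-model-q5_k.gguf",
    mmproj_path="../models/mmproj-model-f16.gguf",  # hypothetical parameter
)

# Hypothetical `image_path` parameter mirroring the --image flag,
# passed per request rather than at load time.
output = llm.create_completion(
    "Describe the figure in this image.",
    image_path="figure-3-1.jpg",  # hypothetical parameter
    max_tokens=256,
)
print(output["choices"][0]["text"])
```

Taking the projector as a constructor argument would keep the interface close to the existing `Llama` API, with images supplied per completion call.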

