Add support for llama.cpp's --tensor-split parameter #460


Merged
2 commits merged into abetlen:main on Jul 14, 2023

Conversation

@shouyiwang commented Jul 9, 2023

The current llama-cpp-python does not include support for llama.cpp's --tensor-split parameter. When running a large model across two GPUs, it currently splits the model half-and-half by default. This approach presents problems: for example, when a user has two GPUs with different VRAM sizes, it can lead to out-of-memory (OOM) errors. Implementing the --tensor-split parameter addresses this by letting users define the proportion of the model placed on each GPU.
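As a minimal usage sketch (not taken from this PR): assuming the feature is exposed as a `tensor_split` argument on the `Llama` constructor, mirroring llama.cpp's --tensor-split, it might be used like this. The model path, layer count, and split ratios below are hypothetical example values.

```python
from llama_cpp import Llama

# Hypothetical example: two GPUs with unequal VRAM (e.g. 24 GB and 12 GB),
# so roughly two thirds of the tensors go to GPU 0 and one third to GPU 1.
llm = Llama(
    model_path="./models/llama-2-70b.ggmlv3.q4_0.bin",  # example path
    n_gpu_layers=83,            # example: offload all layers to the GPUs
    tensor_split=[0.67, 0.33],  # proportion of the model per GPU
)
```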

I'm uncertain whether importing ctypes into llama.py is the most appropriate approach, but I'm currently unsure of an alternative in llama_cpp.py. I would greatly appreciate any advice or suggestions on this.
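For context, a minimal sketch (not necessarily how this PR implements it) of why ctypes comes up: the Python-side list of split ratios has to be converted into the fixed-size C float array that llama.cpp's context parameters expect. The helper name and the `max_devices` default below are hypothetical; llama.cpp defines the real device limit as LLAMA_MAX_DEVICES.

```python
import ctypes

def make_tensor_split_array(tensor_split, max_devices=16):
    """Convert a Python list of per-GPU proportions into a ctypes float array.

    `max_devices` stands in for llama.cpp's LLAMA_MAX_DEVICES; unused slots
    are left at 0.0, which llama.cpp treats as "no tensors on this device".
    """
    FloatArray = ctypes.c_float * max_devices
    arr = FloatArray(*([0.0] * max_devices))
    for i, value in enumerate(tensor_split[:max_devices]):
        arr[i] = float(value)
    return arr
```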

Tested thoroughly with text-generation-webui. I'll submit a PR there after this PR gets merged. Thanks!

@shouyiwang (Author)

Hi @abetlen ,
I just wanted to kindly draw your attention to this PR that I submitted 5 days ago. It would be great if you could review it when you have some time. I am available to make any necessary changes or answer any questions you might have.

Thank you for your time and consideration.

@abetlen (Owner) commented Jul 14, 2023

@shouyiwang thank you for the contribution, lgtm

@abetlen abetlen merged commit 82b11c8 into abetlen:main Jul 14, 2023
@shouyiwang (Author)

@abetlen Thank you so much!!
