Feature Request: add DeepSeek-v3 support #10981

Closed
@RodriMora


Prerequisites

  • I am running the latest code. Mention the version if possible as well.
  • Version b4391
  • I carefully followed the README.md.
  • I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
  • I reviewed the Discussions, and have a new and useful enhancement to share.

Feature Description

Add support for DeepSeek-v3

https://huggingface.co/deepseek-ai/DeepSeek-V3

Currently not supported:

ERROR:hf-to-gguf:Model DeepseekV3ForCausalLM is not supported
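For context, the error above suggests the converter rejects the checkpoint because it does not recognize the architecture name listed in the model's `config.json`. Here is a minimal sketch of that kind of check. The function name, the `SUPPORTED_ARCHITECTURES` set, and its single entry are hypothetical stand-ins for illustration, not the converter's actual code.

```python
import json

# Hypothetical set of architecture names a converter knows about.
# The single entry is illustrative; it is not the converter's real list.
SUPPORTED_ARCHITECTURES = {"DeepseekV2ForCausalLM"}


def find_unsupported(config_text: str, supported: set[str]) -> list[str]:
    """Return the architecture names in a config.json that are not in `supported`."""
    config = json.loads(config_text)
    return [name for name in config.get("architectures", []) if name not in supported]


# Minimal config.json content carrying the architecture named in the error above.
example_config = '{"architectures": ["DeepseekV3ForCausalLM"]}'

for name in find_unsupported(example_config, SUPPORTED_ARCHITECTURES):
    print(f"Model {name} is not supported")
```

If that is how the check works, adding support would at minimum mean registering the new architecture name, plus whatever tensor mapping changes v3 needs relative to v2.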

Motivation

DeepSeek-v3 is a large MoE model with 685B parameters. Support would be great, since most systems would have to offload it to RAM.
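To put the size in perspective, here is a rough back-of-the-envelope estimate of the weight storage at a few bit widths. It assumes every parameter is stored at the same width and ignores quantization metadata, KV cache, and activations, so real file sizes will differ. The helper name is made up for this sketch.

```python
def weight_size_gb(num_params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GB (1 GB = 1e9 bytes), ignoring format overhead."""
    return num_params * bits_per_weight / 8 / 1e9


NUM_PARAMS = 685e9  # parameter count quoted in this issue

for bits in (16, 8, 4):
    print(f"{bits}-bit: ~{weight_size_gb(NUM_PARAMS, bits):.1f} GB")
```

Under these assumptions the weights alone come to roughly 1370 GB at 16 bits and still about 342 GB at 4 bits, which is well beyond the VRAM of typical consumer GPUs.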

Possible Implementation

There is no model card or technical report yet. I don't know how different it is from v2.

Edit: they have uploaded the model card and paper:
https://github.com/deepseek-ai/DeepSeek-V3/blob/main/DeepSeek_V3.pdf
https://huggingface.co/deepseek-ai/DeepSeek-V3/blob/main/README.md


Labels: enhancement (New feature or request)
