
What if you could run a fast quantized LLM and get performance similar to an unquantized model requiring four times the resources? That's exactly what the new NexaQuant models deliver.

Based on the popular DeepSeek R1, the NexaQuant distilled models deliver reasoning capabilities close to, and in some cases even exceeding, those of their unquantized R1-distill sources.

Get started now with Nexa SDK:

DeepSeek R1 Distill Llama 8B: `nexa run DeepSeek-R1-Distill-Llama-8B-NexaQuant:q4_0`

DeepSeek R1 Distill Qwen 1.5B: `nexa run DeepSeek-R1-Distill-Qwen-1.5B-NexaQuant:q4_0`
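
If you'd rather load the q4_0 GGUF file directly instead of going through the Nexa CLI, here's a minimal sketch using llama-cpp-python. The local file name is an assumption on my part; point `model_path` at wherever you actually downloaded the quantized model.

```python
# Minimal sketch: run a NexaQuant q4_0 GGUF with llama-cpp-python.
# The file name below is hypothetical -- substitute your own download path.
from llama_cpp import Llama

llm = Llama(
    model_path="DeepSeek-R1-Distill-Llama-8B-NexaQuant-q4_0.gguf",  # hypothetical path
    n_ctx=4096,  # context window; raise it for longer reasoning chains
)

# Generate a completion from the quantized model.
output = llm(
    "Explain briefly why 4-bit quantization shrinks an LLM's memory footprint.",
    max_tokens=256,
    temperature=0.6,
)
print(output["choices"][0]["text"])
```

Since q4_0 stores weights in roughly 4 bits each, the 8B model should fit comfortably in a fraction of the memory the full-precision checkpoint needs, which is the whole point of the "four times the resources" comparison above.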

* Disclaimer: I'm not affiliated with Nexa AI. Information presented is unofficial and subject to change without notice.
