@rasbt (Owner) commented Oct 11, 2025

Adds bonus materials to estimate the memory savings of using Grouped-Query Attention (GQA) over regular Multi-Head Attention (MHA).
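As a rough illustration of the kind of estimate described here (not the code added in this PR), the KV cache size can be approximated from the number of key/value heads that are stored per layer. The function name `kv_cache_bytes` and all model dimensions below are hypothetical values chosen for the example, and the formula assumes the cache holds one key and one value tensor per layer and ignores any other memory use.

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, context_len,
                   batch_size=1, bytes_per_elem=2):
    """Approximate KV cache size in bytes (hypothetical helper).

    The leading factor of 2 accounts for storing both keys and values.
    bytes_per_elem=2 assumes a 16-bit dtype such as bfloat16.
    """
    return (2 * batch_size * context_len * n_layers
            * n_kv_heads * head_dim * bytes_per_elem)


# Assumed example configuration, not taken from the PR
n_layers, n_heads, head_dim, context_len = 32, 32, 128, 8192
n_kv_groups = 8  # GQA: number of key/value heads shared across query heads

# MHA stores one key/value head per query head
mha_bytes = kv_cache_bytes(n_layers, n_heads, head_dim, context_len)
# GQA stores only n_kv_groups key/value heads
gqa_bytes = kv_cache_bytes(n_layers, n_kv_groups, head_dim, context_len)

print(f"MHA KV cache: {mha_bytes / 1024**3:.2f} GiB")
print(f"GQA KV cache: {gqa_bytes / 1024**3:.2f} GiB")
print(f"Savings factor: {mha_bytes / gqa_bytes:.1f}x")
```

Under these assumptions the savings factor is simply the ratio of query heads to key/value heads, since every other term in the product is unchanged.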

@rasbt merged commit c814814 into main on Oct 11, 2025
13 checks passed
@rasbt deleted the gqa branch on October 11, 2025 13:44