ggml : reintegrate the AMX backend into the CPU backend #10359

Closed
@ggerganov

Description


As explained in #10343 (comment), we would like to keep the CPU implementations inside the CPU backend. The AMX backend was created mainly because, at the time, we didn't support runtime weight repacking. Now that this functionality is supported, we should merge the AMX backend into the CPU backend.

The rough plan to achieve that is outlined here: #10350 (reply in thread)

The plan to reintegrate the AMX backend would be to create a new buffer type that converts the weights to the layout that the AMX backend needs, and then check the buffer type in the matrix multiplication to determine whether the AMX matrix multiplication code should be used. This essentially extends what is done in #9921 for the aarch64 types.
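
To make the plan above concrete, here is a minimal, self-contained sketch of the idea: a buffer type that repacks weights when they are stored, and a matrix multiplication that checks the buffer type to choose its code path. Everything in it is an assumption for illustration only: `BufferType`, `Tensor`, `TILE`, `make_plain_weights`, `make_repacked_weights`, and `mul_mat_vec` are hypothetical names, not ggml API, and the column-tiled layout is a stand-in for whatever layout the AMX kernels actually require.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical tag for the buffer type that owns a tensor's data.
enum class BufferType { Plain, Repacked };

// Hypothetical, heavily simplified tensor: a 2D float matrix plus its buffer type.
struct Tensor {
    BufferType buft;
    std::size_t rows;
    std::size_t cols;
    std::vector<float> data;
};

// Illustrative tile width; a real kernel would pick this to match the hardware.
constexpr std::size_t TILE = 4;

// Default buffer type: weights are stored row-major, unchanged.
Tensor make_plain_weights(const std::vector<float>& src, std::size_t rows, std::size_t cols) {
    assert(src.size() == rows * cols);
    return Tensor{BufferType::Plain, rows, cols, src};
}

// Repacking buffer type: weights are converted once, at store time,
// into column tiles (all rows of columns [c0, c0 + TILE), then the next tile).
Tensor make_repacked_weights(const std::vector<float>& src, std::size_t rows, std::size_t cols) {
    assert(src.size() == rows * cols);
    Tensor t{BufferType::Repacked, rows, cols, std::vector<float>(rows * cols)};
    std::size_t k = 0;
    for (std::size_t c0 = 0; c0 < cols; c0 += TILE) {
        const std::size_t c1 = std::min(c0 + TILE, cols);
        for (std::size_t r = 0; r < rows; ++r) {
            for (std::size_t c = c0; c < c1; ++c) {
                t.data[k++] = src[r * cols + c];
            }
        }
    }
    return t;
}

// y = W * x. The buffer type of W decides which code path runs.
std::vector<float> mul_mat_vec(const Tensor& w, const std::vector<float>& x) {
    assert(x.size() == w.cols);
    std::vector<float> y(w.rows, 0.0f);
    if (w.buft == BufferType::Repacked) {
        // Specialized path: reads the tiled layout sequentially.
        std::size_t k = 0;
        for (std::size_t c0 = 0; c0 < w.cols; c0 += TILE) {
            const std::size_t c1 = std::min(c0 + TILE, w.cols);
            for (std::size_t r = 0; r < w.rows; ++r) {
                for (std::size_t c = c0; c < c1; ++c) {
                    y[r] += w.data[k++] * x[c];
                }
            }
        }
    } else {
        // Generic path: plain row-major layout.
        for (std::size_t r = 0; r < w.rows; ++r) {
            for (std::size_t c = 0; c < w.cols; ++c) {
                y[r] += w.data[r * w.cols + c] * x[c];
            }
        }
    }
    return y;
}
```

The point of the design is that the layout conversion happens once, when the weights are placed in the buffer, and the matrix multiplication only needs to inspect the buffer type to know which layout it is reading. Both paths compute the same product.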

