First of all: CONGRATS ON YOUR AMAZING RESEARCH WORK.
Considering that this uses GGML and seems based directly on llama.cpp:

- Why is this a separate project from llama.cpp, given that llama.cpp already supports BitNet ternary quants (ggml-org/llama.cpp#8151)?
- Are these simply more optimised kernels?
- If so, how do they compare to llama.cpp's implementation?
- Can/should they be contributed back to llama.cpp?