We use sgemm to run matmuls for every query.
We can cut the memory footprint and bandwidth of those matmuls in half by supporting float16, which MKL and OpenBLAS both support.
Requirements:
- We need to be able to toggle this behavior, since we should expect some accuracy loss from the reduced precision.
- We need to decide where float16 computation is allowed. e.g. Are we casting our stored embeddings to float16 instead of keeping them in float32? At what point in the pipeline does that cast happen?
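A minimal sketch of one answer to the second requirement: cast the stored embeddings once at index-build time behind a toggle, so each query pays only a cheap query-vector cast. This is a NumPy illustration with hypothetical names (`build_index`, `score`); it shows the toggle and the halved storage, not the actual MKL/OpenBLAS float16 gemm path.

```python
import numpy as np

def build_index(embeddings: np.ndarray, use_f16: bool) -> np.ndarray:
    # Cast once when the index is built so per-query work stays cheap.
    return embeddings.astype(np.float16 if use_f16 else np.float32)

def score(index: np.ndarray, query: np.ndarray) -> np.ndarray:
    # Match the query dtype to the stored embeddings; the matmul then
    # runs in half precision when the toggle is on.
    return index @ query.astype(index.dtype)

rng = np.random.default_rng(0)
emb = rng.standard_normal((1000, 64)).astype(np.float32)
q = rng.standard_normal(64).astype(np.float32)

idx32 = build_index(emb, use_f16=False)
idx16 = build_index(emb, use_f16=True)
s32 = score(idx32, q)
s16 = score(idx16, q)

print(idx16.nbytes, idx32.nbytes)   # float16 index is half the size
print(np.max(np.abs(s32 - s16.astype(np.float32))))  # precision gap
```

The max-absolute-difference printout is one way to quantify the accuracy loss the first requirement asks us to budget for before enabling the toggle by default.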