
Bug Report: Embedding dimension mismatch for Jina-v3 and others (Returning 1/4 of expected dimensions) #8721

@walcz-de

Description

When using embedding models via the LocalAI OpenAI-compatible endpoint (specifically /v1/embeddings), the API returns vectors with only 1/4 of their native dimension if the dimensions parameter is omitted in the request.

This issue was primarily observed with jina-embeddings-v3, which returns 256 dimensions instead of its native 1024.

Crucial Observation: This is not limited to models that support Matryoshka Embeddings (like Jina-v3). We have seen similar behavior where LocalAI defaults to truncated output dimensions for other models that do not explicitly support this feature, leading to vector database errors (dimension mismatch) in client applications like AnythingLLM.

Steps to Reproduce
1. Start LocalAI with jina-embeddings-v3 (or another model like bge-base-en-v1.5) configured.
2. Send a POST request to /v1/embeddings without specifying the dimensions parameter.
3. Observe the length of the returned embedding vector.
Example Request:

{
  "model": "jina-embeddings-v3",
  "input": "This is a test sentence."
}
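The reproduction steps above can be scripted. The following is a minimal sketch, not part of the original report: the function names are hypothetical, the base URL in the usage comment (http://localhost:8080/v1) is an assumption about a local setup, and the response shape (data[0].embedding) is the standard OpenAI-style embeddings response.

```javascript
// Build the request body exactly as in the example: no `dimensions` field.
function buildEmbeddingRequest(model, input) {
  return { model, input };
}

// Length of the first embedding in an OpenAI-style embeddings response.
function embeddingLength(responseJson) {
  const first = responseJson?.data?.[0]?.embedding;
  if (!Array.isArray(first)) {
    throw new Error("response has no data[0].embedding array");
  }
  return first.length;
}

// POST the body to <baseURL>/embeddings and resolve to the vector length.
// `fetchImpl` is injectable so the function can be exercised without a server.
async function probeEmbeddingLength(baseURL, model, input, fetchImpl = fetch) {
  const response = await fetchImpl(`${baseURL}/embeddings`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(buildEmbeddingRequest(model, input)),
  });
  if (!response.ok) {
    throw new Error(`embeddings request failed: HTTP ${response.status}`);
  }
  return embeddingLength(await response.json());
}

// Usage (assumed local endpoint; not run automatically):
// probeEmbeddingLength("http://localhost:8080/v1", "jina-embeddings-v3",
//   "This is a test sentence.").then((n) => console.log("dimensions:", n));
```

If the behavior described in this report is present, the logged length for jina-embeddings-v3 would be 256 rather than 1024.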
Expected Behavior
The API should return the full native vector length of the model (e.g., 1024 for jina-embeddings-v3) unless a different dimension is explicitly requested. This is the standard behavior of the OpenAI API, which LocalAI aims to emulate.

Actual Behavior
The API returns a truncated vector (e.g., 256 dimensions for jina-embeddings-v3).
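A client can detect the mismatch before the vector reaches a store with a fixed dimension. This is a minimal sketch with hypothetical helper names (describeDimension, assertDimension); it only compares lengths and flags the exact one-quarter pattern reported here (256 × 4 = 1024).

```javascript
// Compare a returned vector against the dimension the collection expects.
function describeDimension(vector, expectedDim) {
  const received = vector.length;
  return {
    received,
    expected: expectedDim,
    matches: received === expectedDim,
    // The pattern from this report: exactly one quarter of the native size.
    isQuarter: received * 4 === expectedDim,
  };
}

// Throw early instead of letting the vector database reject the insert.
function assertDimension(vector, expectedDim) {
  const d = describeDimension(vector, expectedDim);
  if (!d.matches) {
    const hint = d.isQuarter ? " (exactly 1/4 of expected)" : "";
    throw new Error(
      `embedding has ${d.received} dimensions, expected ${d.expected}${hint}`
    );
  }
  return vector;
}
```

Calling assertDimension(vector, 1024) right after the embeddings request makes the failure point explicit rather than surfacing later as a database error.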

Environment
LocalAI Version: v3.12.1 (fcecc12)
Models tested:
jina-embeddings-v3 (Native: 1024, Received: 256)
[Enter other affected models here]
Client: Official openai JavaScript library (v4.x) / AnythingLLM
Additional Context & Investigation
In AnythingLLM, which uses the official openai Node.js SDK, we encountered persistent dimension mismatch errors. Upon investigation, we found:

SDK Behavior: The standard OpenAI SDK does not send a dimensions field by default if it's not explicitly provided by the user.
Incompatibility: While a real OpenAI endpoint returns the full vector, LocalAI seems to apply an internal default that triggers a "small" output (exactly native / 4) when the parameter is missing.
Non-Matryoshka Models: Even models without native Matryoshka support are affected by this truncation, which suggests a general issue in how the embedding wrapper handles default parameters.
Impact: This causes immediate failures in vector databases (Pinecone, Chroma, LanceDB, etc.) because the returned vector length does not match the collection's defined dimension.
Current Workaround in AnythingLLM: We had to bypass the official OpenAI SDK and implement a manual fetch call to ensure the dimensions parameter is always explicitly sent:

// Workaround to force full dimensions in LocalAI
const body = {
  model: "jina-embeddings-v3",
  input: chunk,
  dimensions: 1024, // Manually forcing 1024 fixes the issue
};

// POST directly to the embeddings endpoint instead of going through the SDK
const response = await fetch(`${baseURL}/embeddings`, {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify(body),
});
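The workaround hard-codes 1024 for one model. A slightly more general sketch keeps the native sizes in a lookup table and adds dimensions only when the caller did not set it. The table and the helper name withExplicitDimensions are hypothetical; only the jina-embeddings-v3 value comes from this report, and other models would have to be added by hand.

```javascript
// Hypothetical lookup: model name -> native embedding dimension.
// Only the jina-embeddings-v3 entry (1024) is taken from this report.
const NATIVE_DIMENSIONS = new Map([["jina-embeddings-v3", 1024]]);

// Return a copy of the request body that carries an explicit `dimensions`
// value whenever the model's native size is known. A value already set by
// the caller is left untouched.
function withExplicitDimensions(body, table = NATIVE_DIMENSIONS) {
  if (body.dimensions !== undefined) return { ...body };
  const native = table.get(body.model);
  return native === undefined ? { ...body } : { ...body, dimensions: native };
}
```

The result can be passed to JSON.stringify in place of the hand-built body. Unknown models pass through unchanged, so they would still show the truncated output described above.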
Suggestion
LocalAI should ensure that for all models, the default behavior of the /v1/embeddings endpoint is to return the full native dimension (standard OpenAI-compatible behavior) unless the user explicitly requests a different size. Clients using standard SDKs should not be required to know and send the native dimension just to get a valid full-length vector.
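The requested default can be stated as a small rule. The sketch below is illustrative only and is not LocalAI's actual code; resolveDimensions is a hypothetical name, and the range check on explicit values is an added assumption rather than something this report specifies.

```javascript
// Decide how many dimensions to return for one embeddings request.
// `requested` is the optional `dimensions` field from the request body;
// `native` is the model's full output size.
function resolveDimensions(requested, native) {
  // Parameter omitted: return the full native vector.
  if (requested === undefined || requested === null) return native;
  // Assumed validation: an explicit value must be an integer in 1..native.
  if (!Number.isInteger(requested) || requested < 1 || requested > native) {
    throw new RangeError(`dimensions must be an integer in 1..${native}`);
  }
  return requested;
}
```

Under this rule, a request without dimensions for jina-embeddings-v3 resolves to 1024, and only an explicit dimensions: 256 yields the shorter vector.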

Versions:
LocalAI v3.12.1 (fcecc12) (the issue has been present in all versions since 3.9.x)
AnythingLLM 1.11.0 (the issue has always been present)
