REST API to index and query documents in a vector database: k-Nearest Neighbor (kNN) search over embeddings. The API is containerized with Docker (or Podman).
Task objective: develop a REST API that allows users to index and query their documents within a vector database. The API is containerized in a Docker container.
- Chunk: A piece of text with an associated embedding and metadata.
- Document: Made of multiple chunks and metadata.
- Library: Made of a list of documents and metadata.
- CRUD libraries — create, read, update, delete.
- CRUD documents and chunks within a library.
- Index the contents of a library.
- k-Nearest Neighbor vector search over the selected library with a given embedding query.
```mermaid
flowchart LR
  subgraph Client
    A[HTTP Client]
  end
  subgraph API
    B[FastAPI routes]
  end
  subgraph Services
    C[LibraryService]
    D[DocumentService]
    E[ChunkService]
    F[SearchService]
    G[IndexService]
  end
  subgraph Data
    H[(Repositories)]
    I[(Index Registry)]
  end
  A --> B
  B --> C
  B --> D
  B --> E
  B --> F
  B --> G
  C --> H
  D --> H
  E --> H
  F --> H
  F --> I
  G --> H
  G --> I
```
- Request → API (routes) → Services (business logic) → Repositories (in-memory store) and Index Registry (kNN indexes).
- Reads use a shared read lock; writes use an exclusive write lock so there are no data races.
- Pydantic models with a fixed schema: no user-defined metadata fields.
- Chunk: `id`, `text`, `embedding` (list of floats), `created_at`, `name` (optional), `document_id`.
- Document: `id`, `name`, `created_at`, `chunk_ids`, `library_id`.
- Library: `id`, `name`, `created_at`, `document_ids`.
- All IDs are UUIDs generated by the application.
- Brute-force: Linear scan; compare query to every vector. Build: O(n·d), Query: O(n·d), Space: O(n·d). Baseline, exact k-NN.
- KD-Tree: Tree over vectors (median split). Build: O(n log n · d), Query: O(log n) typical, Space: O(n·d). Exact k-NN; good for moderate dimensions.
- IVF: K-means clusters; search in nearest cluster(s). Build: O(n·d·C·I), Query: O(C·d + m·d), Space: O(n·d). Approximate k-NN for larger n.
Only numpy is used for math (norms, etc.). No chroma-db, pinecone, FAISS, etc.
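For illustration, the brute-force baseline can be written with numpy alone (Euclidean distance; the function name is illustrative):

```python
import numpy as np

def brute_force_knn(vectors: np.ndarray, query: np.ndarray, k: int):
    """Exact k-NN by linear scan: O(n*d) per query.

    vectors: (n, d) array of stored embeddings
    query:   (d,) query embedding
    Returns the k nearest (index, distance) pairs, closest first.
    """
    dists = np.linalg.norm(vectors - query, axis=1)  # distance to every vector
    top = np.argsort(dists)[:k]                      # indices of k smallest
    return [(int(i), float(dists[i])) for i in top]
```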
- Reader-writer lock: Reads (get, list, search) hold a shared read lock; writes (create, update, delete, build index) hold an exclusive write lock.
- Design: one lock around repositories and index registry; simple and correct for a single-process, in-memory API.
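A reader-writer lock of this kind can be sketched with the standard library (a simple sketch without writer priority; the real implementation may differ):

```python
import threading
from contextlib import contextmanager

class RWLock:
    """Shared reads, exclusive writes: a writer waits until all readers drain."""

    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0
        self._writer = False

    @contextmanager
    def read(self):
        with self._cond:
            while self._writer:          # block while a write is in progress
                self._cond.wait()
            self._readers += 1
        try:
            yield
        finally:
            with self._cond:
                self._readers -= 1
                self._cond.notify_all()  # wake a waiting writer

    @contextmanager
    def write(self):
        with self._cond:
            while self._writer or self._readers:  # wait for exclusivity
                self._cond.wait()
            self._writer = True
        try:
            yield
        finally:
            with self._cond:
                self._writer = False
                self._cond.notify_all()
```

Reads (`get`, `list`, `search`) enter `with lock.read():`; writes (`create`, `update`, `delete`, index builds) enter `with lock.write():`.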
- Services implement the logic: Library, Document, Chunk, Search, Index.
- Repositories do CRUD only (in-memory); Services orchestrate them and keep relationships consistent (e.g. `document_ids` on Library, `chunk_ids` on Document, cascade deletes).
- FastAPI routes call Services; routes are thin (parse request, call service, map to HTTP status).
- Status codes: 200, 201, 204, 404, 409, 422 via `fastapi.status` (no hardcoded numbers).
- REST: `POST`/`GET`/`PUT`/`DELETE` for libraries, documents, chunks; `POST /libraries/{id}/index` to build an index; `POST /libraries/{id}/search` with body `{"embedding": [...], "k": N}` for k-NN search.
- Dockerfile (multi-stage): build with `uv`, runtime with uvicorn. Works with Docker and Podman.
- docker-compose (or podman-compose): one command to run the API (and optional UI). No need to run anything on the host except the containers.
Prerequisites: Docker or Podman, and (optional) uv for local dev/tests.
```shell
# From project root
docker compose up
# or: podman compose up
```
- API: http://localhost:8000
- Interactive docs: http://localhost:8000/docs
Optional: copy `env.example` to `.env` and set `COHERE_API_KEY` if you use Cohere for embeddings. The assignment suggested a Cohere API key for creating test embeddings, but manually created chunks suffice to test the system.
Tests (in container):
```shell
docker build --target test -t vector-db-api-test .
docker run --rm vector-db-api-test
# or with podman
```
- No chroma-db, pinecone, FAISS, or similar; indexing algorithms are implemented with numpy only.
- No document processing pipeline required; manually created chunks are enough to test the system.
- API backend: Python + FastAPI + Pydantic
- Dependency management: uv (`pyproject.toml`, `uv sync`)
- Containers: Dockerfile + docker-compose (Podman-compatible)
The following were not required by the task; they are implemented and documented below with Mermaid diagrams.
Embeddings are resolved by URI. The registry chooses the backend; invalid or missing HF_TOKEN does not break the flow (fallback to unauthenticated Hub access).
```mermaid
flowchart LR
  subgraph Input
    T[Text chunks]
    Q[Query text]
    Im[Images]
  end
  subgraph Registry["get_embedder(uri)"]
    R[Embedder Registry]
  end
  subgraph Backends
    C[Cohere API\ncohere://]
    S[Sentence Transformers\nembedding_transformer://...]
    I[CLIP / ViT\nembedding_image://...]
  end
  subgraph Output
    V[Vectors]
  end
  T --> R
  Q --> R
  Im --> R
  R --> C
  R --> S
  R --> I
  C --> V
  S --> V
  I --> V
```
- `cohere://` — Remote API; requires `COHERE_API_KEY`. Used for text (indexing and search).
- `embedding_transformer://MODEL` — Sentence Transformers (e.g. `all-MiniLM-L6-v2`); runs in-process. Text only. Optional `HF_TOKEN` for Hub rate limits.
- `embedding_image://MODEL` — CLIP/ViT for image bytes; text embedders do not support images.
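The registry dispatch can be sketched as a scheme-to-factory table (hypothetical names; the backends are stubbed here so the sketch stays self-contained and runnable):

```python
# Hypothetical registry mapping URI scheme -> backend factory. The real
# backends (Cohere client, SentenceTransformer, CLIP) register the same way.
_BACKENDS = {}

def register(scheme: str):
    def decorator(factory):
        _BACKENDS[scheme] = factory
        return factory
    return decorator

@register("embedding_transformer")
def _make_transformer(model: str):
    # Placeholder: the real code would load a Sentence Transformers model here.
    return {"backend": "sentence_transformers", "model": model}

@register("cohere")
def _make_cohere(model: str):
    # Placeholder: the real code would create a Cohere API client here.
    return {"backend": "cohere", "model": model or "default"}

def get_embedder(uri: str):
    """Resolve an embedder from a URI such as
    'embedding_transformer://all-MiniLM-L6-v2'."""
    scheme, sep, rest = uri.partition("://")
    if not sep or scheme not in _BACKENDS:
        raise ValueError(f"unknown embedder URI: {uri!r}")
    return _BACKENDS[scheme](rest)
```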
The React UI runs in its own container (port 80 → 3000). The browser loads the UI and then sends requests to the API (port 8000). CORS is enabled on the API so the browser allows cross-origin requests.
```mermaid
flowchart TB
  subgraph Browser
    UI[React UI\nlocalhost:3000]
  end
  subgraph API_Server["API (localhost:8000)"]
    CORS[CORS middleware]
    Lib[POST/GET /libraries]
    Ingest[POST /libraries/.../ingest-pdf]
    Index[POST /libraries/.../index]
    Search[POST /libraries/.../search/by-query]
  end
  UI -->|fetch, JSON/form-data| CORS
  CORS --> Lib
  CORS --> Ingest
  CORS --> Index
  CORS --> Search
```
- UI (React, nginx in container): lists libraries, opens library detail, uploads PDF, builds index, runs search by text.
- API (FastAPI): serves REST endpoints; `allow_origins=["*"]` so any origin (e.g. `http://localhost:3000`) can call the API.
- `VITE_API_URL` is set at UI build time so the frontend knows the API base URL (e.g. `http://localhost:8000`).
The client sends text; the server embeds it with the chosen embedder, runs k-NN, then enriches results with chunk text and name for display.
```mermaid
sequenceDiagram
  participant Client
  participant API
  participant Registry
  participant Embedder
  participant Search
  participant ChunkRepo
  Client->>API: POST /libraries/{id}/search/by-query<br/>{ query, k, embedder }
  API->>Registry: get_embedder(embedder)
  Registry-->>API: embedder instance
  API->>Embedder: embed_queries([query])
  Embedder-->>API: query_embedding
  API->>Search: search(library_id, query_embedding, k)
  Search-->>API: [(chunk_id, distance), ...]
  loop For each result
    API->>ChunkRepo: get(chunk_id)
    ChunkRepo-->>API: chunk (text, name)
  end
  API-->>Client: { results: [{ chunk_id, distance, text, name }, ...] }
```
- The same embedder (and therefore the same embedding dimension) must be used at query time as at indexing time.
- Chunk text and name are attached so the UI can show the matching snippet without extra calls.
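The sequence above can be sketched as one orchestration function (all collaborators are injected and hypothetical; the real wiring lives in the route and services):

```python
def search_by_query(library_id, query, k, embedder_uri, *,
                    get_embedder, search_service, chunk_repo):
    """Sketch of the search-by-query flow: embed, search, enrich."""
    embedder = get_embedder(embedder_uri)                 # Registry step
    query_embedding = embedder.embed_queries([query])[0]  # Embedder step
    hits = search_service.search(library_id, query_embedding, k)
    results = []
    for chunk_id, distance in hits:                       # enrichment loop
        chunk = chunk_repo.get(chunk_id)                  # fetch text/name
        results.append({"chunk_id": chunk_id, "distance": distance,
                        "text": chunk.text, "name": chunk.name})
    return {"results": results}
```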
Three k-NN indexes are implemented (numpy only). Each library has one index; build it via `POST /libraries/{id}/index` with `algorithm`: `brute_force`, `kd_tree`, or `ivf`.
```mermaid
flowchart TB
  subgraph Build["Build index"]
    V[(chunk_id, embedding)]
    V --> BF
    V --> KD
    V --> IVF
    BF["Brute-force<br/>Store all vectors"]
    KD["KD-Tree<br/>Median split, tree"]
    IVF["IVF<br/>K-means clusters"]
  end
  subgraph Query["Query: k-NN"]
    Q["query_embedding + k"]
    Q --> BFQ["Brute: scan all<br/>O(n·d)"]
    Q --> KDQ["KD-Tree: traverse tree<br/>O(log n) typical"]
    Q --> IVFQ["IVF: nearest clusters<br/>O(C·d + m·d) approx"]
    BFQ --> R[(chunk_id, distance)]
    KDQ --> R
    IVFQ --> R
  end
  BF --> BFQ
  KD --> KDQ
  IVF --> IVFQ
```
| Algorithm | Build | Query | Space | Type |
|---|---|---|---|---|
| Brute-force | O(n·d) | O(n·d) | O(n·d) | Exact |
| KD-Tree | O(n log n·d) | O(log n) typ. | O(n·d) | Exact |
| IVF | O(n·d·C·I) | O(C·d + m·d) | O(n·d) | Approx. |
n = vectors, d = dimension, C = centroids, I = k-means iterations, m = points in probed clusters.
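As a concrete instance of the IVF row, a simplified numpy-only version (plain k-means with a fixed iteration count; class and parameter names are illustrative, not the project's API):

```python
import numpy as np

class IVFIndex:
    """Approximate k-NN: cluster with k-means, then scan only probed clusters."""

    def __init__(self, n_clusters=4, n_probe=1, n_iters=10, seed=0):
        self.n_clusters, self.n_probe, self.n_iters = n_clusters, n_probe, n_iters
        self.rng = np.random.default_rng(seed)

    def build(self, vectors: np.ndarray):
        """Plain k-means over (n, d) vectors: O(n*d*C*I). Requires n >= C."""
        self.vectors = vectors
        init = self.rng.choice(len(vectors), self.n_clusters, replace=False)
        self.centroids = vectors[init]
        for _ in range(self.n_iters):
            # assign each vector to its nearest centroid, then recenter
            dists = np.linalg.norm(vectors[:, None] - self.centroids[None], axis=2)
            assign = np.argmin(dists, axis=1)
            for c in range(self.n_clusters):
                if np.any(assign == c):
                    self.centroids[c] = vectors[assign == c].mean(axis=0)
        self.assign = assign
        return self

    def query(self, q: np.ndarray, k: int):
        """Probe the n_probe nearest clusters, exact scan inside: O(C*d + m*d)."""
        probed = np.argsort(np.linalg.norm(self.centroids - q, axis=1))[: self.n_probe]
        cand = np.flatnonzero(np.isin(self.assign, probed))
        d = np.linalg.norm(self.vectors[cand] - q, axis=1)
        top = np.argsort(d)[:k]
        return [(int(cand[i]), float(d[i])) for i in top]
```

The approximation comes from `n_probe`: vectors whose cluster is not probed are never compared, which trades recall for query speed.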
- PDF ingest: `POST /libraries/{id}/ingest-pdf` — upload a PDF; the server extracts text (pypdf), chunks it, embeds it (Cohere or Sentence Transformers), and creates one document and its chunks.
- Scripts: `smoke_test.py`, `index_file.py`, `index_pdf.py`, `search_query.py`, `inspect_library.py` for CLI testing (see `scripts/README.md`).