Lazy-Racoon/vectordbapi

Vector Database REST API

REST API to index and query documents in a vector database: k-Nearest Neighbor (kNN) search over embeddings. The API is containerized with Docker (or Podman).


Objective

Develop a REST API that allows users to index and query their documents within a Vector Database. The API is shipped as a Docker container.

Definitions

  1. Chunk: A piece of text with an associated embedding and metadata.
  2. Document: Made of multiple chunks and metadata.
  3. Library: Made of a list of documents and metadata.

What the API does

  1. CRUD libraries — create, read, update, delete.
  2. CRUD documents and chunks within a library.
  3. Index the contents of a library.
  4. k-Nearest Neighbor vector search over the selected library with a given embedding query.

Data flow

```mermaid
flowchart LR
    subgraph Client
        A[HTTP Client]
    end
    subgraph API
        B[FastAPI routes]
    end
    subgraph Services
        C[LibraryService]
        D[DocumentService]
        E[ChunkService]
        F[SearchService]
        G[IndexService]
    end
    subgraph Data
        H[(Repositories)]
        I[(Index Registry)]
    end
    A --> B
    B --> C
    B --> D
    B --> E
    B --> F
    B --> G
    C --> H
    D --> H
    E --> H
    F --> H
    F --> I
    G --> H
    G --> I
```
  • Request → API (routes) → Services (business logic) → Repositories (in-memory store) and Index Registry (kNN indexes).
  • Reads use a shared read lock; writes use an exclusive write lock so there are no data races.

How it was implemented

1. Chunk, Document, Library (Pydantic, fixed schema)

  • Pydantic models with a fixed schema: no user-defined metadata fields.
  • Chunk: id, text, embedding (list of floats), created_at, name (optional), document_id.
  • Document: id, name, created_at, chunk_ids, library_id.
  • Library: id, name, created_at, document_ids.
  • All IDs are UUIDs generated by the application.

2. Indexing algorithms (no external vector DB libraries)

  • Brute-force: Linear scan; compare query to every vector. Build: O(n·d), Query: O(n·d), Space: O(n·d). Baseline, exact k-NN.
  • KD-Tree: Tree over vectors (median split). Build: O(n log n · d), Query: O(log n) typical, Space: O(n·d). Exact k-NN; good for moderate dimensions.
  • IVF: K-means clusters; search in nearest cluster(s). Build: O(n·d·C·I), Query: O(C·d + m·d), Space: O(n·d). Approximate k-NN for larger n.

Only numpy is used for math (norms, etc.). No chroma-db, pinecone, FAISS, etc.
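The brute-force baseline, for example, fits in a few lines of numpy. This is a sketch of the technique, not the project's exact code:

```python
import numpy as np


def knn_brute_force(vectors: np.ndarray, query: np.ndarray, k: int) -> list[tuple[int, float]]:
    """Exact k-NN by linear scan: O(n*d) per query.

    Returns (row_index, euclidean_distance) pairs, nearest first.
    """
    # Euclidean distance from the query to every stored vector.
    dists = np.linalg.norm(vectors - query, axis=1)
    # argpartition finds the k smallest in O(n); then sort only those k.
    idx = np.argpartition(dists, min(k, len(dists) - 1))[:k]
    idx = idx[np.argsort(dists[idx])]
    return [(int(i), float(dists[i])) for i in idx]
```

`argpartition` keeps the query at O(n·d) even for large k, instead of the O(n log n) a full sort would cost.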

3. Concurrency (no data races)

  • Reader-writer lock: Reads (get, list, search) hold a shared read lock; writes (create, update, delete, build index) hold an exclusive write lock.
  • Design: one lock around repositories and index registry; simple and correct for a single-process, in-memory API.
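A minimal reader-writer lock along these lines can be built from a single `threading.Condition`; this sketch omits writer-preference tuning and is illustrative, not the project's exact implementation:

```python
import threading
from contextlib import contextmanager


class RWLock:
    """Many concurrent readers, one exclusive writer."""

    def __init__(self) -> None:
        self._cond = threading.Condition()
        self._readers = 0

    @contextmanager
    def read(self):
        with self._cond:          # brief critical section to register the reader
            self._readers += 1
        try:
            yield
        finally:
            with self._cond:
                self._readers -= 1
                if self._readers == 0:
                    self._cond.notify_all()   # wake a waiting writer

    @contextmanager
    def write(self):
        with self._cond:          # holding the Condition excludes other writers
            while self._readers > 0:
                self._cond.wait()  # releases the lock while waiting
            try:
                yield
            finally:
                self._cond.notify_all()
```

While a writer holds the Condition, new readers block on the increment, so writes see a quiesced store.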

4. CRUD logic (Services)

  • Services implement the logic: Library, Document, Chunk, Search, Index.
  • Repositories do CRUD only (in-memory); Services orchestrate them and keep relationships consistent (e.g. document_ids on Library, chunk_ids on Document, cascade deletes).
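The consistency bookkeeping can be sketched as below, with plain dicts standing in for the in-memory repositories; method and field names are illustrative:

```python
class DocumentService:
    """Orchestrates repositories; keeps parent/child relationships consistent."""

    def __init__(self, libraries: dict, documents: dict, chunks: dict) -> None:
        self.libraries = libraries
        self.documents = documents
        self.chunks = chunks

    def delete_document(self, document_id: str) -> None:
        # In the real API a missing ID maps to HTTP 404.
        doc = self.documents.pop(document_id)
        # Cascade: a document's chunks never outlive it.
        for chunk_id in doc["chunk_ids"]:
            self.chunks.pop(chunk_id, None)
        # Detach the document from its parent library.
        lib = self.libraries.get(doc["library_id"])
        if lib is not None:
            lib["document_ids"].remove(document_id)
```

Keeping this logic in the service layer means repositories stay dumb key-value stores and every route gets the same cascade behavior for free.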

5. API layer

  • FastAPI routes call Services; routes are thin (parse request, call service, map to HTTP status).
  • Status codes: 200, 201, 204, 404, 409, 422 via fastapi.status (no hardcoded numbers).
  • REST: POST/GET/PUT/DELETE for libraries, documents, chunks; POST /libraries/{id}/index to build index; POST /libraries/{id}/search with body {"embedding": [...], "k": N} for k-NN search.

6. Docker image

  • Dockerfile (multi-stage): build with uv, runtime with uvicorn. Works with Docker and Podman.
  • docker-compose (or podman-compose): one command to run the API (and optional UI). No need to run anything on the host except the containers.

Running the project

Prerequisites: Docker or Podman, and (optional) uv for local dev/tests.

```shell
# From project root
docker compose up
# or: podman compose up
```

Optional: copy env.example to .env and set COHERE_API_KEY if you use Cohere for embeddings. The assignment suggested a Cohere API key for creating embeddings for tests; manually created chunks suffice to test the system.

Tests (in container):

```shell
docker build --target test -t vector-db-api-test .
docker run --rm vector-db-api-test
# or with podman
```

Constraints respected

  • No chroma-db, pinecone, FAISS, or similar; indexing algorithms are implemented with numpy only.
  • No document processing pipeline required; manually created chunks are enough to test the system.

Tech stack

  • API backend: Python + FastAPI + Pydantic
  • Dependency management: uv (pyproject.toml, uv sync)
  • Containers: Dockerfile + docker-compose (Podman-compatible)

Extras (optional, documented here)

The following were not required by the task; they are implemented and documented below with Mermaid diagrams.


Embedders (Cohere, Sentence Transformers, Image)

Embeddings are resolved by URI. The registry chooses the backend; an invalid or missing HF_TOKEN does not break the flow (the registry falls back to unauthenticated Hub access).

```mermaid
flowchart LR
    subgraph Input
        T[Text chunks]
        Q[Query text]
        Im[Images]
    end
    subgraph Registry["get_embedder(uri)"]
        R[Embedder Registry]
    end
    subgraph Backends
        C[Cohere API\ncohere://]
        S[Sentence Transformers\nembedding_transformer://...]
        I[CLIP / ViT\nembedding_image://...]
    end
    subgraph Output
        V[Vectors]
    end
    T --> R
    Q --> R
    Im --> R
    R --> C
    R --> S
    R --> I
    C --> V
    S --> V
    I --> V
```
  • cohere:// — Remote API; requires COHERE_API_KEY. Used for text (indexing and search).
  • embedding_transformer://MODEL — Sentence Transformers (e.g. all-MiniLM-L6-v2); runs in-process. Text only. Optional HF_TOKEN for Hub rate limits.
  • embedding_image://MODEL — CLIP/ViT for image bytes; text embedders do not support images.
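The URI dispatch can be sketched as a scheme-to-backend map. The schemes match the list above; the backends here are stubs that just return `(kind, model)` so the dispatch logic is visible (the real ones wrap the Cohere client, Sentence Transformers, or CLIP):

```python
# Scheme -> backend constructor; each receives the model part of the URI.
_BACKENDS = {
    "cohere": lambda model: ("cohere", model),
    "embedding_transformer": lambda model: ("sentence_transformer", model),
    "embedding_image": lambda model: ("image", model),
}


def get_embedder(uri: str):
    """Resolve an embedder URI such as 'embedding_transformer://all-MiniLM-L6-v2'."""
    # str.partition instead of urlparse: underscores are not valid in
    # standard URL schemes, so urlparse would not split these URIs.
    scheme, sep, model = uri.partition("://")
    if not sep or scheme not in _BACKENDS:
        raise ValueError(f"unknown embedder URI: {uri!r}")
    return _BACKENDS[scheme](model)
```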

Web UI → API

The React UI runs in its own container (port 80 → 3000). The browser loads the UI and then sends requests to the API (port 8000). CORS is enabled on the API so the browser allows cross-origin requests.

```mermaid
flowchart TB
    subgraph Browser
        UI[React UI\nlocalhost:3000]
    end
    subgraph API_Server["API (localhost:8000)"]
        CORS[CORS middleware]
        Lib[POST/GET /libraries]
        Ingest[POST /libraries/.../ingest-pdf]
        Index[POST /libraries/.../index]
        Search[POST /libraries/.../search/by-query]
    end
    UI -->|fetch, JSON/form-data| CORS
    CORS --> Lib
    CORS --> Ingest
    CORS --> Index
    CORS --> Search
```
  • UI (React, nginx in container): lists libraries, opens library detail, uploads PDF, builds index, runs search by text.
  • API (FastAPI): serves REST endpoints; allow_origins=["*"] so any origin (e.g. http://localhost:3000) can call the API.
  • VITE_API_URL is set at UI build time so the frontend knows the API base URL (e.g. http://localhost:8000).

Search by query

The client sends text; the server embeds it with the chosen embedder, runs k-NN, then enriches results with chunk text and name for display.

```mermaid
sequenceDiagram
    participant Client
    participant API
    participant Registry
    participant Embedder
    participant Search
    participant ChunkRepo

    Client->>API: POST /libraries/{id}/search/by-query<br/>{ query, k, embedder }
    API->>Registry: get_embedder(embedder)
    Registry-->>API: embedder instance
    API->>Embedder: embed_queries([query])
    Embedder-->>API: query_embedding
    API->>Search: search(library_id, query_embedding, k)
    Search-->>API: [(chunk_id, distance), ...]
    loop For each result
        API->>ChunkRepo: get(chunk_id)
        ChunkRepo-->>API: chunk (text, name)
    end
    API-->>Client: { results: [{ chunk_id, distance, text, name }, ...] }
```
  • The same embedder (and embedding dimension) must be used at query time as was used at indexing time.
  • Chunk text and name are attached so the UI can show the matching snippet without extra calls.
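The server-side steps of the sequence above can be sketched as one function; the embedder, search callable, and repo here are stubs with illustrative signatures:

```python
class FakeEmbedder:
    """Stand-in for a registry-resolved embedder (illustrative only)."""

    def embed_queries(self, texts: list[str]) -> list[list[float]]:
        return [[float(len(t)), 0.0] for t in texts]


def search_by_query(query, k, embedder, knn_search, chunk_repo):
    """Embed the query, run k-NN, then enrich each hit for display."""
    query_embedding = embedder.embed_queries([query])[0]
    hits = knn_search(query_embedding, k)       # [(chunk_id, distance), ...]
    results = []
    for chunk_id, distance in hits:
        chunk = chunk_repo[chunk_id]            # one repo lookup per hit
        results.append({"chunk_id": chunk_id, "distance": distance,
                        "text": chunk["text"], "name": chunk["name"]})
    return {"results": results}
```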

Indexing algorithms

Three k-NN indexes are implemented (numpy only). Each library has one index; build via POST /libraries/{id}/index with algorithm: brute_force, kd_tree, or ivf.

```mermaid
flowchart TB
    subgraph Build["Build index"]
        V[(chunk_id, embedding)]
        V --> BF
        V --> KD
        V --> IVF
        BF["Brute-force<br/>Store all vectors"]
        KD["KD-Tree<br/>Median split, tree"]
        IVF["IVF<br/>K-means clusters"]
    end

    subgraph Query["Query: k-NN"]
        Q["query_embedding + k"]
        Q --> BFQ["Brute: scan all<br/>O(n·d)"]
        Q --> KDQ["KD-Tree: traverse tree<br/>O(log n) typical"]
        Q --> IVFQ["IVF: nearest clusters<br/>O(C·d + m·d) approx"]
        BFQ --> R[(chunk_id, distance)]
        KDQ --> R
        IVFQ --> R
    end

    BF --> BFQ
    KD --> KDQ
    IVF --> IVFQ
```
| Algorithm   | Build        | Query         | Space  | Type    |
| ----------- | ------------ | ------------- | ------ | ------- |
| Brute-force | O(n·d)       | O(n·d)        | O(n·d) | Exact   |
| KD-Tree     | O(n log n·d) | O(log n) typ. | O(n·d) | Exact   |
| IVF         | O(n·d·C·I)   | O(C·d + m·d)  | O(n·d) | Approx. |

n = vectors, d = dimension, C = centroids, I = k-means iterations, m = points in probed clusters.
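A toy IVF along these lines, assuming plain Lloyd's k-means for the coarse quantizer (a sketch, not the project's exact implementation):

```python
import numpy as np


def _assign(vectors: np.ndarray, centroids: np.ndarray) -> np.ndarray:
    """Nearest centroid for every vector: O(n*d*C)."""
    d = np.linalg.norm(vectors[:, None, :] - centroids[None, :, :], axis=2)
    return np.argmin(d, axis=1)


def ivf_build(vectors: np.ndarray, n_clusters: int, n_iters: int = 10, seed: int = 0):
    """Build: k-means (I iterations), then one inverted list per centroid."""
    rng = np.random.default_rng(seed)
    centroids = vectors[rng.choice(len(vectors), size=n_clusters, replace=False)].astype(float)
    for _ in range(n_iters):
        assign = _assign(vectors, centroids)
        for c in range(n_clusters):
            members = vectors[assign == c]
            if len(members):                       # skip empty clusters
                centroids[c] = members.mean(axis=0)
    assign = _assign(vectors, centroids)           # final assignment
    lists = {c: np.flatnonzero(assign == c) for c in range(n_clusters)}
    return centroids, lists


def ivf_query(vectors, centroids, lists, query, k, n_probe=1):
    """Query: scan only the n_probe nearest clusters (approximate k-NN)."""
    nearest = np.argsort(np.linalg.norm(centroids - query, axis=1))[:n_probe]
    cand = np.concatenate([lists[int(c)] for c in nearest])
    d = np.linalg.norm(vectors[cand] - query, axis=1)
    order = np.argsort(d)[:k]
    return [(int(cand[i]), float(d[i])) for i in order]
```

Raising `n_probe` trades query time for recall; with `n_probe = C` the search degenerates to an exact brute-force scan.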


Other extras

  • PDF ingest: POST /libraries/{id}/ingest-pdf — upload PDF; server extracts text (pypdf), chunks, embeds (Cohere or Sentence Transformers), creates one document and its chunks.
  • Scripts: smoke_test.py, index_file.py, index_pdf.py, search_query.py, inspect_library.py for CLI testing (see scripts/README.md).
