An open-source LLM context engineering toolkit for analyzing, compressing, and visualizing prompt contexts.
Developed independently by Siddhant Khare; not affiliated with Ona.
ContextLab gives developers visibility and control over their LLM prompt contexts through:
- Analysis: Token usage, redundancy detection, TF-IDF salience scoring
- Compression: Pluggable strategies (deduplication, summarization, sliding windows, hybrid)
- Optimization: Budget planning to maximize relevance under token limits
- Visualization: Interactive embedding scatters, token timelines, and dashboards
```
┌──────────────┐
│    Input     │  (files, globs, JSONL, stdin)
└──────┬───────┘
       ▼
┌──────────────┐
│  Tokenizer   │  (tiktoken/sentencepiece wrapper)
└──────┬───────┘
       ▼
┌──────────────┐
│   Chunker    │  (configurable size/overlap)
└──────┬───────┘
       ▼
┌──────────────┐
│  Embeddings  │  (OpenAI/pluggable providers)
└──────┬───────┘
       ▼
┌──────────────┐
│   Analysis   │  (redundancy, salience, stats)
└──────┬───────┘
       ▼
┌──────────────┐
│ Compression  │  (dedup/summarize/window/hybrid)
└──────┬───────┘
       ▼
┌──────────────┐
│    Budget    │  (optimize subset under limit)
│   Planner    │
└──────┬───────┘
       ▼
┌──────────────┐
│   Storage    │  (SQLite + JSONL)
└──────┬───────┘
       ▼
┌──────────────────────────┐
│ CLI │ Web UI │ REST API  │
└──────────────────────────┘
```
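The chunking stage splits a token stream into fixed-size windows with overlap. As a rough illustration (the function name and signature below are hypothetical, not ContextLab's actual API):

```python
def chunk_tokens(tokens, chunk_size=512, overlap=50):
    """Split a token list into overlapping windows.

    Each chunk shares its first `overlap` tokens with the tail of the
    previous chunk, so context is preserved across chunk boundaries.
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # final (possibly shorter) chunk emitted
    return chunks

# 1200 tokens at size 512 / overlap 50 yields three chunks
chunks = chunk_tokens(list(range(1200)), chunk_size=512, overlap=50)
```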
- Chunk documents with configurable size/overlap
- Compute token counts per model (GPT-4, Claude, Llama, etc.)
- Calculate TF-IDF-style salience scores
- Detect redundancy via embedding cosine similarity
- Persist results to SQLite with JSONL exports
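Redundancy detection via embedding cosine similarity can be sketched in a few lines of pure Python. This is a minimal illustration with toy 2-D vectors and a hypothetical `redundant_pairs` helper; the real implementation operates on provider-generated embeddings:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length float vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def redundant_pairs(embeddings, threshold=0.92):
    """Return index pairs whose similarity meets the threshold."""
    pairs = []
    for i in range(len(embeddings)):
        for j in range(i + 1, len(embeddings)):
            if cosine(embeddings[i], embeddings[j]) >= threshold:
                pairs.append((i, j))
    return pairs

vecs = [[1.0, 0.0], [0.99, 0.05], [0.0, 1.0]]
print(redundant_pairs(vecs))  # the first two vectors are near-duplicates
```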
- Deduplication: Remove near-duplicate chunks
- Summarization: LLM-based or extractive summaries
- Sliding Window: Keep only recent context
- Hybrid: Combine strategies for optimal results
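The sliding-window strategy is the simplest of these: walk backwards from the newest chunk and keep everything that fits under the token limit. A minimal sketch, assuming each chunk carries a precomputed token count (names here are illustrative):

```python
def sliding_window(chunks, limit):
    """chunks: list of (text, token_count), oldest first.

    Keeps the most recent chunks whose combined size fits under `limit`.
    """
    kept, used = [], 0
    for text, n_tokens in reversed(chunks):  # newest to oldest
        if used + n_tokens > limit:
            break
        kept.append((text, n_tokens))
        used += n_tokens
    kept.reverse()  # restore chronological order
    return kept

history = [("a", 3000), ("b", 3000), ("c", 3000)]
print(sliding_window(history, 8000))  # keeps the two newest chunks
```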
- Greedy baseline optimizer
- Maximize relevance under token limits
- Deterministic output (seeded randomness)
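A greedy baseline of this kind typically ranks chunks by relevance per token and admits them until the budget runs out, with deterministic tie-breaking. A hypothetical sketch (not ContextLab's exact algorithm):

```python
def greedy_plan(chunks, limit):
    """chunks: list of (index, token_count, relevance); returns kept indices.

    Greedy knapsack baseline: highest relevance-per-token first,
    ties broken by original index for deterministic output.
    """
    order = sorted(chunks, key=lambda c: (-c[2] / c[1], c[0]))
    kept, used = [], 0
    for idx, n_tokens, _ in order:
        if used + n_tokens <= limit:
            kept.append(idx)
            used += n_tokens
    return sorted(kept)

chunks = [(0, 400, 0.9), (1, 600, 0.2), (2, 300, 0.8), (3, 500, 0.7)]
print(greedy_plan(chunks, 1000))  # chunks 0 and 2 have the best density
```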
- Embedding scatter plots (UMAP-reduced, colored by redundancy)
- Token timeline (stacked area: kept/dropped)
- Interactive web dashboard
- CLI visualization (tables and charts)
```bash
# Install from PyPI
pip install contextlab

# Or install from source
git clone https://github.com/siddhant-k-code/contextlab.git
cd contextlab
pip install -e .
```

```bash
# Analyze documents
contextlab analyze docs/*.md --model gpt-4o-mini --out .contextlab

# Compress with strategy
contextlab compress .contextlab/<run_id> --strategy hybrid --limit 8000

# Visualize results
contextlab viz .contextlab/<run_id>
```

```python
from contextlab import analyze, compress, optimize

# Analyze documents
report = analyze(
    paths=["docs/*.md"],
    model="gpt-4o-mini",
    chunk_size=512,
    overlap=50,
)

# Optimize under budget
plan = optimize(
    report,
    limit=8000,
    strategy="hybrid",
)

print(f"Compressed {report.total_tokens} → {plan.final_tokens} tokens")
print(f"Kept {len(plan.kept_chunks)}/{len(report.chunks)} chunks")
```

```bash
# Start API server
make api
# or
uvicorn api.main:app --reload
```

```bash
# Analyze via API
curl -X POST http://localhost:8000/api/analyze \
  -H "Content-Type: application/json" \
  -d '{"text": "Your context here", "model": "gpt-4o-mini"}'
```

```bash
# Start development server
cd web
npm install
npm run dev
```

Visit http://localhost:5173 to see the dashboard.
Create a `.env` file:

```bash
# OpenAI API (for embeddings and summarization)
OPENAI_API_KEY=sk-...

# Optional: use a different embedding model
CONTEXTLAB_EMBEDDING_MODEL=text-embedding-3-small

# Optional: custom storage path
CONTEXTLAB_STORAGE_PATH=./.contextlab
```

```bash
make init  # Install dependencies and pre-commit hooks
make test  # Run tests
make lint  # Run Ruff linter
make type  # Run mypy type checker
make all   # Lint + type + test
```

Run the end-to-end example script:

```bash
./scripts/run_example.sh
```

```bash
# Build and run API
make docker-api

# Build and run Web UI
make docker-web

# Run full demo
make demo
```

```bash
# Run all tests with coverage
pytest --cov=contextlab --cov-report=html

# Run specific test modules
pytest tests/test_tokenizer.py
pytest tests/test_compressors.py
```

Coverage target: 80%+ on core logic (the analyze, compress, and budget modules).
Full documentation is available at docs/ or via mkdocs:
```bash
mkdocs serve
```

- Tutorial 1: Introduction to Analysis
- Tutorial 2: Compression Strategies
- Tutorial 3: Visualization Dashboard
- Core analysis, compression, and budget planning
- CLI, Python SDK, REST API
- Web UI with visualizations
- SQLite + JSONL storage
- Benchmark harness for accuracy vs compression
- Auto-tuning retriever parameters
- Plugin API for custom compressors
- Export format compatible with eval tools
- Token-based API authentication
- Telemetry opt-in
We welcome contributions! Please see CONTRIBUTING.md for guidelines.
- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Commit your changes (`git commit -m 'Add amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
MIT License - Copyright (c) 2025 Siddhant Khare
- Built with FastAPI, SvelteKit, and tiktoken
- Inspired by research in context compression and prompt optimization
- Not affiliated with Ona Systems or any commercial entity
Questions? Open an issue or reach out on GitHub Discussions.