An open-source LLM context engineering toolkit for analyzing, compressing, and visualizing prompt contexts.
Developed independently by Siddhant Khare; not affiliated with Ona.
ContextLab gives developers visibility and control over their LLM prompt contexts through:
- Analysis: Token usage, redundancy detection, TF-IDF salience scoring
- Compression: Pluggable strategies (deduplication, summarization, sliding windows, hybrid)
- Optimization: Budget planning to maximize relevance under token limits
- Visualization: Interactive embedding scatters, token timelines, and dashboards
```
┌──────────────┐
│    Input     │  (files, globs, JSONL, stdin)
└──────┬───────┘
       ▼
┌──────────────┐
│  Tokenizer   │  (tiktoken/sentencepiece wrapper)
└──────┬───────┘
       ▼
┌──────────────┐
│   Chunker    │  (configurable size/overlap)
└──────┬───────┘
       ▼
┌──────────────┐
│  Embeddings  │  (OpenAI/pluggable providers)
└──────┬───────┘
       ▼
┌──────────────┐
│   Analysis   │  (redundancy, salience, stats)
└──────┬───────┘
       ▼
┌──────────────┐
│ Compression  │  (dedup/summarize/window/hybrid)
└──────┬───────┘
       ▼
┌──────────────┐
│    Budget    │  (optimize subset under limit)
│   Planner    │
└──────┬───────┘
       ▼
┌──────────────┐
│   Storage    │  (SQLite + JSONL)
└──────┬───────┘
       ▼
┌──────────────────────────┐
│ CLI │ Web UI │ REST API  │
└──────────────────────────┘
```
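The chunking stage splits a token stream into fixed-size windows with overlap. As a rough illustration (the function name and signature below are hypothetical, not ContextLab's actual API):

```python
def chunk_tokens(tokens, chunk_size=512, overlap=50):
    """Split a token list into overlapping windows.

    Each chunk shares its first `overlap` tokens with the tail of the
    previous chunk, so context is preserved across chunk boundaries.
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # final (possibly shorter) chunk emitted
    return chunks

# 1200 tokens at size 512 / overlap 50 yields three chunks
chunks = chunk_tokens(list(range(1200)), chunk_size=512, overlap=50)
```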
- Chunk documents with configurable size/overlap
- Compute token counts per model (GPT-4, Claude, Llama, etc.)
- Calculate TF-IDF-style salience scores
- Detect redundancy via embedding cosine similarity
- Persist results to SQLite with JSONL exports
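Redundancy detection via embedding cosine similarity can be sketched in a few lines of pure Python. This is a minimal illustration with toy 2-D vectors and a hypothetical `redundant_pairs` helper; the real implementation operates on provider-generated embeddings:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length float vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def redundant_pairs(embeddings, threshold=0.92):
    """Return index pairs whose similarity meets the threshold."""
    pairs = []
    for i in range(len(embeddings)):
        for j in range(i + 1, len(embeddings)):
            if cosine(embeddings[i], embeddings[j]) >= threshold:
                pairs.append((i, j))
    return pairs

vecs = [[1.0, 0.0], [0.99, 0.05], [0.0, 1.0]]
print(redundant_pairs(vecs))  # the first two vectors are near-duplicates
```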
- Deduplication: Remove near-duplicate chunks
- Summarization: LLM-based or extractive summaries
- Sliding Window: Keep only recent context
- Hybrid: Combine strategies for optimal results
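The sliding-window strategy is the simplest of these: walk backwards from the newest chunk and keep everything that fits under the token limit. A minimal sketch, assuming each chunk carries a precomputed token count (names here are illustrative):

```python
def sliding_window(chunks, limit):
    """chunks: list of (text, token_count), oldest first.

    Keeps the most recent chunks whose combined size fits under `limit`.
    """
    kept, used = [], 0
    for text, n_tokens in reversed(chunks):  # newest to oldest
        if used + n_tokens > limit:
            break
        kept.append((text, n_tokens))
        used += n_tokens
    kept.reverse()  # restore chronological order
    return kept

history = [("a", 3000), ("b", 3000), ("c", 3000)]
print(sliding_window(history, 8000))  # keeps the two newest chunks
```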
- Greedy baseline optimizer
- Maximize relevance under token limits
- Deterministic output (seeded randomness)
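A greedy baseline of this kind typically ranks chunks by relevance per token and admits them until the budget runs out, with deterministic tie-breaking. A hypothetical sketch (not ContextLab's exact algorithm):

```python
def greedy_plan(chunks, limit):
    """chunks: list of (index, token_count, relevance); returns kept indices.

    Greedy knapsack baseline: highest relevance-per-token first,
    ties broken by original index for deterministic output.
    """
    order = sorted(chunks, key=lambda c: (-c[2] / c[1], c[0]))
    kept, used = [], 0
    for idx, n_tokens, _ in order:
        if used + n_tokens <= limit:
            kept.append(idx)
            used += n_tokens
    return sorted(kept)

chunks = [(0, 400, 0.9), (1, 600, 0.2), (2, 300, 0.8), (3, 500, 0.7)]
print(greedy_plan(chunks, 1000))  # chunks 0 and 2 have the best density
```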
- Embedding scatter plots (UMAP-reduced, colored by redundancy)
- Token timeline (stacked area: kept/dropped)
- Interactive web dashboard
- CLI visualization (tables and charts)
```bash
# Install from PyPI
pip install contextlab

# Or install from source
git clone https://github.com/siddhant-k-code/contextlab.git
cd contextlab
pip install -e .
```

```bash
# Analyze documents
contextlab analyze docs/*.md --model gpt-4o-mini --out .contextlab

# Compress with strategy
contextlab compress .contextlab/<run_id> --strategy hybrid --limit 8000

# Visualize results
contextlab viz .contextlab/<run_id>
```

```python
from contextlab import analyze, compress, optimize

# Analyze documents
report = analyze(
    paths=["docs/*.md"],
    model="gpt-4o-mini",
    chunk_size=512,
    overlap=50,
)

# Optimize under budget
plan = optimize(
    report,
    limit=8000,
    strategy="hybrid",
)

print(f"Compressed {report.total_tokens} → {plan.final_tokens} tokens")
print(f"Kept {len(plan.kept_chunks)}/{len(report.chunks)} chunks")
```

```bash
# Start API server
make api
# or
uvicorn api.main:app --reload
```

```bash
# Analyze via API
curl -X POST http://localhost:8000/api/analyze \
  -H "Content-Type: application/json" \
  -d '{"text": "Your context here", "model": "gpt-4o-mini"}'
```

```bash
# Start development server
cd web
npm install
npm run dev
```

Visit http://localhost:5173 to see the dashboard.
Create a `.env` file:

```bash
# OpenAI API (for embeddings and summarization)
OPENAI_API_KEY=sk-...

# Optional: use a different embedding model
CONTEXTLAB_EMBEDDING_MODEL=text-embedding-3-small

# Optional: custom storage path
CONTEXTLAB_STORAGE_PATH=./.contextlab
```

```bash
make init  # Install dependencies and pre-commit hooks
make test  # Run tests
make lint  # Run Ruff linter
make type  # Run mypy type checker
make all   # Lint + type + test
```

Run the end-to-end example script:

```bash
./scripts/run_example.sh
```

```bash
# Build and run API
make docker-api

# Build and run Web UI
make docker-web

# Run full demo
make demo
```

```bash
# Run all tests with coverage
pytest --cov=contextlab --cov-report=html

# Run specific test modules
pytest tests/test_tokenizer.py
pytest tests/test_compressors.py
```

Coverage target: 80%+ on core logic (the analyze, compress, and budget modules).
Full documentation is available at docs/ or via mkdocs:
```bash
mkdocs serve
```

- Tutorial 1: Introduction to Analysis
- Tutorial 2: Compression Strategies
- Tutorial 3: Visualization Dashboard
- Core analysis, compression, and budget planning
- CLI, Python SDK, REST API
- Web UI with visualizations
- SQLite + JSONL storage
- Benchmark harness for accuracy vs compression
- Auto-tuning retriever parameters
- Plugin API for custom compressors
- Export format compatible with eval tools
- Token-based API authentication
- Telemetry opt-in
We welcome contributions! Please see CONTRIBUTING.md for guidelines.
- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Commit your changes (`git commit -m 'Add amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
MIT License - Copyright (c) 2025 Siddhant Khare
- Built with FastAPI, SvelteKit, and tiktoken
- Inspired by research in context compression and prompt optimization
- Not affiliated with Ona Systems or any commercial entity
Questions? Open an issue or reach out on GitHub Discussions.