Compare different embedding models side-by-side with real benchmarks and metrics
This is a complete benchmarking framework for comparing embedding models for semantic search. Instead of guessing which model works best, get real data:
- ⚡ Speed Metrics: Document indexing time and query latency
- 🎯 Relevance Scores: Semantic distance measurements
- 📈 Comparative Analysis: Side-by-side performance tables
- 🧪 Real-World Testing: 20 diverse documents across 4 categories
- 🔄 Reproducible Results: Same test suite every run
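The speed metrics come from simple wall-clock timing around the indexing and query calls. A minimal sketch of the pattern (the `timed` helper is illustrative, not part of this repo):

```python
import time

def timed(fn, *args, **kwargs):
    """Run fn and return (result, elapsed seconds) using a monotonic clock."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start

# Example: time an arbitrary call the same way the benchmark times
# document indexing and query execution.
total, elapsed = timed(sum, range(1_000_000))
print(f"sum={total}, took {elapsed:.4f}s")
```

The same helper wraps both the bulk `add` (indexing time) and each individual query (latency), so all models are measured identically.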
| Model | Size | Speed | Accuracy | Best For |
|---|---|---|---|---|
| 🔵 Default | Lightweight | ⚡ | ⭐⭐ | Quick prototypes |
| 🟢 MiniLM-L6-v2 | 38M | ⚡⚡⚡⚡ | ⭐⭐⭐⭐ | Production (balanced) |
| 🟣 MPNet-base-v2 | 109M | ⚡⚡ | ⭐⭐⭐⭐⭐ | High-accuracy search |
```text
Testing: default
Documents added in: 3.573s
Avg query time: 0.4861s
Avg relevance distance: 0.8008 (lower is better)

Query                    Time (ms)   Distance
Sport-related query         640.50     0.9197
Tech-related query          455.72     0.8826
Finance-related query       434.48     0.6427
Health-related query        478.39     0.6865
Programming query           421.42     0.8725
─────────────────────────────────────────────
Testing: all-MiniLM-L6-v2
Documents added in: 0.649s
Avg query time: 0.0313s
Avg relevance distance: 0.4004 (lower is better)

Query                    Time (ms)   Distance
Sport-related query          61.40     0.4599
Tech-related query           23.72     0.4413
Finance-related query        26.98     0.3214
Health-related query         26.36     0.3433
Programming query            18.19     0.4363
─────────────────────────────────────────────
Testing: all-mpnet-base-v2
Documents added in: 1.396s
Avg query time: 0.1359s
Avg relevance distance: 0.4164 (lower is better)

Query                    Time (ms)   Distance
Sport-related query         156.71     0.5530
Tech-related query          140.52     0.4529
Finance-related query       160.53     0.3032
Health-related query        104.48     0.3088
Programming query           117.06     0.4640
```
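The distance column reports vector distances where lower means more semantically similar. Assuming a cosine-distance space (the actual metric depends on how the collection is configured, so treat this as a sketch):

```python
import numpy as np

def cosine_distance(a, b):
    """1 - cosine similarity: 0 for identical directions, up to 2 for opposite."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return 1.0 - np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

related = cosine_distance([1.0, 0.2], [0.9, 0.3])    # similar directions
unrelated = cosine_distance([1.0, 0.0], [0.0, 1.0])  # orthogonal vectors
print(related < unrelated)  # True: related vectors score lower
```

This is why a drop from 0.80 to 0.40 average distance is a meaningful relevance improvement, not just noise.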
| Model | Add Time | Query Time | Relevance | Speed Rank | Accuracy Rank |
|---|---|---|---|---|---|
| Default | 3.57s | 486ms | 0.8008 | 🥉 | 🥉 |
| MiniLM-L6-v2 | 0.65s | 31ms | 0.4004 | 🥇 | 🥇 |
| MPNet-base-v2 | 1.40s | 136ms | 0.4164 | 🥈 | 🥈 |
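Put differently, the query-time column works out to roughly a 15.7× speedup for MiniLM over the default and 3.6× for MPNet, quick arithmetic from the table above:

```python
# Average query times from the summary table (milliseconds).
default_ms, minilm_ms, mpnet_ms = 486, 31, 136

print(round(default_ms / minilm_ms, 1))  # 15.7 — MiniLM vs. default
print(round(default_ms / mpnet_ms, 1))   # 3.6  — MPNet vs. default
```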
| Category | Winner | Metric |
|---|---|---|
| ⚡ Fastest Document Indexing | all-MiniLM-L6-v2 | 0.649s |
| 🚀 Fastest Query Time | all-MiniLM-L6-v2 | 31ms avg |
| 🎯 Best Relevance Score | all-MiniLM-L6-v2 | 0.4004 distance |
```bash
pip install -r requirements.txt
python benchmark.py
```

Expected runtime: ~2-3 minutes (the first run downloads the embedding models).
```text
semantic-search-benchmark/
├── benchmark.py        # Main benchmark runner
├── benchmark_data.py   # Dataset & test queries
├── test.py             # Basic in-memory example
├── test_v2.py          # Persistent storage example
├── test_v3.py          # Semantic search demo
├── requirements.txt    # Dependencies
├── .gitignore          # Git ignore rules
└── README.md           # This file
```
⚽ Sports (5 docs)
- Ronaldo scored an incredible goal last night
- Messi won the World Cup with Argentina
- Real Madrid won the Champions League final
- Liverpool defeated Manchester City
- Nadal won his 14th tennis grand slam
💻 Technology (5 docs)
- Python is a great programming language for beginners
- Machine learning is used in stock price prediction
- TensorFlow and PyTorch are popular deep learning frameworks
- Artificial intelligence is revolutionizing software development
- Cloud computing provides scalable infrastructure
💰 Finance (5 docs)
- The stock market crashed badly this week
- Interest rates are rising due to inflation
- Bitcoin reached a new all-time high
- The Federal Reserve raised interest rates
- Real estate prices continue to rise in major cities
🏥 Health (5 docs)
- Regular exercise improves cardiovascular health
- COVID-19 vaccines have saved millions of lives
- Mental health awareness is becoming increasingly important
- Healthy diet and sleep patterns prevent chronic diseases
- Meditation reduces stress and anxiety
- ⚽ "football player scored a goal"
- 🧠 "machine learning and artificial intelligence"
- 📊 "money and market crash"
- 💪 "exercise and health"
- 🖥️ "programming languages and software"
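Each query is answered by finding the stored documents whose embeddings sit closest to the query embedding. A toy nearest-neighbour sketch with hand-picked 3-d vectors standing in for real model output (illustrative only):

```python
import numpy as np

# Made-up 3-d vectors in place of real embedding-model output.
docs = {
    "Messi won the World Cup with Argentina": np.array([0.9, 0.1, 0.0]),
    "Python is a great programming language": np.array([0.1, 0.9, 0.0]),
    "The stock market crashed badly":         np.array([0.0, 0.1, 0.9]),
}

def search(query_vec, k=1):
    """Return the k documents with the smallest cosine distance to the query."""
    def dist(v):
        return 1.0 - np.dot(query_vec, v) / (np.linalg.norm(query_vec) * np.linalg.norm(v))
    return sorted(docs, key=lambda text: dist(docs[text]))[:k]

print(search(np.array([0.8, 0.2, 0.0])))  # the sports document ranks first
```

The benchmark does the same thing at full embedding dimensionality: a sports-flavoured query vector lands nearest the sports documents, and the reported distance is how near it landed.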
| Model | Speed | Accuracy | Use Case |
|---|---|---|---|
| Default | 🐢 Slowest (in this test) | Low | Quick tests, low-stakes |
| MiniLM | ⚡ Fastest | ⭐⭐⭐⭐ High | RECOMMENDED - Most scenarios |
| MPNet | Slower | 🎯 Highest | Accuracy critical tasks |
- 🏃 Need speed? → Use MiniLM (balanced winner)
- 🎯 Need accuracy? → Use MPNet (but slower)
- 🚀 Production ready? → Use MiniLM (best all-around)
- 🧪 Just testing? → Use Default (fastest to set up)
To add a model, edit `benchmark.py`:

```python
models_to_test = [
    ("default", "./benchmark_db_default"),
    ("all-MiniLM-L6-v2", "./benchmark_db_minilm"),
    ("all-mpnet-base-v2", "./benchmark_db_mpnet"),
    ("your-model-name", "./benchmark_db_custom"),  # Add here
]
```

To add query tests, edit `benchmark_data.py`:

```python
QUERY_TESTS = [
    {
        "query": "your query here",
        "expected_category": "category",
        "description": "Your description"
    },
]
```

To test different domains, edit the `DOCUMENTS` list in `benchmark_data.py`.
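When customizing `QUERY_TESTS`, a small sanity check catches missing fields before a long benchmark run. A hedged sketch (the `validate_query_tests` helper is not part of this repo):

```python
REQUIRED_KEYS = {"query", "expected_category", "description"}

def validate_query_tests(tests):
    """Raise ValueError if a customized QUERY_TESTS entry is missing a field."""
    for i, entry in enumerate(tests):
        missing = REQUIRED_KEYS - entry.keys()
        if missing:
            raise ValueError(f"QUERY_TESTS[{i}] is missing {sorted(missing)}")
    return len(tests)

print(validate_query_tests([
    {"query": "football player scored a goal",
     "expected_category": "sports",
     "description": "Sport-related query"},
]))  # 1
```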
| Resource | Link | Description |
|---|---|---|
| 📖 ChromaDB Docs | trychroma.com | Official documentation |
| 🤗 Sentence Transformers | sbert.net | Embedding models & guide |
| 🏆 Model Leaderboard | huggingface.co/spaces/mteb | Compare all models |
| 🔬 MTEB Benchmark | github.com/embeddings-benchmark | Industry benchmarks |
- Python 3.8+
- chromadb >= 0.4.0
- sentence-transformers >= 2.2.0
- numpy >= 1.21.0
| File | Purpose |
|---|---|
| `benchmark.py` | 🎯 Core benchmarking logic & runner |
| `benchmark_data.py` | 📚 Test dataset & queries |
| `test.py` | 📖 Basic embedding example |
| `test_v2.py` | 💾 Persistent storage demonstration |
| `test_v3.py` | 🔍 Semantic search demo |
| `requirements.txt` | 📦 Python dependencies |
| `.gitignore` | 🚫 Git exclusion rules |
```bash
pip install -r requirements.txt
python benchmark.py
```

- Check the terminal output for detailed metrics
- Compare models in the summary table
- Identify the winner for your use case
- Add your own documents to `benchmark_data.py`
- Test additional embedding models
- Modify the query tests for your domain
- Default: Instant setup, slowest query speed in this test
- MiniLM: Best performance (fastest overall)
- MPNet: Higher latency, best accuracy
- Default: Basic semantic understanding
- MiniLM: Good understanding of document relationships
- MPNet: Excellent semantic comprehension
- Default: Minimal footprint
- MiniLM: ~384MB loaded (38M params)
- MPNet: ~1.2GB loaded (109M params)
Have improvements? Found a bug? Want to add:
- More embedding models for comparison?
- Different benchmark datasets?
- Additional metrics?
Feel free to fork and submit pull requests!
MIT License - Feel free to use this in your projects!
