A comprehensive benchmark suite for comparing Amazon S3 Vectors against FAISS, NMSLib, and brute-force search methods at scale (10K to 10M vectors).
This project provides a complete framework for benchmarking vector similarity search performance across different methods:
- Amazon S3 Vectors - AWS managed vector database
- FAISS - Facebook AI Similarity Search (HNSW index)
- NMSLib - Non-Metric Space Library (HNSW index)
- Brute-force - Baseline cosine similarity search
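For orientation, the brute-force baseline amounts to a normalized dot product over all stored vectors. Below is a minimal NumPy sketch of that idea — illustrative only, not the project's actual implementation in `src/vector_dbs/bruteforce.py`:

```python
import numpy as np

def bruteforce_search(query: np.ndarray, vectors: np.ndarray, top_k: int = 5) -> np.ndarray:
    """Return indices of the top_k rows of `vectors` most cosine-similar to `query`."""
    # Normalize both sides so a plain dot product equals cosine similarity
    q = query / np.linalg.norm(query)
    v = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    sims = v @ q
    # argpartition finds the top_k in O(n); then sort just those k entries
    top = np.argpartition(-sims, top_k)[:top_k]
    return top[np.argsort(-sims[top])]

# Toy example: the query is identical to vector 2
vectors = np.array([[1.0, 0.0], [0.0, 1.0], [0.6, 0.8]])
print(bruteforce_search(np.array([0.6, 0.8]), vectors, top_k=2))  # [2 1]
```

Every approximate method in the suite (S3 Vectors, FAISS, NMSLib) is effectively measured against this exhaustive scan.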
The benchmark evaluates:
- Query latency across different vector counts
- Search accuracy (Recall@K) using UKBench dataset
- Scalability from 10K to 10M vectors
- Memory efficiency and resource usage
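Recall@K is the fraction of ground-truth neighbors that appear among the top-K results. A minimal sketch of one common definition (the project's own evaluation lives in `src/evaluate.py`):

```python
def recall_at_k(retrieved: list, relevant: set, k: int) -> float:
    """Fraction of ground-truth items found in the top-k retrieved results."""
    hits = sum(1 for item in retrieved[:k] if item in relevant)
    # Normalize by min(k, |relevant|) so queries with fewer than k
    # ground-truth matches are not penalized
    return hits / min(k, len(relevant))

# UKBench groups images in fours (2550 objects x 4 views = 10200 images),
# so each query has 4 relevant images
print(recall_at_k(retrieved=[3, 7, 1, 9, 2], relevant={1, 2, 3, 4}, k=5))  # 0.75
```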
- 🚀 Multiple Vector Databases: Support for S3 Vectors, FAISS, NMSLib, and brute-force
- 📊 Comprehensive Metrics: Query latency, recall, precision, and more
- 📈 Visualization: Automatic chart generation for results analysis
- 🔄 Resume Capability: Checkpoint support for long-running benchmarks
- 💾 Embedding Caching: Efficient storage and retrieval of embeddings
- 🎯 UKBench Dataset: Standard evaluation dataset with ground truth
- ⚙️ Configurable: YAML-based configuration for all parameters
- Python: 3.9 or higher
- uv: Fast Python package installer (recommended) or pip as fallback
- AWS Account: For S3 Vectors testing
- Storage: Sufficient disk space for datasets (~10-50 GB)
- Memory: 8 GB+ RAM recommended
- GPU (optional): For faster embedding generation
Clone the repository:

```bash
git clone https://github.com/Siddhant-K-code/s3-vectors-benchmark.git
cd s3-vectors-benchmark
```

Install uv:

```bash
# macOS and Linux
curl -LsSf https://astral.sh/uv/install.sh | sh

# Windows
powershell -c "irm https://astral.sh/uv/install.ps1 | iex"

# Or with pip
pip install uv
```

Install dependencies.

**Option A: Using uv sync (recommended)**

```bash
# Install project and dependencies (creates .venv automatically)
uv sync

# Or with dev dependencies for testing
uv sync --dev
```

**Option B: Manual virtual environment**

```bash
# Create virtual environment
uv venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate

# Install project in editable mode with dependencies
uv pip install -e .

# Or install with dev dependencies
uv pip install -e ".[dev]"
```

Configure AWS credentials:

```bash
aws configure
```

Or set environment variables:

```bash
export AWS_ACCESS_KEY_ID=your_access_key
export AWS_SECRET_ACCESS_KEY=your_secret_key
export AWS_DEFAULT_REGION=us-east-1
```

Create your configuration file:

```bash
cp config.yaml.example config.yaml
# Edit config.yaml with your settings
```

Required settings in config.yaml:
- AWS region and S3 bucket name
- Dataset cache directories
- Benchmark parameters
Download the datasets:

```bash
uv run python src/main.py prepare-data --dataset ukbench --download
uv run python src/main.py prepare-data --dataset coco --download
```

Generate embeddings:

```bash
uv run python src/main.py generate-embeddings --model vit-s --dimension 384
```

This will:
- Load images from the UKBench dataset
- Generate embeddings using the DINOv2-small model
- Cache embeddings to an HDF5 file
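The HDF5 cache round-trip can be sketched with `h5py` as below. The dataset names `embeddings` and `image_ids` are illustrative assumptions, not necessarily the layout that `src/embeddings.py` actually uses:

```python
import numpy as np
import h5py

def cache_embeddings(path: str, embeddings: np.ndarray, image_ids: list) -> None:
    """Write embeddings and their image IDs to an HDF5 file.

    Dataset names here are illustrative, not the project's exact schema.
    """
    with h5py.File(path, "w") as f:
        f.create_dataset("embeddings", data=embeddings.astype(np.float32))
        f.create_dataset("image_ids", data=np.array(image_ids, dtype="S"))

def load_embeddings(path: str):
    """Read embeddings and decoded image IDs back from the HDF5 cache."""
    with h5py.File(path, "r") as f:
        ids = [s.decode() for s in f["image_ids"][:]]
        return f["embeddings"][:], ids
```

Caching this way means the (slow, possibly GPU-bound) embedding step runs once, and later benchmark runs read straight from disk.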
Run a quick benchmark:

```bash
uv run python src/main.py benchmark \
  --embeddings data/embeddings/vit-s/embeddings_384d.h5 \
  --vectors 10200 100000 1000000 \
  --methods s3_vectors faiss nmslib \
  --quick
```

The `--quick` flag runs a smaller test with fewer queries.

Visualize the results:

```bash
uv run python src/main.py visualize --latest
```

This generates three charts:

- `processing_time_ratio.png` - Processing time normalized to the smallest dataset
- `search_accuracy.png` - Recall@K across different vector counts
- `processing_time_ms.png` - Query latency in milliseconds (S3 Vectors)
Download and prepare datasets:

```bash
# UKBench only
uv run python src/main.py prepare-data --dataset ukbench --download

# COCO only
uv run python src/main.py prepare-data --dataset coco --download

# All datasets
uv run python src/main.py prepare-data --dataset all --download
```

Generate embeddings with different models:

```bash
# DINOv2-small (384-dim)
uv run python src/main.py generate-embeddings --model vit-s

# DINOv2-base (768-dim)
uv run python src/main.py generate-embeddings --model vit-b

# DINOv2-large (1024-dim)
uv run python src/main.py generate-embeddings --model vit-l
```

Run the full benchmark suite:

```bash
uv run python src/main.py benchmark \
  --embeddings data/embeddings/vit-s/embeddings_384d.h5 \
  --vectors 10200 100000 500000 1000000 10000000 \
  --dimensions 384 \
  --methods s3_vectors faiss nmslib bruteforce \
  --output results/full_benchmark.json
```

Options:

- `--vectors`: Vector counts to test (multiple values)
- `--dimensions`: Vector dimensions to test
- `--methods`: Methods to benchmark (s3_vectors, faiss, nmslib, bruteforce)
- `--quick`: Quick test with fewer queries
- `--dry-run`: Estimate time/resources without running
Test S3 Vectors connection and basic operations:

```bash
uv run python src/main.py test-s3 \
  --embeddings data/embeddings/vit-s/embeddings_384d.h5 \
  --vectors 1000000 \
  --dimension 384
```

Generate charts from results:

```bash
# Use latest results file
uv run python src/main.py visualize --latest

# Specify results file
uv run python src/main.py visualize \
  --results-file results/benchmark_results_20250102_120000.json \
  --output-dir results/charts
```

Project structure:

```
s3-vectors-benchmark/
├── README.md               # This file
├── pyproject.toml          # Modern Python package configuration (uv)
├── requirements.txt        # Legacy pip requirements (for reference)
├── setup.py                # Package setup (legacy)
├── config.yaml.example     # Configuration template
├── .gitignore              # Git ignore rules
├── .python-version         # Python version specification
├── src/                    # Source code
│   ├── __init__.py
│   ├── main.py             # CLI entry point
│   ├── config.py           # Configuration management
│   ├── data_loader.py      # Dataset loading
│   ├── embeddings.py       # Embedding generation
│   ├── benchmark.py        # Benchmark orchestration
│   ├── evaluate.py         # Accuracy evaluation
│   ├── visualize.py        # Chart generation
│   ├── utils.py            # Utility functions
│   └── vector_dbs/         # Vector database implementations
│       ├── base.py         # Abstract base class
│       ├── s3_vectors.py   # S3 Vectors implementation
│       ├── faiss_db.py     # FAISS implementation
│       ├── nmslib_db.py    # NMSLib implementation
│       └── bruteforce.py   # Brute-force baseline
├── tests/                  # Test suite
├── notebooks/              # Jupyter notebooks
├── docs/                   # Documentation
├── data/                   # Datasets (git-ignored)
└── results/                # Results (git-ignored)
```
Edit config.yaml to customize:
```yaml
aws:
  region: us-east-1
  profile: default
  bucket_name: your-vector-bucket-name

s3_vectors:
  index_name: benchmark-index
  metric_type: cosine  # or euclidean
  batch_size: 500

benchmark:
  vector_counts: [10200, 100000, 500000, 1000000, 10000000]
  dimensions: [384, 768, 1024]
  topk: 5
  num_queries: 100
  num_repeats: 3
```

After running benchmarks, results are saved to JSON files with:
- Raw measurements: Query latency, result IDs, similarities
- Evaluation metrics: Recall@K, Precision@K, aggregated statistics
- Metadata: Configuration, timestamps, vector counts
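Turning raw per-query measurements into the aggregated statistics can be sketched as below; the exact keys and percentiles are illustrative assumptions, not necessarily the fields the results JSON uses:

```python
import statistics

def summarize(samples: list) -> dict:
    """Aggregate raw per-query latencies (ms) into summary statistics.

    Keys and percentile choices are illustrative, not the project's schema.
    """
    ordered = sorted(samples)
    n = len(ordered)
    return {
        "mean": statistics.mean(ordered),
        "p50": ordered[n // 2],
        "p95": ordered[min(n - 1, int(n * 0.95))],
    }

# One slow outlier dominates p95 but only nudges the mean
print(summarize([12.0, 15.0, 11.0, 90.0]))
```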
Results are stored in JSON format:

```python
import json

with open("results/benchmark_results_20250102.json", "r") as f:
    results = json.load(f)

# Access evaluation metrics
evaluation = results["evaluation"]
for config_key, metrics in evaluation.items():
    print(f"{config_key}:")
    print(f"  Recall@K: {metrics['recall_at_k']['mean']:.3f}")
    print(f"  Query time: {metrics['query_time_ms']['mean']:.2f} ms")
```

Charts are automatically generated showing:
- Processing Time Ratio: Normalized query latency across methods
- Search Accuracy: Recall@K across vector counts
- Processing Time (ms): S3 Vectors query latency
```bash
# Verify credentials
aws sts get-caller-identity

# Or set environment variables
export AWS_ACCESS_KEY_ID=...
export AWS_SECRET_ACCESS_KEY=...
```

Ensure the bucket exists and is accessible:

```bash
aws s3 ls s3://your-bucket-name
```

For large datasets:

- Reduce `batch_size` in the embeddings config
- Process in smaller chunks
- Use GPU for embedding generation
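Chunked processing is simple to sketch as a generic slicing helper (illustrative, not the project's code):

```python
def iter_chunks(items, chunk_size: int):
    """Yield successive fixed-size slices of `items`."""
    for start in range(0, len(items), chunk_size):
        yield items[start:start + chunk_size]

# Embed 1,000 images 100 at a time instead of holding everything in memory
chunks = list(iter_chunks(list(range(1000)), chunk_size=100))
print(len(chunks))  # 10
```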
Some datasets may require manual download. Check:
- Network connectivity
- Sufficient disk space
- Dataset availability
If uv is not found:
- Ensure it's in your PATH
- Restart your terminal after installation
- Use `pip install uv` as a fallback
Contributions are welcome! Please:
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests (run with `uv run pytest`)
- Submit a pull request
MIT License - see LICENSE file for details
If you use this benchmark in your research, please cite:
```bibtex
@software{s3_vectors_benchmark,
  title={S3 Vectors Benchmark},
  author={Siddhant Khare},
  year={2025},
  url={https://github.com/siddhant-k-code/s3-vectors-benchmark}
}
```

- UKBench dataset: http://vis.uky.edu/~stewe/ukbench/
- Microsoft COCO dataset: https://cocodataset.org/
- DINOv2 models: https://github.com/facebookresearch/dinov2
- FAISS: https://github.com/facebookresearch/faiss
- NMSLib: https://github.com/nmslib/nmslib
For issues and questions:
- Open an issue on GitHub
- Check documentation in the `docs/` directory
- Review example notebooks in `notebooks/`