High-performance Python bindings for libphash, a C library for perceptual image hashing.
libphash provides multiple algorithms to generate "perceptual hashes" of images. Unlike cryptographic hashes (like MD5 or SHA256), perceptual hashes change only slightly if the image is resized, compressed, or has minor color adjustments. This makes them ideal for finding duplicate or similar images.
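
To make the contrast with cryptographic hashes concrete, here is a minimal pure-Python sketch. It does not use libphash: `average_hash` and `bit_difference` are hypothetical helpers written for this illustration, and the 8x8 "image" is synthetic. A uniform brightness shift leaves the average hash unchanged, while the SHA-256 digest of the same pixel data changes.

```python
import hashlib


def average_hash(pixels):
    """Return a 64-bit average hash of an 8x8 grid of grayscale values (0-255)."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for p in flat:
        # One bit per pixel: 1 if brighter than the mean, else 0.
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def bit_difference(a, b):
    """Count the bits that differ between two non-negative integers."""
    return bin(a ^ b).count("1")


def sha256_int(pixels):
    """SHA-256 of the raw pixel bytes, as an integer for bit comparison."""
    flat = bytes(p for row in pixels for p in row)
    return int.from_bytes(hashlib.sha256(flat).digest(), "big")


# A synthetic 8x8 gradient and a slightly brightened copy (illustrative data).
original = [[x * 32 + y * 4 for x in range(8)] for y in range(8)]
brightened = [[min(255, p + 3) for p in row] for row in original]

perceptual_diff = bit_difference(average_hash(original), average_hash(brightened))
crypto_diff = bit_difference(sha256_int(original), sha256_int(brightened))

print(f"average-hash bits changed: {perceptual_diff} of 64")
print(f"SHA-256 bits changed:      {crypto_diff} of 256")
```

The algorithms in libphash are more elaborate than this toy average hash, but the property being illustrated is the same: similar images map to hashes that are close in Hamming distance.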
- 64-bit Hashes (`uint64`):
  - `ahash`: Average Hash
  - `dhash`: Difference Hash
  - `phash`: Perceptual Hash (DCT based)
  - `whash`: Wavelet Hash
  - `mhash`: Median Hash
  - `color_hash`: Packed 42-bit HSV-based color hash (compatible with `imagehash.colorhash`).
- Digest Hashes (Multi-byte):
  - `bmh`: Block Mean Hash (256-bit digest).
  - `color_moments_hash`: Statistical color distribution digest (mean, variance, skewness, kurtosis).
  - `radial_hash`: Rotation-invariant Fourier-Mellin transform digest.

Requirements:

- A C compiler (GCC/Clang or MSVC)
- Python 3.8 or higher

Install from PyPI:

    pip install python-libphash
    # or using uv
    uv add python-libphash

Or install from source:

    git clone --recursive https://github.com/yourusername/python-libphash.git
    cd python-libphash
    pip install .
    # or using uv
    uv pip install .

You can quickly compute a hash from the command line after installation:

    python -m libphash.utils --path photo.jpg --method phash

Basic usage from Python:

    from libphash import ImageContext, HashMethod, hamming_distance

    # Use the context manager for automatic memory management
    with ImageContext("photo.jpg") as ctx:
        # Get standard 64-bit hashes
        phash_val = ctx.phash
        dhash_val = ctx.dhash

        print(f"pHash: {phash_val:016x}")
        print(f"dHash: {dhash_val:016x}")

    # Compare two images
    from libphash import compare_images

    distance = compare_images("image1.jpg", "image2.jpg", method=HashMethod.PHASH)
    print(f"Hamming Distance: {distance}")

- Ultra-Fast Image Decoding: `libphash` bundles high-performance decoders for JPEG, PNG, and WebP. It uses `libjpeg-turbo` (TurboJPEG API), `libpng`/`spng`, and `libwebp` with SIMD acceleration (SSE/NEON/AVX2). Image data is loaded via `mmap()` for zero-copy I/O between the file system and the decoder.
  - Fallback: Automatically falls back to `stb_image` for other formats or if bundled decoders are disabled.

Fine-tune hashing algorithms for specific use cases. Note that hashes generated with different parameters are not comparable.

    with ImageContext("photo.jpg") as ctx:
        # pHash (DCT) resolution
        ctx.set_phash_params(dct_size=32, reduction_size=8)

        # Radial Hash precision
        ctx.set_radial_params(projections=40, samples=128)

        # Block-based hashes (BMH) grid resolution
        ctx.set_block_params(block_size=16)

        # Wavelet Hash (wHash) mode: "fast" (default) or "full"
        ctx.set_whash_mode("full")

        # Custom grayscale weights (R, G, B)
        ctx.set_gray_weights(38, 75, 15)

        print(f"Custom pHash: {ctx.phash:016x}")

Algorithms like Radial Hash or Color Hash return a `Digest` object instead of a single integer.

    with ImageContext("photo.jpg") as ctx:
        digest = ctx.radial_hash
        print(f"Digest size: {digest.size} bytes")
        print(f"Raw data: {digest.data.hex()}")

        # Comparing digests
        with ImageContext("photo_v2.jpg") as ctx2:
            digest2 = ctx2.radial_hash

            # Hamming distance for bit-wise comparison
            h_dist = digest.distance_hamming(digest2)

            # L2 (Euclidean) distance for similarity
            l2_dist = digest.distance_l2(digest2)

`ImageContext` is the main class for loading images and computing hashes.

- `__init__(path=None, bytes_data=None)`: Load an image from a file path or memory.
- `set_gamma(gamma: float)`: Set gamma correction.
- `set_gray_weights(r, g, b)`: Set custom RGB weights for grayscale conversion.
- `set_phash_params(dct_size, reduction_size)`: Configure pHash DCT resolution.
- `set_radial_params(projections, samples)`: Configure Radial Hash precision.
- `set_block_params(block_size)`: Configure BMH/mHash grid resolution.
- `set_whash_mode(mode="fast")`: Use `"fast"` (median) or `"full"` (ImageHash-accurate 2D DWT).
- Properties: `ahash`, `dhash`, `phash`, `whash`, `mhash` (return `int`).
- Properties: `bmh`, `color_hash`, `radial_hash` (return `Digest`).

A `Digest` object exposes:

- `data`: The raw `bytes` of the hash.
- `size`: Length of the hash in bytes.
- `distance_hamming(other)`: Calculates bit-wise distance.
- `distance_l2(other)`: Calculates Euclidean distance.
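
As a rough illustration of what the two digest distances measure, the sketch below computes them in pure Python over raw `bytes`. It is not the library's implementation: `digest_hamming` and `digest_l2` are hypothetical helpers, and treating each byte as one unsigned component for the L2 distance is an assumption made here for illustration.

```python
import math


def digest_hamming(a: bytes, b: bytes) -> int:
    """Number of differing bits between two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("digests must have the same length")
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


def digest_l2(a: bytes, b: bytes) -> float:
    """Euclidean distance, treating each byte as one unsigned component.

    Assumption for illustration only; the library may interpret the
    digest contents differently.
    """
    if len(a) != len(b):
        raise ValueError("digests must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Two made-up 3-byte digests that differ slightly in the last two bytes.
d1 = bytes([0xF0, 10, 200])
d2 = bytes([0xF0, 13, 204])

print(digest_hamming(d1, d2))  # 4 differing bits
print(digest_l2(d1, d2))       # 5.0
```

Hamming distance suits digests whose bits are independent features, while L2 distance suits digests whose bytes encode magnitudes.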

Utility functions:

- `hamming_distance(h1: int, h2: int)`: Returns the number of differing bits between two 64-bit integers.
- `ph_can_use_libjpeg()`: Returns `True` if `libjpeg-turbo` is enabled.
- `ph_can_use_libpng()`: Returns `True` if `libpng` or `spng` is enabled.
- `get_hash(path, method)`: Quick way to get a hash without manual context management.
- `compare_images(path1, path2, method)`: Returns the Hamming distance between two image files.
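
A common use of these distances is grouping near-duplicate images. The sketch below is pure Python and does not call libphash: `hamming64` mirrors what `hamming_distance` is described as computing, while `near_duplicates`, the hash values, the file names, and the threshold of 8 are all hypothetical and would need tuning for a real dataset.

```python
from itertools import combinations


def hamming64(h1: int, h2: int) -> int:
    """Number of differing bits between two 64-bit integers."""
    return bin((h1 ^ h2) & 0xFFFFFFFFFFFFFFFF).count("1")


def near_duplicates(hashes, max_distance):
    """Return (name_a, name_b, distance) for every pair within max_distance."""
    pairs = []
    for (name_a, hash_a), (name_b, hash_b) in combinations(sorted(hashes.items()), 2):
        distance = hamming64(hash_a, hash_b)
        if distance <= max_distance:
            pairs.append((name_a, name_b, distance))
    return pairs


# Made-up 64-bit hashes standing in for values returned by a hashing step.
hashes = {
    "a.jpg": 0xFFFF0000FFFF0000,
    "a_small.jpg": 0xFFFF0000FFFF0003,
    "b.jpg": 0x0123456789ABCDEF,
}

pairs = near_duplicates(hashes, max_distance=8)
print(pairs)  # [('a.jpg', 'a_small.jpg', 2)]
```

Comparing all pairs is quadratic in the number of images, which is fine for small collections; larger ones usually need an index structure.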
Since the core logic is implemented in C and uses SIMD-accelerated decoders (SSE4.2, AVX2, NEON), libphash is significantly faster than pure-Python alternatives.

- JPEG Decoding: ~2.0x–6.0x faster than Pillow (TurboJPEG API).
- PNG Decoding: ~1.3x faster than Pillow (spng/libpng).
- WebP Decoding: ~2.5x faster than Pillow (native `libwebp`).
- Zero-Copy: Uses `mmap()` to avoid kernel-user space copies.
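
The idea of memory-mapping a file and choosing a decoder from its header can be sketched with Python's standard `mmap` module. This is only an illustration of the technique, not libphash's C code: `detect_format` and the returned labels are hypothetical, and the real library's dispatch logic may differ.

```python
import mmap
import os
import tempfile


def detect_format(path):
    """Map a file read-only and classify it by its magic bytes."""
    with open(path, "rb") as f:
        # Length 0 maps the whole file; this raises ValueError on empty files.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as view:
            header = view[:12]  # Only the first bytes are copied out.
    if header.startswith(b"\xff\xd8\xff"):
        return "jpeg"
    if header.startswith(b"\x89PNG\r\n\x1a\n"):
        return "png"
    if header[:4] == b"RIFF" and header[8:12] == b"WEBP":
        return "webp"
    return "fallback"  # Hypothetical label for "use a generic decoder".


# Demo with a temporary file that starts with the PNG signature.
with tempfile.NamedTemporaryFile(delete=False, suffix=".bin") as tmp:
    tmp.write(b"\x89PNG\r\n\x1a\n" + b"\x00" * 16)
    tmp_path = tmp.name

detected = detect_format(tmp_path)
os.remove(tmp_path)
print(detected)  # png
```

With a mapping, the decoder reads pages directly from the page cache instead of first copying the file into a separate user-space buffer.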
| Algorithm | imagehash (Pillow) | libphash (Native) | Speedup |
|---|---|---|---|
| pHash (JPEG) | 0.4506s | 0.0667s | 6.76x |
| wHash (JPEG) | 3.2750s | 0.0650s | 50.39x |
| pHash (WebP) | 0.3298s | 0.1240s | 2.66x |
| wHash (WebP) | 2.0520s | 0.1197s | 17.14x |
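
The speedup column is the ratio of the two timings. The short sketch below recomputes it from the rounded values in the table; the dictionary layout and the `speedup` helper are written for this illustration. Recomputing from rounded timings can differ from the table in the last digit (the wHash JPEG row comes out as 50.38 rather than 50.39), presumably because the table was derived from unrounded measurements.

```python
# (imagehash seconds, libphash seconds), copied from the table above.
timings = {
    "pHash (JPEG)": (0.4506, 0.0667),
    "wHash (JPEG)": (3.2750, 0.0650),
    "pHash (WebP)": (0.3298, 0.1240),
    "wHash (WebP)": (2.0520, 0.1197),
}


def speedup(baseline_s: float, native_s: float) -> float:
    """How many times faster the native run is than the baseline."""
    return baseline_s / native_s


speedups = {name: round(speedup(*pair), 2) for name, pair in timings.items()}

for name, value in speedups.items():
    print(f"{name}: {value:.2f}x")
```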
This project is licensed under the MIT License - see the LICENSE file for details.