Query the contents of images with ClipQuery.
An easy interface for querying CLIP models from open_clip.
Made for educational porpoises🐬. You should probably use clip-retrieval for heavy-duty stuff.
It's as easy as:

```python
cq = ClipQuery()
images = cq.encode_images(["dog.jpg", "cat.jpg"])
scores = cq.query(images, "a picture of a dog")
# >>> scores = [25, -1]
# higher score means a better match per image
```

ClipQuery will automagically (I have sinned) detect a GPU and use it if available.
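Conceptually, the score is a scaled cosine similarity between each image embedding and the text embedding. A minimal pure-Python sketch with toy 2-D embeddings (the vectors and the 100x scale factor here are illustrative assumptions, not the library's internals):

```python
import math

def cosine_score(image_vec, text_vec, scale=100.0):
    """CLIP-style score: scaled cosine similarity of two embeddings."""
    dot = sum(a * b for a, b in zip(image_vec, text_vec))
    norm_i = math.sqrt(sum(a * a for a in image_vec))
    norm_t = math.sqrt(sum(b * b for b in text_vec))
    return scale * dot / (norm_i * norm_t)

# Toy embeddings: the "dog image" points the same way as the dog text query.
dog_img = [0.9, 0.1]
cat_img = [-0.2, 0.8]
dog_text = [1.0, 0.0]

scores = [cosine_score(v, dog_text) for v in (dog_img, cat_img)]
# The dog image scores high and positive; the cat image scores negative.
```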
Example
Check out example.ipynb for a walkthrough use case on the imagenette image data.
Check out cql_example.ipynb for a walkthrough use case on the imagenette image data.
Given your dataframe df, add an image_encoding column:

```python
cql = CQL(df)
df["image_encoding"] = cql.encode_images(df["id"], base_path="./data/imagenette")
```

Query concepts in the df dataframe by name directly with SQL syntax and the clip function. Note that we also reference the image_encoding column:
```sql
SELECT *, clip(image_encoding, 'a picture of cute puppy dogs') as puppy_concept FROM df
WHERE label = 'English Springer Spaniel'
ORDER BY puppy_concept DESC
```

In Python:
```python
puppy_springer_spaniels = cql(
    """SELECT *, clip(image_encoding, 'a picture of cute puppy dogs') as puppy_concept FROM df
    WHERE label = 'English Springer Spaniel'
    ORDER BY puppy_concept DESC
    """
)
```

The code is 100 lines! C'mon, are you this lazy? Check out clip_query.py for the code.
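A `clip()` SQL function like the one above can be exposed by any SQL engine that supports user-defined scalar functions. Here is a minimal sketch using stdlib sqlite3, with a toy token-overlap scorer standing in for real CLIP similarity (the table contents and the scoring stub are invented for illustration):

```python
import sqlite3

def clip(image_encoding, text):
    # Toy stand-in for CLIP similarity: fraction of encoding tokens
    # that appear in the query text. The real thing would compare a
    # stored image embedding against the encoded text query.
    tokens = image_encoding.split()
    return sum(t in text for t in tokens) / len(tokens)

conn = sqlite3.connect(":memory:")
conn.create_function("clip", 2, clip)  # register clip() as a SQL function
conn.execute("CREATE TABLE df (id TEXT, label TEXT, image_encoding TEXT)")
conn.executemany(
    "INSERT INTO df VALUES (?, ?, ?)",
    [("dog.jpg", "English Springer Spaniel", "puppy dog"),
     ("cat.jpg", "Tabby", "feline cat")],
)
rows = conn.execute(
    """SELECT id, clip(image_encoding, 'a picture of cute puppy dogs') AS puppy_concept
       FROM df ORDER BY puppy_concept DESC"""
).fetchall()
# rows is sorted so the best "puppy" match comes first.
```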
TODO
- Add support for many text queries (i.e., a classification task).
- Have a standard SQL API so you can do complex querying with CLIP and your own data.
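The "many text queries" item amounts to zero-shot classification: score one image against several prompts and softmax the scores into class probabilities. A hedged pure-Python sketch (the prompts and raw scores below are invented; they stand in for what `cq.query()` would return):

```python
import math

def classify(scores_per_prompt):
    """Softmax over per-prompt scores -> probability per class."""
    m = max(scores_per_prompt.values())  # subtract max for stability
    exps = {k: math.exp(v - m) for k, v in scores_per_prompt.items()}
    total = sum(exps.values())
    return {k: v / total for k, v in exps.items()}

# Hypothetical per-prompt scores for a single image.
probs = classify({"a dog": 25.0, "a cat": -1.0, "a car": 3.0})
best = max(probs, key=probs.get)  # the predicted class
```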
