feat(vision-metrics): split img_edit_score by davidberenstein1957 · Pull Request #651 · PrunaAI/pruna

davidberenstein1957 · 2026-04-28T13:04:13Z

Summary

Splits img_edit_score into its own stacked PR, adds ImageEditScoreMetric, and wires up the ImgEdit benchmark entry, with regression test coverage for score clamping.

This PR also carries benchmark-paper alignment cleanup from the umbrella work while preserving compatibility:

keeps text_to_image task_type literal behavior
introduces TASK_TYPE_* constants for readability
removes private-reference style notes
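
As an illustration of the compatibility point above, the sketch below shows one way named constants can sit next to a string literal type without changing what callers pass. Only the `text_to_image` value and the `TASK_TYPE_*` naming pattern come from this PR; the `TaskType` alias, the `image_edit` value, and the `is_text_to_image` helper are hypothetical and may not match the actual code in `benchmarks.py`.

```python
from typing import Literal

# Hypothetical alias: the PR only states that the text_to_image literal keeps working.
TaskType = Literal["text_to_image", "image_edit"]

# Constants are plain strings, so existing code that passes or compares
# the bare literal "text_to_image" behaves exactly as before.
TASK_TYPE_TEXT_TO_IMAGE: TaskType = "text_to_image"
TASK_TYPE_IMAGE_EDIT: TaskType = "image_edit"  # assumed value, for illustration only


def is_text_to_image(task_type: str) -> bool:
    """Return True when the task type matches the text_to_image literal."""
    return task_type == TASK_TYPE_TEXT_TO_IMAGE


# Old-style literal usage and new constant usage are interchangeable.
assert is_text_to_image("text_to_image")
assert is_text_to_image(TASK_TYPE_TEXT_TO_IMAGE)
```

Because the constants are ordinary `str` values, they add readability without introducing a new type that downstream code would have to adopt.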

Stack Position

Base: PR feat(vision-metrics): split vie_score #650 (feat/vlm-pr-4b-vie-score)
Next: PR feat(e2e-tests): stacked e2e after split metrics #641 (feat/vlm-pr-5-e2e-tests)
Final integration: PR feat(e2e-tests): stacked e2e after split metrics #641 (feat/vlm-pr-5-e2e-tests)
Canonical umbrella reference: PR feat(evaluation): add VLMMetrics #545 (feat/metrics-vlm-support)

Files

src/pruna/evaluation/metrics/metric_img_edit_score.py
src/pruna/evaluation/benchmarks.py
tests/evaluation/test_vision_metrics.py

Test Plan

uv run pytest tests/evaluation/test_vision_metrics.py -k img_edit_score

Review Focus

ImgEdit score clamping behavior
Benchmark metadata/docs alignment without breaking changes to task_type
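
For reviewers unfamiliar with the clamping concern, here is a minimal sketch of what clamping a judge score and covering it with a regression check could look like. The `clamp_score` function, its default bounds of 0.0 and 1.0, and the choice to map NaN to the lower bound are all assumptions made for illustration; the actual range and handling in `ImageEditScoreMetric` may differ.

```python
import math


def clamp_score(raw: float, low: float = 0.0, high: float = 1.0) -> float:
    """Clamp a raw score into [low, high] (hypothetical helper, assumed bounds)."""
    if math.isnan(raw):
        # Assumption: treat an unparseable or NaN score as the lowest score.
        return low
    return max(low, min(high, raw))


# Regression-style checks: out-of-range values are pulled back to the bounds,
# in-range values pass through unchanged.
assert clamp_score(1.7) == 1.0
assert clamp_score(-0.2) == 0.0
assert clamp_score(0.42) == 0.42
assert clamp_score(float("nan")) == 0.0
assert clamp_score(12.0, low=0.0, high=10.0) == 10.0
```

Checks of this shape would be picked up by the `-k img_edit_score` filter in the test plan only if the test names contain that substring.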

Review Flow (Order)

Review the stack in this exact order:

feat(vendor): add LLM2Vec embedding model #637 vendor
feat(infrastructure): add VLM base classes and utilities #638 infrastructure
feat(text-metrics): split qa_accuracy #645 qa_accuracy
feat(text-metrics): split oneig_alignment #646 oneig_alignment
feat(text-metrics): split text_score pair #647 text_score pair
feat(text-metrics): split oneig_reasoning #648 oneig_reasoning
feat(vision-metrics): split vqa #649 vqa
feat(vision-metrics): split vie_score #650 vie_score
feat(vision-metrics): split img_edit_score #651 img_edit_score
feat(e2e-tests): stacked e2e after split metrics #641 e2e tests

This PR in the flow (9/10)

Review after PR feat(vision-metrics): split vie_score #650.
Next PR to review: feat(e2e-tests): stacked e2e after split metrics #641.
Confirm this PR's tests and scope before continuing.

github-actions · 2026-05-19T00:29:23Z

This PR has been inactive for 10 days and is now marked as stale.

github-actions · 2026-06-30T00:30:30Z

This PR has been inactive for 10 days and is now marked as stale. It will be closed in 7 days if there is no further activity.


This was referenced Apr 28, 2026

feat(text-metrics): add text-based VLM judge metrics #639

Closed

feat(vision-metrics): add vision-based VLM judge metrics #640

Closed

davidberenstein1957 force-pushed the feat/vlm-pr-4b-vie-score branch from 4c9b3d3 to 693f888 Compare May 8, 2026 09:01

davidberenstein1957 force-pushed the feat/vlm-pr-4c-img-edit-score branch from c6f1166 to f4a489b Compare May 8, 2026 09:01

github-actions Bot added the stale label May 19, 2026

davidberenstein1957 force-pushed the feat/vlm-pr-4b-vie-score branch from 693f888 to 01406d1 Compare June 2, 2026 17:30

davidberenstein1957 force-pushed the feat/vlm-pr-4c-img-edit-score branch from f4a489b to 2713f3d Compare June 2, 2026 17:30

github-actions Bot removed the stale label Jun 19, 2026

github-actions Bot added the stale label Jun 30, 2026

davidberenstein1957 force-pushed the feat/vlm-pr-4b-vie-score branch from 52f8cbc to eecabcf Compare July 2, 2026 13:25

feat(vision-metrics): split img_edit_score into dedicated branch

42d9254

Co-authored-by: Cursor <cursoragent@cursor.com>

davidberenstein1957 force-pushed the feat/vlm-pr-4b-vie-score branch from eecabcf to 0beeaba Compare July 2, 2026 13:51

davidberenstein1957 force-pushed the feat/vlm-pr-4c-img-edit-score branch from 70dcdbf to 42d9254 Compare July 2, 2026 13:51

fix(ci): lint, format, and benchmark registration

21cd903

- sync OneIG subset dataset loaders for benchmark registration
- ruff check/format on changed VLM src files

github-actions Bot removed the stale label Jul 3, 2026

davidberenstein1957 wants to merge 2 commits into feat/vlm-pr-4b-vie-score from feat/vlm-pr-4c-img-edit-score