FEAT: Backfill class-level metadata for all remote seed datasets by romanlutz · Pull Request #1780 · microsoft/PyRIT

romanlutz · 2026-05-22T16:30:22Z

What

Adds class-level tags, size, and modalities to every remote seed dataset loader so they participate in SeedDatasetFilter discovery (e.g. SeedDatasetFilter(tags={"default"})). Before this change, only 5 of ~33 remote loaders declared metadata, so the others were silently skipped by metadata-driven filtering.

This is the follow-up to the review discussion on #1757 (cc @jsong468), where the reviewer asked why some loaders declared these fields and others didn't. Answer: most predate the metadata schema and simply hadn't been backfilled.
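
To illustrate why loaders without class-level metadata were skipped, here is a minimal sketch of tag-based filtering. `FakeFilter`, `UntaggedLoader`, `TaggedLoader`, and `matches` are hypothetical stand-ins, not PyRIT's actual `SeedDatasetFilter` implementation, and the subset-matching semantics are an assumption.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class FakeFilter:
    """Hypothetical stand-in for a tag-based dataset filter."""

    tags: frozenset = field(default_factory=frozenset)


class UntaggedLoader:
    """Loader that predates the metadata schema: no class-level tags."""


class TaggedLoader:
    """Loader that declares class-level tags."""

    tags = frozenset({"default", "safety"})


def matches(loader_cls, flt):
    # A loader with no declared tags can never satisfy a tag filter,
    # so it is skipped without any error being raised.
    declared = getattr(loader_cls, "tags", None)
    if not declared:
        return False
    # Assumption: every requested tag must be declared on the loader.
    return flt.tags <= set(declared)


flt = FakeFilter(tags=frozenset({"default"}))
selected = [c.__name__ for c in (UntaggedLoader, TaggedLoader) if matches(c, flt)]
print(selected)
```

Under these assumptions only `TaggedLoader` is selected; the untagged loader drops out quietly, which is the behavior the backfill is meant to remove.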

How

- Pinned a canonical, advisory tag vocabulary as RECOMMENDED_TAGS in pyrit/datasets/seed_datasets/seed_metadata.py. Users can still set custom tags: the metadata parser does not enforce the vocabulary, but a new parametrized coverage test does for the built-in remote loaders.
- Documented the 5-condition rule for the special default tag inline in seed_metadata.py:
  1. Ungated: no HF token, API key, auth, or signup.
  2. Citable: peer-reviewed paper or established benchmark.
  3. Single-call: await loader.fetch_dataset_async() works with no manual setup.
  4. Size >= medium (>= 100 prompts).
  5. Broadly applicable: not narrowly scoped to a vertical (medical, legal, cybersecurity). Cross-cutting axes like privacy, bias, multimodal, multilingual, refusal, and jailbreak DO count.
- Walked every remote loader and assigned size / tags / modalities based on the loader's docstring, tests, and upstream dataset card. Added inline # N prompts comments next to each size so reviewers can verify the bucket choice locally.
- Renamed the non-canonical multilingual_culture tag on _SGXSTestDataset to multilingual and dropped its default tag (SGXSTest is gated on HF).
- Marked _ORBenchBaseDataset as should_register = False (it has no usable dataset_name) and explicitly opted the three OR-Bench leaf classes back in.
- Added TestRemoteLoaderMetadataCoverage, a parametrized test that walks every concrete _RemoteDatasetLoader subclass via auto-registration and asserts: metadata is present, tags/size/modalities are non-empty, size is in SeedDatasetSizeCategory, and tags is a subset of RECOMMENDED_TAGS (catches future typos like multilingual_culture).
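
The coverage check described in the last item can be sketched roughly as follows. This is not the test added in the PR: `BaseLoader`, `GoodLoader`, `TypoLoader`, and both helper functions are hypothetical, the tag set is an illustrative subset of the vocabulary, and the size names are simply the buckets that appear in the backfill table.

```python
# Illustrative subset of the tag vocabulary (assumption, not the real constant).
RECOMMENDED_TAGS = frozenset({"default", "safety", "jailbreak", "multilingual", "refusal"})
# Bucket names as they appear in the backfill table.
SIZES = frozenset({"small", "medium", "large", "huge"})


class BaseLoader:
    """Hypothetical stand-in for the remote loader base class."""


class GoodLoader(BaseLoader):
    size = "medium"
    tags = frozenset({"safety", "refusal"})
    modalities = ("text",)


class TypoLoader(BaseLoader):
    size = "medium"
    tags = frozenset({"safety", "multilingual_culture"})  # not in the vocabulary
    modalities = ("text",)


def all_subclasses(cls):
    # Recursively collect direct and indirect subclasses.
    found = []
    for sub in cls.__subclasses__():
        found.append(sub)
        found.extend(all_subclasses(sub))
    return found


def metadata_problems(loader_cls):
    # Return a list of human-readable problems; empty means the class passes.
    problems = []
    tags = getattr(loader_cls, "tags", None)
    if not tags:
        problems.append("tags missing or empty")
    elif not set(tags) <= RECOMMENDED_TAGS:
        problems.append(f"unknown tags: {sorted(set(tags) - RECOMMENDED_TAGS)}")
    if getattr(loader_cls, "size", None) not in SIZES:
        problems.append("size missing or not a known bucket")
    if not getattr(loader_cls, "modalities", None):
        problems.append("modalities missing or empty")
    return problems


report = {cls.__name__: metadata_problems(cls) for cls in all_subclasses(BaseLoader)}
print(report)
```

In a pytest suite the same check would typically be wrapped in `pytest.mark.parametrize` over the collected classes, so each loader shows up as its own test case.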

Class-level harm_categories is intentionally deferred — per-row SeedPrompt.harm_categories already labels individual prompts; picking a "broadest" class-level summary is a judgment call better made by domain owners in a focused follow-up.
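
The `should_register = False` change to the OR-Bench loaders, listed under How, can be pictured with a small registration sketch. The `__init_subclass__` mechanism, the `REGISTRY` dict, and all class names below are assumptions made for illustration; only the `should_register` flag and the opt-out/opt-in pattern come from the description above.

```python
REGISTRY = {}


class LoaderBase:
    """Hypothetical base class that auto-registers its subclasses."""

    should_register = True

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # The flag is resolved through normal attribute inheritance, so a
        # subclass of an opted-out base stays opted out unless it overrides it.
        if cls.should_register:
            REGISTRY[cls.__name__] = cls


class ORBenchBase(LoaderBase):
    should_register = False  # shared plumbing only, no usable dataset name


class ORBenchHard(ORBenchBase):
    should_register = True  # leaf class explicitly opts back in


class ORBenchForgotten(ORBenchBase):
    pass  # inherits False and is left out of the registry


print(sorted(REGISTRY))
```

Because the flag is inherited in this sketch, marking the base class alone would also hide every leaf class, which is why the leaves have to opt back in explicitly.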

Backfill table

Loader	size	modalities	tags	default?
`_AegisContentSafetyDataset`	huge	(text,)	{default, safety}	YES
`_AyaRedteamingDataset`	medium	(text,)	{safety, multilingual}	no — per-language ~few hundred
`_BabelscapeAlertDataset`	huge	(text,)	{default, safety, jailbreak}	YES
`_BeaverTailsDataset`	huge	(text,)	{default, safety}	YES
`_CBTBenchDataset`	medium	(text,)	{safety, medical}	no — vertical
`_CCPSensitivePromptsDataset`	small	(text,)	{safety, multilingual}	no
`_DarkBenchDataset`	medium	(text,)	{default, safety}	YES
`_EquityMedQADataset`	medium	(text,)	{safety, bias, medical}	no — vertical
`_ForbiddenQuestionsDataset`	medium	(text,)	{default, safety, jailbreak}	YES
`_HarmBenchMultimodalDataset`	small	(text, image)	{safety, jailbreak, multimodal}	no
`_HarmfulQADataset`	large	(text,)	{default, safety, jailbreak}	YES
`_JBBBehaviorsDataset`	small	(text,)	{safety, jailbreak}	no
`_LibrAIDoNotAnswerDataset`	medium	(text,)	{default, safety, refusal}	YES
`_LLMLatentAdversarialTrainingDataset`	large	(text,)	{default, safety, jailbreak}	YES
`_MedSafetyBenchDataset`	large	(text,)	{safety, medical}	no — vertical
`_MLCommonsAILuminateDataset`	large	(text,)	{default, safety}	YES
`_MultilingualVulnerabilityDataset`	medium	(text,)	{default, safety, multilingual}	YES
`_ORBench80KDataset`	huge	(text,)	{default, safety, refusal}	YES
`_ORBenchHardDataset`	large	(text,)	{default, safety, refusal}	YES
`_ORBenchToxicDataset`	large	(text,)	{default, safety, refusal}	YES
`_PKUSafeRLHFDataset`	huge	(text,)	{default, safety}	YES
`_PromptIntelDataset`	medium	(text,)	{safety, jailbreak, cybersecurity}	no — API key
`_RedTeamSocialBiasDataset`	small	(text,)	{safety, bias, multiturn}	no
`_SaladBenchDataset`	huge	(text,)	{default, safety, jailbreak}	YES
`_SimpleSafetyTestsDataset`	small	(text,)	{safety}	no
`_SorryBenchDataset`	large	(text,)	{safety, jailbreak, synthetic}	no — gated
`_SOSBenchDataset`	large	(text,)	{safety, medical, cybersecurity}	no — vertical
`_TDC23RedteamingDataset`	small	(text,)	{safety, jailbreak}	no
`_ToxicChatDataset`	huge	(text,)	{default, safety, multiturn}	YES
`_TransphobiaAwarenessDataset`	medium	(text,)	{default, safety, bias}	YES
`_VLGuardDataset`	large	(text, image)	{safety, multimodal}	no — gated
`_VLSUMultimodalDataset`	large	(text, image)	{default, safety, multimodal}	YES
`_XSTestDataset`	medium	(text,)	{default, safety, refusal}	YES
`_SGXSTestDataset` (fixed)	medium	(text,)	{safety, multilingual}	no — gated, was `{default, safety, multilingual_culture}`
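
The "default?" column follows the 5-condition rule from the How section. As a rough sketch, the rule can be written as a single predicate. `DatasetFacts`, `SIZE_ORDER`, and `qualifies_for_default` are hypothetical names, the example values are made up rather than taken from any real dataset, and the small < medium < large < huge ordering is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetFacts:
    """Hypothetical record of the facts the five conditions ask about."""

    gated: bool          # needs an HF token, API key, auth, or signup
    citable: bool        # peer-reviewed paper or established benchmark
    single_call: bool    # fetch works with no manual setup
    size: str            # "small", "medium", "large", or "huge"
    vertical_only: bool  # narrowly scoped to medical, legal, cybersecurity, ...


# Assumed ordering of the size buckets, smallest first.
SIZE_ORDER = ("small", "medium", "large", "huge")


def qualifies_for_default(facts):
    # Condition 4: size must be at least "medium".
    big_enough = SIZE_ORDER.index(facts.size) >= SIZE_ORDER.index("medium")
    return (
        not facts.gated
        and facts.citable
        and facts.single_call
        and big_enough
        and not facts.vertical_only
    )


open_medium = DatasetFacts(False, True, True, "medium", False)
gated_large = DatasetFacts(True, True, True, "large", False)
open_small = DatasetFacts(False, True, True, "small", False)
vertical_large = DatasetFacts(False, True, True, "large", True)

results = [
    qualifies_for_default(f)
    for f in (open_medium, gated_large, open_small, vertical_large)
]
print(results)
```

Any single failed condition is enough to withhold the tag, which matches the one-word reasons ("gated", "vertical") given in the table.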

Already-tagged (unchanged): _HarmBenchDataset, _ComicJailbreakDataset, _VisualLeakBenchDataset.

No-op

No public API changes
No runtime behavior changes
No changes to per-row SeedPrompt.harm_categories
The 4 already-tagged loaders are not migrated to immutable frozenset / tuple style (cosmetic — out of scope)

Discussion link

#1757

Adds class-level `tags`, `size`, and `modalities` to all remote seed dataset loaders so they participate in `SeedDatasetFilter` discovery. Pins a recommended tag vocabulary and the 5-condition rule for the special `default` tag in `seed_metadata.py` as a soft contract, and enforces it via a new parametrized coverage test in `test_seed_dataset_provider.py`. Also renames `_SGXSTestDataset`'s non-canonical `multilingual_culture` tag to `multilingual` and drops `default` (the dataset is gated), and gates `_ORBenchBaseDataset` from auto-registration since it is not a usable loader on its own. No runtime behavior or public API changes.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

behnam-o

couple of minor comments, but also looks good as is.

…taset-metadata

# Conflicts:
#	pyrit/datasets/seed_datasets/remote/comic_jailbreak_dataset.py
#	pyrit/datasets/seed_datasets/remote/harmbench_multimodal_dataset.py
#	pyrit/datasets/seed_datasets/remote/visual_leak_bench_dataset.py
#	pyrit/datasets/seed_datasets/remote/vlsu_multimodal_dataset.py

Brings in 3 new commits from main:
- 648faa9 FEAT: Backfill class-level metadata for all remote seed datasets (microsoft#1780)
- 092126d MAINT: Migrate AddImage/AddTextImage converter deprecations to print_deprecation_message (microsoft#1875)
- 376e000 FEAT: Add TatweelConverter for Arabic kashida insertion (microsoft#1869)

Conflict resolution (11 files): took main's version everywhere (`git checkout --theirs`), then re-ran `ruff check --fix` to re-apply the PEP 604 sweep to main's new code (~36 violations auto-fixed). Same hand-fix for the runtime `Optional[dict]` in `pyrit/models/message_piece.py` PlainSerializer `return_type` that ruff can't auto-rewrite.

Verification:
- ruff check pyrit/ tests/ doc/ - clean
- ruff format --check - clean
- pytest tests/unit -n 4 - 8977 passed, 5 skipped, 0 failures

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

The merge from origin/main brought in TestDarkBenchDataset.test_fetch_dataset_with_custom_config (added in microsoft#1780), which constructed _DarkBenchDataset(split="test") and asserted that the kwarg was forwarded to _fetch_from_huggingface_async. That contract no longer holds: this PR deprecates the dead split kwarg and hardcodes split="train" at the DarkBench call site, since upstream apart/darkbench publishes only the "train" split. Drop the deprecated kwarg from the constructor call and assert the hardcoded "train" literal that actually flows through.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
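
A minimal sketch of what asserting the hardcoded split could look like, using `unittest.mock.AsyncMock`. `FakeDarkBenchLoader` is a hypothetical stand-in, not the real `_DarkBenchDataset`, and the `dataset_name` keyword is an assumption; only the `split="train"` literal and the `apart/darkbench` name come from the message above.

```python
import asyncio
from unittest.mock import AsyncMock


class FakeDarkBenchLoader:
    """Hypothetical loader; only the split handling mirrors the description."""

    def __init__(self):
        # Stand-in for the real Hugging Face fetch helper.
        self._fetch_from_huggingface_async = AsyncMock(return_value=[])

    async def fetch_dataset_async(self):
        # The split is hardcoded because upstream publishes only "train".
        return await self._fetch_from_huggingface_async(
            dataset_name="apart/darkbench",
            split="train",
        )


loader = FakeDarkBenchLoader()
asyncio.run(loader.fetch_dataset_async())

# The helper must have been awaited exactly once, with the literal split.
loader._fetch_from_huggingface_async.assert_awaited_once()
forwarded = loader._fetch_from_huggingface_async.call_args.kwargs
print(forwarded["split"])
```

Asserting on the forwarded keyword arguments, rather than on a constructor kwarg, keeps the test aligned with what actually reaches the fetch helper.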

behnam-o approved these changes May 22, 2026

Comment thread pyrit/datasets/seed_datasets/remote/harmbench_multimodal_dataset.py Outdated

Comment thread pyrit/datasets/seed_datasets/remote/forbidden_questions_dataset.py Outdated

romanlutz and others added 3 commits May 22, 2026 13:20

Replace estimated seed counts with exact loaded counts and fix size buckets

c24fae9

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Introduce Modality enum and migrate seed dataset loaders

ef8b294

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

rlundeen2 assigned behnam-o May 28, 2026

romanlutz and others added 5 commits May 29, 2026 20:44

Merge remote-tracking branch 'origin/main' into romanlutz/backfill-dataset-metadata

597b199

Backfill CoCoNot contrast tags and extend RECOMMENDED_TAGS for agent threats

17349d0

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Merge remote-tracking branch 'origin/main' into romanlutz/backfill-dataset-metadata

b89e112

Align MIC tags with RECOMMENDED_TAGS and add ethics tag

2a7d119

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Merge remote-tracking branch 'origin/main' into romanlutz/backfill-dataset-metadata

351a537

# Conflicts:
#	pyrit/models/__init__.py

rlundeen2 approved these changes Jun 1, 2026

romanlutz enabled auto-merge June 1, 2026 23:53

romanlutz added this pull request to the merge queue Jun 1, 2026

Merged via the queue into microsoft:main with commit 648faa9 Jun 2, 2026
80 of 81 checks passed

romanlutz deleted the romanlutz/backfill-dataset-metadata branch June 2, 2026 00:14

romanlutz merged 9 commits into microsoft:main from romanlutz:romanlutz/backfill-dataset-metadata
