MAINT: Fix dataset metadata and references #1934

Merged
romanlutz merged 7 commits into microsoft:main from romanlutz:romanlutz/refchecker-bib-fixes
Jun 5, 2026
@romanlutz
Contributor

Description

This PR cleans up dataset attribution and citation metadata so built-in seed datasets expose source authors/groups consistently and references point at the versions of the datasets PyRIT actually loads.

Key changes:

  • Adds and corrects author/group metadata across remote seed dataset loaders.
  • Moves _AUTHORS and _GROUPS onto loader classes as class-level constants and updates seed creation to use self._AUTHORS / self._GROUPS.
  • Applies refchecker-driven fixes in doc/references.bib, including ATR attribution notes, the TDC23 competition note, canonical Crescendo arXiv URL, and the Trojan Source USENIX venue details.
  • Updates AILuminate to cite the v1.0 paper that matches the v1.0 demo prompt set currently loaded.
  • Corrects affiliation metadata for MIC, Multilingual Vulnerability, and Red Team Social Bias.

One related investigation: TDC23 was compared against HarmBench separately. HarmBench is not a superset of TDC23, so this PR does not remove TDC23.

Tests and Documentation

  • uv run academic-refchecker --paper doc/references.bib --report-file refchecker_report.json --debug
  • uv run pytest tests/unit/datasets/ -q
  • uv run ruff check pyrit/
  • uv run ruff check pyrit/datasets/seed_datasets/remote/ doc/code/datasets/1_loading_datasets.py
  • uv run ty check <changed PyRIT files>
  • Parsed doc/references.bib with bibtexparser (83 entries).
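
The entry-count check in the last bullet can be approximated with the standard library alone. The sketch below is a stand-in and not the bibtexparser call used for this PR: `list_bib_entries` is a hypothetical helper, the regex only recognises entries that begin at the start of a line, and the sample text reuses two citation keys from this PR with placeholder fields.

```python
import re

# Matches the start of a BibTeX entry such as "@inproceedings{key," at line start.
_ENTRY_START = re.compile(r"^\s*@(\w+)\s*\{\s*([^,\s]+)\s*,", re.MULTILINE)
# BibTeX pseudo-entries that are not references.
_NON_ENTRIES = {"comment", "string", "preamble"}


def list_bib_entries(text: str) -> list[tuple[str, str]]:
    """Return (entry_type, citation_key) pairs found in BibTeX source text."""
    return [
        (match.group(1).lower(), match.group(2))
        for match in _ENTRY_START.finditer(text)
        if match.group(1).lower() not in _NON_ENTRIES
    ]


# Placeholder entries; only the keys and entry types come from the PR text.
sample = """
@inproceedings{boucher2023trojan,
  title = {placeholder},
}
@misc{mazeika2023tdc,
  note = {placeholder},
}
"""

entries = list_bib_entries(sample)
print(len(entries), entries)
```

Running the same function over the real doc/references.bib would give a quick cross-check of the entry count, though a regex is less robust than a real parser for unusual formatting.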

Docs updated:

  • doc/references.bib
  • doc/bibliography.md
  • doc/code/datasets/1_loading_datasets.py
  • doc/code/datasets/1_loading_datasets.ipynb

JupyText was not rerun because the notebook/source change was a small inline citation-key update applied to both synchronized files.

romanlutz and others added 6 commits June 3, 2026 01:52
- Add module-level _AUTHORS and _GROUPS constants (or self.GROUPS for class-attribute
  style) to 31 remote dataset loaders and thread them through SeedPrompt / SeedObjective
  constructors. groups now consistently captures the academic/institutional affiliations
  of the source paper authors, matching the Seed dataclass semantics.

- Fix 4 misuses of groups= that stored a per-row category instead of author affiliations
  (jbb_behaviors, sorry_bench, ccp_sensitive_prompts, red_team_social_bias). Categories
  moved to metadata; groups now holds the paper affiliations.

- Fix comic_jailbreak _AUTHORS: previous list had 5 unrelated names; corrected to match
  the bibtex entry (tan2026comicjailbreak): Rui Yang Tan, Yujia Hu, Roy Ka-Wei Lee.
  Update the corresponding test_fetch_dataset_authors assertion.

- xstest_dataset, vlguard_dataset: replace plain arxiv URL in docstring with proper
  [@rottger2023xstest] / [@zong2024vlguard] bibtex citation markers.

- agent_threat_rules_dataset: add groups=["ATR Project"] alongside existing authors.

- doc/references.bib: fix wang2025siuo title and booktitle to match the official
  NAACL 2025 publication ("Safe Inputs but Unsafe Output: Benchmarking Cross-modality
  Safety Alignment of Large Vision-Language Models", in Findings of NAACL 2025).
  Surfaced by ref-checking with markrussinovich/refchecker; other refchecker findings
  reviewed and confirmed as false positives from blog/HF page metadata or duplicate
  arxiv versions.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
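
The groups= fix described in this commit can be sketched as follows. This is a minimal illustration under stated assumptions: `SeedSketch` is a hypothetical stand-in for a seed record and not PyRIT's actual Seed class, the row keys `prompt` and `category` are invented, and the author and affiliation values are placeholders.

```python
from dataclasses import dataclass, field


@dataclass
class SeedSketch:
    """Hypothetical stand-in for a seed record; not PyRIT's real Seed class."""

    value: str
    authors: list[str] = field(default_factory=list)
    groups: list[str] = field(default_factory=list)
    metadata: dict[str, str] = field(default_factory=dict)


# Placeholder attribution values, invented for illustration only.
AUTHORS = ["Example Author"]
GROUPS = ["Example University"]


def build_seed(row: dict[str, str]) -> SeedSketch:
    # The misuse being fixed would look like groups=[row["category"]].
    # Here the per-row category goes into metadata, and groups carries
    # the affiliations of the source paper's authors.
    return SeedSketch(
        value=row["prompt"],
        authors=list(AUTHORS),
        groups=list(GROUPS),
        metadata={"category": row["category"]},
    )


seed = build_seed({"prompt": "example prompt", "category": "example-category"})
print(seed.groups, seed.metadata)
```

Keeping the category in metadata means it can still be filtered on per row, while groups stays constant across every seed from the same dataset.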
Second-pass refchecker triage (academic-refchecker v3.0.141). Re-ran against doc/references.bib; the prior rate-limited Semantic Scholar errors (kingma2014adam, rottger2025msts) are now confirmed false positives. The summary below covers only the 4 real fixes; the remaining flags are documented as false positives in the session notes.

Real fixes:

- atr2026: expanded note to explain why "ATR Community" is retained as co-author even though Zenodo lists only Kuan-Hsin Lin (intentional upstream attribution).

- mazeika2023tdc: expanded note to clarify there is no canonical paper write-up (NeurIPS 2023 Trojan Detection Challenge competition; cited URL is the competition site).

- russinovich2024crescendo: switched primary url to the canonical arXiv (2404.01833); moved project site to note.

- boucher2023trojan: converted @misc to @inproceedings with proper booktitle "32nd USENIX Security Symposium (USENIX Security 23)" and pages 6507-6524; arXiv preprint reference moved into note. Addresses refchecker venue-missing error.

Documented false positives (no edit applied):

- promptfoo2025ccp, robustintelligence2024bypass, embracethered2024unicode, embracethered2025sneakybits: page metadata returns site/org as author; bibtex correctly cites the human author.

- vantaylor2024socialbias: HF dataset page metadata; curated title and author (Simone Van Taylor / svannie678) are correct.

- gong2025figstep, wang2025siuo (year): arXiv preprint year vs conference publication year (AAAI 2025 / NAACL 2025); @inproceedings year is correct.

- adversaai2023universal: content-similarity heuristic; page matches citation.

- aakanksha2024multilingual: refchecker parser misses single-name first author "Aakanksha"; bibtex has all 7 authors correctly.

- kingma2014adam, rottger2025msts: previously rate-limited; fresh run verifies cleanly.

- ji2023beavertails: skipped per user instruction.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Move _AUTHORS and _GROUPS from module-level globals onto the dataset loader classes across remote seed datasets. Update call sites to use self._AUTHORS/self._GROUPS so metadata follows the class-level constant convention.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
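
The class-level constant convention from this commit might look roughly like the sketch below. The class names, method name, and attribution values are hypothetical and do not correspond to any real PyRIT loader; only the `_AUTHORS` / `_GROUPS` names and the `self._AUTHORS` / `self._GROUPS` access pattern come from the commit message.

```python
class ExampleRemoteDataset:
    """Hypothetical loader; the class name and values are illustrative only."""

    # Class-level constants take the place of module-level globals.
    _AUTHORS: tuple[str, ...] = ("Example Author",)
    _GROUPS: tuple[str, ...] = ("Example University",)

    def build_seed(self, prompt: str) -> dict[str, object]:
        # self._AUTHORS is looked up on the instance and then on its class,
        # so a subclass can override the attribution without changing this method.
        return {
            "value": prompt,
            "authors": list(self._AUTHORS),
            "groups": list(self._GROUPS),
        }


class DerivedDataset(ExampleRemoteDataset):
    """Hypothetical subclass that overrides only the affiliation."""

    _GROUPS = ("Another Lab",)


base_seed = ExampleRemoteDataset().build_seed("example prompt")
derived_seed = DerivedDataset().build_seed("example prompt")
print(base_seed["groups"], derived_seed["groups"])
```

One practical consequence of this layout, independent of the PR's stated motivation, is that each loader's attribution lives next to the class that uses it instead of in shared module scope.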
Update AILuminate to cite the v1.0 paper matching the v1.0 demo prompt set that the loader pulls. Correct MIC to use Meta AI Research, limit multilingual vulnerability groups to University of Sydney and Massey University, and avoid affiliating Simone Van Taylor with Humane Intelligence.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@romanlutz romanlutz added this pull request to the merge queue Jun 5, 2026
Merged via the queue into microsoft:main with commit 47ffa93 Jun 5, 2026
52 checks passed
@romanlutz romanlutz deleted the romanlutz/refchecker-bib-fixes branch June 5, 2026 13:08
romanlutz pushed a commit to romanlutz/PyRIT that referenced this pull request Jun 5, 2026
Picks up 3 commits from main: PR microsoft#1883 (keyword-only init enforcement), microsoft#1934 (dataset metadata), microsoft#1939 (noqa audit), plus microsoft#1900 (content-filter constants) and microsoft#1947 (notebook fixes).

Resolved 6 conflicts in lazy-import noqa/type:ignore annotations by taking main's cleaner version (the # type: ignore[ty:unresolved-import] entries trigger unused-ignore warnings once --extra all is installed).

Re-doing a previous merge attempt that aborted without setting MERGE_HEAD, so the prior merge commit was a single-parent commit that silently dropped origin/main's recent changes.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
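
The single-parent problem described in this commit message can be detected by counting a commit's parents. The sketch below parses one line of `git rev-list --parents` output, which prints a commit hash followed by its parent hashes; `parent_count` and `is_true_merge` are hypothetical helper names, and the hashes in the example are shortened placeholders.

```python
def parent_count(rev_list_line: str) -> int:
    """Count parents from one line of `git rev-list --parents` output.

    Each line has the form "<commit> <parent1> <parent2> ...".
    """
    hashes = rev_list_line.split()
    if not hashes:
        raise ValueError("empty rev-list line")
    return len(hashes) - 1


def is_true_merge(rev_list_line: str) -> bool:
    # A real merge commit records at least two parents; a commit created
    # after an aborted merge (no MERGE_HEAD) records only one.
    return parent_count(rev_list_line) >= 2


# Placeholder hashes standing in for real commit IDs.
single_parent = "aaa111 bbb222"
two_parents = "ccc333 bbb222 ddd444"
print(is_true_merge(single_parent), is_true_merge(two_parents))
```

In practice the input line could come from `git rev-list --parents -n 1 HEAD`, run right after the merge to confirm that both parents were recorded.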