DOC: Auto-link symbol references in generated API docs #1823

Merged: romanlutz merged 7 commits into microsoft:main from romanlutz:romanlutz/myst-cross-refs-audit on Jun 1, 2026

@romanlutz (Contributor) commented May 28, 2026

Before / after

Both screenshots show the same section of the rendered API page: the pyrit.registry.AttackTechniqueRegistry.build_factory_from_spec docstring. Before, every symbol reference is a dead pink code span; after, the PyRIT class names are underlined, clickable MyST links that jump straight to the relevant module page.

Before (origin/main):

[Screenshot: before]

After (this PR):

[Screenshot: after]

Why

When #1782 converted Sphinx reST roles (:class:, :func:, :meth:, ...) to plain double-backticks under jupyter-book 2's MyST renderer, all the symbol cross-references in our docstrings stopped being clickable. Plain backticks are still the right human-readable default - contributors shouldn't have to learn MyST link syntax - but the rendered API pages have been a sea of un-navigable inline code spans ever since.

This PR restores the cross-references at render time, so source stays clean.

What

1. Auto-linker in build_scripts/gen_api_md.py

  • Every class / function / method heading now gets an explicit, FQN-scoped MyST label, e.g. (api-pyrit_prompt_target-PromptTarget)=. This avoids ambiguity where short names like Scorer or Score collide across modules.
  • Built a global symbol index over the post-alias-resolution module tree (short name, Class.method, fully-qualified path).
  • New post-processor rewrites docstring text / parameter descriptions / returns / raises so backtick code spans whose contents unambiguously resolve to a PyRIT symbol become real MyST links - SeedPrompt -> [`SeedPrompt`](#api-pyrit_models-SeedPrompt).
  • The auto-linker handles bare names (Scorer), Class.method (PromptTarget.send_prompt_async), fully-qualified names (pyrit.models.SeedPrompt), tilde/dot-prefixed forms left over from reST conversions, and class-scoped method references (a bare send_prompt_async inside PromptTarget's docstring links to the right method).
  • Ambiguous short names are skipped - the doc stays as a plain code span. Fenced code blocks and existing MyST links are preserved verbatim.
  • The API index page now links each preview symbol directly to its anchor.

2. Cleaned up the 13 leftover reST roles that #1782 missed

The leftover roles were in cli_helpers.py, scorer_metrics.py, pyrit_scan.py, and tree_of_attacks.py. They are now plain double-backticks, so the new auto-linker can take it from here.

3. Pre-commit guard

build_scripts/check_no_rest_roles.py + a new check-no-rest-roles local hook in .pre-commit-config.yaml reject any newly-introduced :class: / :func: / :meth: / :mod: / :attr: / :data: / :exc: / :obj: / :ref: / :py:*: role in pyrit/. Error message points contributors at the new convention.
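
A guard of this kind can be sketched in a few lines. The following is an illustrative approximation, not the contents of build_scripts/check_no_rest_roles.py: the regular expression, the function names find_roles and main, and the error wording are assumptions. It also assumes a role only counts when it is immediately followed by a backtick.

```python
import re
from pathlib import Path

# Matches the role prefixes listed above (including :py:*: variants) when
# they are immediately followed by an opening backtick.
ROLE = re.compile(r":(?:py:\w+|class|func|meth|mod|attr|data|exc|obj|ref):`")


def find_roles(text: str) -> list[tuple[int, str]]:
    """Return (line number, matched role prefix) for every reST role in text."""
    hits: list[tuple[int, str]] = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in ROLE.finditer(line):
            hits.append((lineno, match.group(0)))
    return hits


def main(paths: list[str]) -> int:
    """Print one message per offending role; return 1 if any were found."""
    failed = False
    for path in paths:
        text = Path(path).read_text(encoding="utf-8")
        for lineno, role in find_roles(text):
            print(f"{path}:{lineno}: found reST role {role!r}; use plain double-backticks")
            failed = True
    return 1 if failed else 0
```

A pre-commit local hook would typically pass the staged file names as arguments, so a script entry point could call `sys.exit(main(sys.argv[1:]))`; a non-zero exit code is what makes pre-commit reject the commit.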

4. Style guide refresh

.github/instructions/style-guide.instructions.md now describes the auto-linker, mentions the guard, and explains what gets auto-linked vs. left as plain code-spans.

Tests

24 new unit tests in tests/unit/build_scripts/:

  • test_gen_api_md.py (18 tests) covers anchor helpers, the symbol index (classes/functions/methods, private skipping, ambiguous duplicates), the rewriter (single/double backticks, Class.method, FQN, current-class context, ambiguous skip, unknown-name passthrough, fenced-block protection, idempotent existing-link handling, tilde/dot prefix, empty-text passthrough), the _process_docstring_text doctest-fence ordering, and render_function end-to-end (anchor emission, all four docstring fields rewritten, method-scoped anchors with current_class context).
  • test_check_no_rest_roles.py (6 tests) covers the pre-commit guard CLI.

All 8,096 unit tests pass. Pre-commit clean (ruff format, ruff check, ty type-check all green).

Validation

  • make docs-build (i.e. jupyter-book build --all --html --strict) succeeds.
  • Spot-checked rendered HTML: anchor labels show up as id="api-pyrit-prompt-target-prompttarget" and the cross-reference links resolve to the right headings (mystmd normalises labels to lowercase-kebab).
  • No new strict-mode warnings introduced; only pre-existing ones (heading-depth gaps, jupytext extra keys, gutter grid option, the legacy-target warning in hf_aml_model_endpoint_guide.md) remain.
  • The first run produced 98 cross-reference links across all API pages, restoring the linking intent of the docstrings without changing their source.
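
The label normalisation noted above can be approximated for spot-checking purposes. The function below is a guess that reproduces the single example in this description (lowercase, with runs of non-alphanumeric characters collapsed to hyphens). It is not mystmd's actual algorithm, and normalize_label is a hypothetical name.

```python
import re


def normalize_label(label: str) -> str:
    """Approximate the lowercase-kebab form seen in rendered HTML ids."""
    lowered = label.lower()
    # Collapse underscores, dots, and other separators into single hyphens.
    return re.sub(r"[^a-z0-9]+", "-", lowered).strip("-")


print(normalize_label("api-pyrit_prompt_target-PromptTarget"))
```

A helper like this could let a test predict the `id` attribute to search for in the built HTML, assuming the observed normalisation holds for other labels too.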

Out of scope

romanlutz and others added 3 commits May 27, 2026 10:09
When PR microsoft#1782 converted Sphinx reST roles to plain double-backticks under
jupyter-book 2's MyST renderer, all the symbol cross-references in our
docstrings stopped being clickable.  This restores them without forcing
contributors to learn MyST link syntax.

Changes:
* build_scripts/gen_api_md.py now emits an explicit, FQN-scoped MyST
  label before every class, function, and method heading (e.g.
  (api-pyrit_prompt_target-PromptTarget)=) and post-processes every
  docstring text, parameter description, return description, and raises
  description: backtick code spans whose contents unambiguously resolve
  to a PyRIT class, function, or method become MyST links to the right
  anchor.  Ambiguous short names and fenced code blocks are left alone.
  The API index page now links each preview symbol to its anchor too.
* Cleaned up 13 leftover Sphinx reST roles in pyrit/ that PR microsoft#1782
  missed (cli_helpers, scorer_metrics, pyrit_scan, tree_of_attacks).
* Added build_scripts/check_no_rest_roles.py plus a pre-commit hook so
  newly introduced :class: / :func: / :meth: / etc. roles are
  rejected before landing.
* Updated the style guide to describe the auto-linker behaviour and
  point at the new pre-commit guard.
* 21 new unit tests in tests/unit/build_scripts/ cover the rewriter
  (single/double backticks, Class.method, FQN, current-class context,
  ambiguous skip, fenced-block protection, existing-link idempotency,
  tilde/dot prefix) and the pre-commit guard.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Three extra unit tests on the auto-linker integration path:

* `test_process_docstring_text_protects_doctest_examples` pins the
  order I had to swap mid-implementation: `_escape_docstring_examples`
  must run before `_rewrite_symbol_refs` so a known PyRIT symbol
  appearing inside a `>>>` doctest example stays as raw text (otherwise
  the code sample would render as broken markdown).

* `test_render_function_emits_anchor_and_links_docstring_fields` plus
  `test_render_function_uses_method_anchor_when_class_name_given` are
  end-to-end smoke tests on `render_function`: they assert the
  `(api-...)=` label is emitted with the right scoping (module vs.
  method) and that every docstring field (text, params, returns, raises)
  goes through the rewriter so a regression in any of those four code
  paths fails the build.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
romanlutz added a commit to romanlutz/PyRIT that referenced this pull request May 28, 2026
…icks

The `check-no-rest-roles` pre-commit hook added in microsoft#1823
flags four `:meth:model_dump` / `:meth:model_validate` references in
`pyrit/models/conversation_reference.py` and `pyrit/models/retry_event.py`
that landed via PR microsoft#1769 before the hook existed. Replace them with plain
double-backticks so the hook passes cleanly on this stacked branch and the
deprecation notices render as readable code spans under MyST instead of
literal ``:meth:`name` `` text.

`model_dump` / `model_validate` are Pydantic methods, not PyRIT API, so
the auto-linker leaves them as plain code spans (correct: there is nothing
in our docs to link them to).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Comment thread build_scripts/check_no_rest_roles.py Outdated
@hannahwestra25 hannahwestra25 self-assigned this May 28, 2026
@romanlutz romanlutz added this pull request to the merge queue Jun 1, 2026
Merged via the queue into microsoft:main with commit bf95fcb Jun 1, 2026
48 checks passed
@romanlutz romanlutz deleted the romanlutz/myst-cross-refs-audit branch June 1, 2026 15:25
romanlutz added a commit to romanlutz/PyRIT that referenced this pull request Jun 2, 2026
…kticks

The typing modernization sweep in this PR rewrites `Optional[str]` to
`str | None` in `pyrit/models/conversation_reference.py` and
`pyrit/models/retry_event.py`. That puts both files in the changed-files
set CI feeds to `pre-commit run --from-ref origin/main --to-ref HEAD`,
which surfaces four pre-existing `:meth:` reST roles (landed via microsoft#1769
before the `check-no-rest-roles` hook from microsoft#1823 existed).

Replace `:meth:model_dump` and `:meth:model_validate` with plain
double-backticks so the hook passes. `model_dump` / `model_validate`
are Pydantic methods, so the auto-linker in `build_scripts/gen_api_md.py`
correctly leaves them as plain code spans.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@romanlutz romanlutz linked an issue Jun 4, 2026 that may be closed by this pull request
6 tasks

Development

Successfully merging this pull request may close these issues.

DOC Website for PyRIT

2 participants