FEAT: Define GCG extension protocols (typing surface only) by romanlutz · Pull Request #1861 · microsoft/PyRIT

romanlutz · 2026-05-30T04:05:50Z

Summary

Adds pyrit/auxiliary_attacks/gcg/extension_protocols.py containing four runtime_checkable Protocol classes that mark the algorithmic seams in the GCG optimization loop where a future caller may substitute custom behavior:

SamplingStrategy — how candidate suffix token sequences are proposed from the gradient. Current implementation: top-k by -grad, uniform pick within top-k (GCGPromptManager.sample_control).
LossFunction — how candidate suffixes are scored against the target. Current implementation: weighted cross-entropy on target + control slices.
CandidateFilter — how proposed candidates get pruned before evaluation. Current implementation: drops candidates whose decoded string re-tokenizes to a different token count (MultiPromptAttack.get_filtered_cands).
SuffixInitializer — how the initial suffix string is constructed. Current implementation: literal string from GCGAlgorithmConfig.control_init.
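
As a rough illustration of the sampling seam described above (top-k by -grad, then a uniform pick within the top-k), here is a minimal pure-Python sketch. It is not the code in GCGPromptManager.sample_control: the function name, the list-of-lists gradient layout, and the random choice of position are assumptions made for readability, and the real code operates on torch tensors.

```python
import random


def sample_candidates(control_tokens, grad, batch_size, top_k, rng):
    """Propose candidate suffixes by swapping one token per candidate.

    grad[i][v] is assumed to hold the loss gradient for placing vocabulary
    token v at suffix position i (hypothetical layout for this sketch).
    """
    candidates = []
    for _ in range(batch_size):
        # Assumption: the position to edit is drawn at random.
        position = rng.randrange(len(control_tokens))
        # Top-k by -grad: the k tokens with the most negative gradient.
        ranked = sorted(range(len(grad[position])), key=lambda v: grad[position][v])
        # Uniform pick within the top-k.
        replacement = rng.choice(ranked[:top_k])
        candidate = list(control_tokens)
        candidate[position] = replacement
        candidates.append(candidate)
    return candidates


control = [0, 0]
gradient = [[0.5, -1.0, 0.2, -0.3], [0.1, 0.4, -2.0, 0.0]]
proposals = sample_candidates(control, gradient, batch_size=8, top_k=2, rng=random.Random(0))
```

Each proposal differs from the current suffix in one position, which is the property a custom SamplingStrategy would be free to change.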

Each protocol has a Google-style docstring with Args: / Returns: and a References: section pointing at the symbols in gcg_attack.py / attack_manager.py it abstracts.
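
A protocol of this shape might look roughly like the sketch below. The method name compute_loss comes from the PR, but the parameter list, the docstring wording, and the ConstantLoss stub are hypothetical. The annotations are quoted and torch is imported only under TYPE_CHECKING, so the sketch runs without torch installed.

```python
from typing import TYPE_CHECKING, Protocol, runtime_checkable

if TYPE_CHECKING:
    import torch  # seen by type checkers only; never imported at runtime


@runtime_checkable
class LossFunction(Protocol):
    """Scores candidate suffixes against the target."""

    def compute_loss(self, *, logits: "torch.Tensor", token_ids: "torch.Tensor") -> "torch.Tensor":
        """Compute one loss value per candidate.

        Args:
            logits: Model outputs for each candidate (hypothetical parameter).
            token_ids: Token ids of the full prompt (hypothetical parameter).

        Returns:
            A tensor holding one loss value per candidate.

        References:
            The weighted target/control cross-entropy in the legacy attack code.
        """
        ...


class ConstantLoss:
    """Trivial stub: it has the right method name, so it satisfies the protocol."""

    def compute_loss(self, *, logits, token_ids):
        return 0.0


is_loss_function = isinstance(ConstantLoss(), LossFunction)
```

ConstantLoss never inherits from LossFunction; the match is purely structural.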

What this PR is not

Pure typing surface — zero behavior change, zero wiring. No concrete implementations of the protocols, no new fields on GCGAlgorithmConfig, no dispatch in GCGMultiPromptAttack. The protocols are exposed so users can implement them; consuming them is left for follow-up work.

Design notes

Protocol parameter names use fully spelled-out tokens (control_tokens, non_ascii_tokens, top_k, temperature, token_ids) rather than mirroring the legacy abbreviations.
The module uses from __future__ import annotations plus a TYPE_CHECKING import for torch so it imports cleanly on installs that only have the base dev extra (no torch), preserving the invariant introduced by commit 36aaaa31 from FEAT: GCG public API - GCG + GCGConfig + ExperimentalWarning, shifts module to experimental status #1792.
All four protocols are re-exported from pyrit.auxiliary_attacks.gcg via the existing PEP 562 _LAZY_IMPORTS pathway so the public surface stays consistent with how GCG / GCGGenerator / GCGContext / GCGResult are exposed.
LossFunction owns its entire loss computation (criterion choice, slicing, and any weighted combination of target/control terms). The current target_weight / control_weight knobs on GCGAlgorithmConfig keep working unchanged.
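
The lazy re-export pathway mentioned above relies on PEP 562, which lets a module define `__getattr__` to resolve names on first access. The sketch below shows the mechanism only: the mapping contents and the build_lazy_package helper are hypothetical, and standard-library modules stand in for the real submodules. In an actual package, `__getattr__` would simply be defined at the top level of `__init__.py`.

```python
import importlib
import types

# Hypothetical mapping of exported name -> module that defines it.
_LAZY_IMPORTS = {
    "OrderedDict": "collections",
    "dataclass": "dataclasses",
}


def build_lazy_package(name: str, lazy_imports: dict) -> types.ModuleType:
    """Create a module object whose attributes are imported on first access."""
    package = types.ModuleType(name)

    def __getattr__(attribute: str) -> object:
        if attribute not in lazy_imports:
            raise AttributeError(f"module {name!r} has no attribute {attribute!r}")
        # Import the defining module only when the name is first requested.
        module = importlib.import_module(lazy_imports[attribute])
        value = getattr(module, attribute)
        # Cache on the package so later lookups bypass __getattr__.
        setattr(package, attribute, value)
        return value

    # PEP 562: a module-level __getattr__ handles otherwise-missing attributes.
    package.__getattr__ = __getattr__
    return package


demo_package = build_lazy_package("demo_package", _LAZY_IMPORTS)
lazy_ordered_dict = demo_package.OrderedDict
```

Because the resolved object is returned unchanged, a name imported from the package root is the same object as the one in its defining module.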

Drive-by cleanups (in response to review feedback)

GCGAlgorithmConfig.temp is widened from int = 1 to float = 1.0 and the same change is propagated to GCGPromptManager.sample_control, GCGMultiPromptAttack.step, and MultiPromptAttack.run — the other two strategy run overloads were already float, so this also resolves a pre-existing inconsistency. Module is experimental, no deprecation cycle.
Seven Sphinx reST cross-reference roles (:class:, :meth:, :func:) in config.py are replaced with plain double-backtick code spans, since PyRIT renders docstrings with MyST and the check-no-rest-roles pre-commit hook now blocks them.
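
For illustration, the docstring rewrite in the second item could be reproduced mechanically with a regular expression like the one below. This is a hypothetical helper, not the check-no-rest-roles hook itself, and it covers only the three roles named above.

```python
import re

# Matches :class:`Name`, :meth:`Name`, and :func:`Name`, with an optional
# leading "~" inside the backticks.
_REST_ROLE = re.compile(r":(?:class|meth|func):`~?([^`]+)`")


def replace_rest_roles(docstring: str) -> str:
    """Rewrite reST cross-reference roles as plain double-backtick code spans."""
    return _REST_ROLE.sub(r"``\1``", docstring)


converted = replace_rest_roles("Wraps :class:`GCGGenerator` behind :class:`GCG`.")
```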

Tests

New tests/unit/auxiliary_attacks/gcg/test_extension_protocols.py with 19 parametrized test instances covering:

Module __all__ contents.
Package re-export identity (each name imported from the package root is the same object as the one in extension_protocols).
Each protocol is @runtime_checkable.
Each protocol accepts a minimal in-test concrete implementation via isinstance(impl, ProtocolName).
A class missing every method fails the isinstance check for each protocol (catches an accidentally renamed or removed protocol method in future PRs; runtime isinstance checks verify method presence, not signatures).
One return-shape smoke test per protocol with a trivial stub implementation.

Gated with pytest.importorskip("torch") since the stubs construct real torch.Tensor arguments for the shape assertions.
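
The positive and negative isinstance checks listed above can be sketched without torch as follows. The zero-argument make_initial_suffix signature and the stub classes are assumptions for this example, not the contents of test_extension_protocols.py. The last stub shows a general property of runtime_checkable: only the presence of the method is checked, so a mismatched signature still passes.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class SuffixInitializer(Protocol):
    # Hypothetical signature; the PR only names the method.
    def make_initial_suffix(self) -> str: ...


class BangSuffix:
    """Minimal conforming stub."""

    def make_initial_suffix(self) -> str:
        return " ".join(["!"] * 20)


class MissingMethod:
    """Has none of the protocol's methods."""


class WrongSignature:
    """Right method name, wrong parameters and return type."""

    def make_initial_suffix(self, unexpected: int) -> int:
        return unexpected


checks = {
    "conforming": isinstance(BangSuffix(), SuffixInitializer),
    "missing": isinstance(MissingMethod(), SuffixInitializer),
    "wrong_signature": isinstance(WrongSignature(), SuffixInitializer),
}
```

Signature mismatches like the one in WrongSignature are the job of a static type checker rather than of isinstance.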

Full GCG unit suite still passes: 133/133 in tests/unit/auxiliary_attacks/gcg/.

Adds pyrit/auxiliary_attacks/gcg/extension_protocols.py containing four runtime_checkable Protocol classes that mark the algorithmic seams in the GCG optimization loop where a future caller may substitute custom behavior:

- SamplingStrategy.sample_candidates: abstracts GCGPromptManager.sample_control
- LossFunction.compute_loss: abstracts the weighted target/control CE
- CandidateFilter.filter_candidates: abstracts MultiPromptAttack.get_filtered_cands
- SuffixInitializer.make_initial_suffix: abstracts the literal control_init plumbing

This PR is pure typing surface: no concrete implementations, no defaults, no wiring into GCGAlgorithmConfig or GCGMultiPromptAttack. The default implementations (extracted byte-for-byte from current attack code with a parity gate) and the optional config fields that select between defaults and custom impls land in follow-up PRs.

The module uses `from __future__ import annotations` plus a TYPE_CHECKING import for torch so it imports cleanly on the base `dev` extra (no torch), preserving the invariant added by commit 36aaaa3 in Sub-PR A.

All four Protocols are re-exported from pyrit.auxiliary_attacks.gcg via the existing PEP 562 _LAZY_IMPORTS pathway so the public surface is consistent with how GCG / GCGGenerator / GCGContext / GCGResult are exposed.

Tests cover module `__all__`, package re-export identity, runtime_checkable positive and negative isinstance, and a return-shape smoke test per protocol with a trivial in-test stub implementation. `pytest.importorskip("torch")` gates the whole file because the stubs construct real `torch.Tensor` arguments for the shape assertions.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Renames in pyrit/auxiliary_attacks/gcg/extension_protocols.py and the corresponding test stubs:

- control_toks -> control_tokens
- candidate_toks -> candidate_tokens
- nonascii_toks -> non_ascii_tokens (mirrors allow_non_ascii)
- topk -> top_k
- temp -> temperature
- control_len -> control_length (docstring shape annotations)

The legacy GCGAlgorithmConfig fields (topk, temp) and the legacy attack code (GCGPromptManager.sample_control, get_filtered_cands) keep their existing names. Renaming those is a separate API change that belongs in the B3 wiring PR (where GCGAlgorithmConfig is extended anyway).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Completes the parameter-name spell-out pass on the four extension protocols (previous commit handled SamplingStrategy / CandidateFilter). `ids` is a common ML shorthand but `token_ids` is unambiguous and consistent with the other tokens-* parameters in the same module. Descriptive uses of the word `ids` in surrounding docstring prose are left as-is since they read naturally.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

…on-protocols

@rlundeen2

Per @rlundeen2's review on PR microsoft#1861:

1. Replace `References:` blocks that cited line ranges in `gcg_attack.py` / `attack_manager.py` with symbol-only references. Line numbers drift the moment the legacy attack code is touched (B3 wiring will do exactly that); symbol names are stable across the refactors that follow.
2. Re-type `SamplingStrategy.sample_candidates(temperature=)` as `float` instead of `int`. The protocol is a brand-new surface and was previously mirroring the legacy `GCGAlgorithmConfig.temp: int = 1` field for no good reason; sampling temperatures are conceptually continuous. The legacy field stays as-is; B3 wiring owns deciding whether to widen it or coerce at the boundary.

The stub used by the runtime-checkable tests is updated to match, and the shape-smoke test now passes `temperature=1.0`.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

GCGAlgorithmConfig.temp goes from `int = 1` to `float = 1.0`. The matching parameter on the three downstream methods that still typed it as `int` is widened too:

- GCGPromptManager.sample_control
- GCGMultiPromptAttack.step
- MultiPromptAttack.run

The other two strategy `run` overloads (ProgressiveMultiPromptAttack, IndividualPromptAttack) were already `float = 1.0`, so the pre-existing inconsistency is now resolved. Sampling temperature is conceptually continuous; typing it as int in a brand-new public-API field made no sense. The module is experimental, no deprecation cycle owed.

Also updates the SamplingStrategy protocol docstring to drop the stale "kept for API compatibility with the legacy code path" framing in favour of a description of why the parameter exists (the default sampler ignores it, but custom strategies that want softmax weighting receive it).

While here, replace seven Sphinx reST cross-reference roles (`:class:...`, `:meth:...`, `:func:...`) in `config.py` with plain double-backtick code spans. PyRIT renders docstrings with MyST, not Sphinx; these roles show up as raw literal text in the built docs and are now blocked by the `check-no-rest-roles` pre-commit hook.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

…on-protocols

The `check-no-rest-roles` pre-commit hook blocks `:class:Foo` patterns; PyRIT renders docstrings with MyST, not Sphinx, so those roles appear as raw literal text in the built docs. Two `:class:...` roles in the module-level docstring (`GCG`, `GCGGenerator`, `PromptGeneratorStrategy`) are replaced with plain double-backtick code spans per the documented convention.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

romanlutz and others added 4 commits May 29, 2026 16:44

Merge remote-tracking branch 'origin/main' into romanlutz/gcg-extensi…

4b135da

…on-protocols

rlundeen2 self-assigned this Jun 2, 2026

rlundeen2 reviewed Jun 2, 2026

Comment thread pyrit/auxiliary_attacks/gcg/extension_protocols.py Outdated

rlundeen2 reviewed Jun 2, 2026

Comment thread tests/unit/auxiliary_attacks/gcg/test_extension_protocols.py Outdated

rlundeen2 approved these changes Jun 2, 2026

romanlutz and others added 4 commits June 2, 2026 15:18

Merge remote-tracking branch 'origin/main' into romanlutz/gcg-extensi…

9ccc7b4

…on-protocols

romanlutz enabled auto-merge June 2, 2026 23:04

romanlutz added this pull request to the merge queue Jun 2, 2026

Merged via the queue into microsoft:main with commit f58a218 Jun 2, 2026
52 checks passed

romanlutz deleted the romanlutz/gcg-extension-protocols branch June 2, 2026 23:34

romanlutz mentioned this pull request Jun 3, 2026

FEAT: Add default implementations of GCG extension protocols #1902

Open

romanlutz merged 8 commits into microsoft:main from romanlutz:romanlutz/gcg-extension-protocols

2 participants
