FEAT: Better Scenario Tracking by rlundeen2 · Pull Request #1758 · microsoft/PyRIT

rlundeen2 · 2026-05-19T05:26:09Z

Previously, if a Scenario was interrupted mid-AtomicAttack (Ctrl-C, OOM, crash), the completed AttackResults already persisted to the DB became orphaned, because the scenario-to-attack-result link lived only in a JSON manifest (attack_results_json) written after the whole AtomicAttack returned. On resume, those objectives were re-executed needlessly.

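This change makes scenario linkage a first-class column on AttackResultEntry, so the link is recorded when each AttackResult is persisted rather than after the whole AtomicAttack returns. Resume can then reuse more of the completed results, and progress can be tracked more accurately.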

- New columns: scenario_result_id (indexed FK, ON DELETE SET NULL) and scenario_data (JSON with fixed schema {atomic_attack_name, objective_index}).
- New ExecutionAttribution dataclass in pyrit/executor/attack/core/ (so the executor never imports from the scenario layer) is set on AttackContext by AttackExecutor per-task before scheduling, and read by the default attack event handler when persisting.
- Hydration in get_scenario_results uses the FK with a merge-mode fallback to the legacy manifest for partially-migrated DBs.
- Resume uses objective_index (deterministic, parallel-safe; derived from seed_groups input_indices) rather than objective text, so duplicate objective text doesn't collapse two seed groups.
- Drops the unreleased error_attack_result_ids_json column outright; error AttackResults are now linkable via get_attack_results(scenario_result_id=..., outcome=ERROR).
- attack_results_json stays write-through this release for downgrade safety; future releases will stop populating and then drop.
- update_scenario_run_state becomes a targeted UPDATE rather than a full row rebuild (so it doesn't clobber the manifest during the deprecation window).
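The difference between resuming by objective text and resuming by objective_index can be illustrated with a minimal sketch. `SeedGroup`, `remaining_by_text`, and `remaining_by_index` are hypothetical stand-ins written for this example; they are not PyRIT's actual classes or functions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SeedGroup:
    """Hypothetical stand-in for a seed group; only the objective text matters here."""

    objective: str


def remaining_by_text(seed_groups, completed_objectives):
    """Text-based filter: drops every seed group whose objective text was already seen."""
    return [sg for sg in seed_groups if sg.objective not in completed_objectives]


def remaining_by_index(seed_groups, completed_indices):
    """Index-based filter: keeps (original_index, seed_group) pairs not yet completed."""
    return [(i, sg) for i, sg in enumerate(seed_groups) if i not in completed_indices]


# Two seed groups share the same objective text.
seed_groups = [SeedGroup("explain X"), SeedGroup("explain X"), SeedGroup("explain Y")]

# Assume only the first seed group (index 0) finished before the interruption.
by_text = remaining_by_text(seed_groups, completed_objectives={"explain X"})
by_index = remaining_by_index(seed_groups, completed_indices={0})
```

In this sketch the text-based filter also drops the second "explain X" seed group even though it never ran, while the index-based filter keeps it along with its original position.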

Includes Alembic migration with idempotent backfill, scenario_data round-trip on AttackResultEntry, and tests for: event-handler attribution stamping, executor attribution propagation at max_concurrency>1, FK + manifest + mixed hydration paths, migration backfill correctness/idempotency/downgrade, interruption-recovery regression, duplicate-objective-text resume safety, and duplicate atomic_attack_name validation.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
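What an idempotent backfill means can be sketched with the standard-library sqlite3 module. This is not the Alembic migration from the PR. The table names, the manifest shape (a mapping from atomic attack name to an ordered list of attack result ids), and the choice to use list position as objective_index are all assumptions made for illustration.

```python
import json
import sqlite3


def backfill(conn: sqlite3.Connection) -> int:
    """Copy links from the JSON manifest onto attack rows; return rows updated."""
    updated = 0
    rows = conn.execute("SELECT id, attack_results_json FROM scenario_results").fetchall()
    for scenario_id, manifest_json in rows:
        manifest = json.loads(manifest_json or "{}")
        for name, result_ids in manifest.items():
            for index, result_id in enumerate(result_ids):
                data = json.dumps({"atomic_attack_name": name, "objective_index": index})
                # The IS NULL guard makes a second run a no-op (idempotency).
                cur = conn.execute(
                    "UPDATE attack_results SET scenario_result_id = ?, scenario_data = ? "
                    "WHERE id = ? AND scenario_result_id IS NULL",
                    (scenario_id, data, result_id),
                )
                updated += cur.rowcount
    conn.commit()
    return updated


# Hypothetical schema and data, only to exercise the sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scenario_results (id TEXT PRIMARY KEY, attack_results_json TEXT)")
conn.execute(
    "CREATE TABLE attack_results (id TEXT PRIMARY KEY, scenario_result_id TEXT, scenario_data TEXT)"
)
conn.execute(
    "INSERT INTO scenario_results VALUES (?, ?)",
    ("s1", json.dumps({"attack_a": ["a1", "a2"]})),
)
conn.executemany("INSERT INTO attack_results (id) VALUES (?)", [("a1",), ("a2",), ("a3",)])

first_pass = backfill(conn)
second_pass = backfill(conn)
```

Running the backfill twice updates rows only on the first pass, and a row the manifest never mentioned (`a3` here) keeps a NULL link.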

…lify hydration

- Delete dishonest no-op add_attack_results_to_scenario shim.
- Standardize on print_deprecation_message (drop ad-hoc warnings.warn). Style guide gains a concise Deprecations section with the `removed_in = current minor + 2` rule.
- Remove stale per-scenario state left over from the manifest era: AttackContext._error_attack_result_id, _StrategyRuntimeError.error_attack_result_id, Scenario._result_lock, Scenario._original_objectives_map, and the stray `import asyncio`. Replace defensive `getattr(context, '_attribution', None)` with direct attribute access, since the contract is mandatory.
- Rename ExecutionAttribution -> ScenarioExecutionAttribution (and the module file) to match its scenario-specific schema.
- Refactor MemoryInterface.get_scenario_results: split into _build_scenario_result_query_conditions, _query_scenario_result_entries, _hydrate_scenario_attack_results. The hydrator now issues a single batched IN-query on AttackResultEntry.scenario_result_id (fixes the previous N+1) and drops the legacy attack_results_json manifest fallback entirely; the FK is the sole source of truth.
- Narrow _stamp_attribution(result=) to AttackResult to satisfy ty.
- Update affected tests; rewrite the four hydration tests that incidentally relied on the manifest fallback to use the production FK write path.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
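The batched IN-query that replaces the N+1 pattern can be sketched with plain sqlite3. The `attack_results` table and the `hydrate` helper are hypothetical; the real hydrator works through PyRIT's memory layer, which this example does not reproduce.

```python
import sqlite3
from collections import defaultdict


def hydrate(conn, scenario_ids):
    """Return {scenario_id: [attack_result_id, ...]} using one IN-query for all scenarios."""
    if not scenario_ids:
        return {}
    placeholders = ", ".join("?" for _ in scenario_ids)
    query = (
        "SELECT scenario_result_id, id FROM attack_results "
        f"WHERE scenario_result_id IN ({placeholders}) ORDER BY id"
    )
    grouped = defaultdict(list)
    for scenario_id, attack_id in conn.execute(query, list(scenario_ids)):
        grouped[scenario_id].append(attack_id)
    # Scenarios with no linked results still get an (empty) entry.
    return {sid: grouped.get(sid, []) for sid in scenario_ids}


# Hypothetical data, only to exercise the sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE attack_results (id TEXT PRIMARY KEY, scenario_result_id TEXT)")
conn.executemany(
    "INSERT INTO attack_results VALUES (?, ?)",
    [("a1", "s1"), ("a2", "s1"), ("a3", "s2"), ("a4", None)],
)

hydrated = hydrate(conn, ["s1", "s2", "s3"])
```

One query serves every requested scenario, instead of one query per scenario, and grouping happens in Python afterwards.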

Bumps pyrit/scenario/core/atomic_attack.py coverage from 37% to 94% by exercising the three resume-critical surfaces the PR introduced or changed but that had no dedicated tests:

- TestAtomicAttackFilterSeedGroupsByIndices: stable-identity filter that drops completed seeds while preserving each survivor's original index across successive filter calls.
- TestAtomicAttackFilterSeedGroupsByObjectives: keeps the deprecated legacy path under test and asserts the DeprecationWarning fires until removed_in=0.16.0.
- TestAtomicAttackAttributionFactory: the closure built in run_async when _scenario_result_id is set. There is no factory outside a Scenario, the factory maps input_index -> original objective_index after filtering, and the snapshot is taken by value so post-call mutations cannot poison in-flight attributions.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
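The attribution-factory behavior these tests describe (mapping input_index to the original objective_index, with a by-value snapshot) might look roughly like the sketch below. `Attribution` and `make_attribution_factory` are hypothetical names for this example and do not mirror the actual closure in run_async.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class Attribution:
    """Hypothetical attribution record stamped onto each result."""

    scenario_result_id: str
    atomic_attack_name: str
    objective_index: int


def make_attribution_factory(
    scenario_result_id: str,
    atomic_attack_name: str,
    original_indices: Sequence[int],
) -> Callable[[int], Attribution]:
    """Build a closure mapping a post-filter input_index to its original objective_index."""
    # Copy by value so later mutation of the caller's list cannot affect the closure.
    snapshot = tuple(original_indices)

    def factory(input_index: int) -> Attribution:
        return Attribution(scenario_result_id, atomic_attack_name, snapshot[input_index])

    return factory


# Seed groups 0 and 2 already completed, so only original indices 1 and 3 remain.
surviving = [1, 3]
factory = make_attribution_factory("scenario-1", "attack_a", surviving)

# Mutating the caller's list after the call must not change in-flight attributions.
surviving.clear()
second = factory(1)
```

Because the closure holds a tuple copy, clearing `surviving` afterwards leaves the mapping intact.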

Decouple the attack persistence path from scenario vocabulary. The attack layer now ships an opaque attribution dataclass (parent_id, parent_collection, position); the scenario layer interprets those fields to mean (scenario_result_id, atomic_attack_name, objective_index).

- ScenarioExecutionAttribution -> AttackResultAttribution (renamed module and class)
- AttackResult.scenario_result_id / scenario_data -> attribution_parent_id / attribution_data
- AttackResultEntry columns, index, and foreign key constraint renamed; migration 9c8b7a6d5e4f rewritten in place (still unreleased on this branch)
- Replaced FK abbreviation with foreign key / ForeignKey in comments and docstrings

The DB foreign key still targets ScenarioResultEntries.id; that is a relational fact, not a layering violation. The attack layer has no scenario-specific identifiers in its type signatures.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
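The layering described here can be sketched as an opaque dataclass on the attack side plus an interpreting helper on the scenario side. `OpaqueAttribution` and `scenario_view` are hypothetical names for this example. Only the three field names and their scenario-level meanings come from the commit message.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OpaqueAttribution:
    """Attack-layer view: generic fields with no scenario vocabulary."""

    parent_id: str
    parent_collection: str
    position: int


def scenario_view(attribution: OpaqueAttribution) -> dict:
    """Scenario-layer view: interpret the generic fields in scenario terms."""
    return {
        "scenario_result_id": attribution.parent_id,
        "atomic_attack_name": attribution.parent_collection,
        "objective_index": attribution.position,
    }


attribution = OpaqueAttribution(parent_id="scenario-1", parent_collection="attack_a", position=2)
interpreted = scenario_view(attribution)
```

The attack layer only ever sees `parent_id`, `parent_collection`, and `position`, so it does not need to import anything from the scenario layer.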

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

…5_18_scenario_resume

# Conflicts:
#	pyrit/memory/memory_models.py
#	pyrit/scenario/core/atomic_attack.py
#	tests/unit/models/test_scenario_result.py

…lter

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

hannahwestra25

just small nits! looks good

…cation shim Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Adds a skip_cached: bool = False constructor parameter and a thin _get_atomic_attacks_async override on AdversarialBenchmark that, when enabled, filters out atomic-attack candidates whose (atomic_attack_name, technique_eval_hash) tuple appears in any prior COMPLETED ScenarioResult for the same scenario name + VERSION with outcome SUCCESS or FAILURE. ERROR and UNDETERMINED outcomes always retry. Caching is off by default to preserve existing behavior.

Built on the AttackResultAttribution primitives introduced in microsoft#1758:

- AtomicAttack.technique_eval_hash provides the candidate side of the cache key (content-derived via AtomicAttackEvaluationIdentifier).
- AttackResultEntry.attribution_data['parent_collection' + 'parent_eval_hash'] provides the persisted side; the executor stamps these per AttackResult, so two atomic attacks sharing a name but using different technique configurations don't cross-pollinate.

Defensive behavior:

- Missing attribution_data or missing parent_collection -> skip the row silently (treat as not-cached).
- Memory exceptions from get_scenario_results / get_attack_results -> log a warning and fall back to no filtering. Caching becomes a no-op rather than blocking the run.
- Scenarios in IN_PROGRESS / FAILED / CANCELLED state contribute nothing (no get_attack_results query made for them at all).
- Scenario name is matched on type(self).__name__ (PascalCase "AdversarialBenchmark"), aligned with how ScenarioIdentifier stores it; VERSION filter ensures the VERSION bump in the previous commit invalidates old VERSION=1 results for cache purposes (they remain queryable; they just don't suppress fresh runs).

Tests: 11 new unit tests (TestAdversarialBenchmarkSkipCachedFilter + TestAdversarialBenchmarkSkipCachedInit) covering filtering semantics, outcome filters, eval-hash disambiguation, scenario-state filter, query-arg shape, missing-attribution defense, memory-error defense, and constructor defaults.

Integration test with full persistence round-trip is a separate follow-up commit (F6.3 per plan). Wider regression: 1649/1649 pass across scenario+setup+registry+backend.

Failure mode flagged for the PR description batch:

- The override + helper are scenario-agnostic in shape and should probably live on base Scenario behind a duck-typed identity hook (e.g. cls.cache_scope_name() classmethod) so other scenarios (RapidResponse, Scam, etc.) can opt into skip_cached without copy-pasting the wrapper. Enhancement, not a bug; tracked as lift-skip-cached-to-base-scenario.
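The skip_cached filtering rules can be sketched as a set lookup on (name, eval hash) keys. `PriorResult`, `cached_keys`, and `filter_candidates` are hypothetical and flatten scenario state onto each row for brevity. The described implementation instead skips non-COMPLETED scenarios before querying their attack results.

```python
from dataclasses import dataclass
from typing import Optional

# Only these outcomes suppress a rerun; ERROR and UNDETERMINED always retry.
CACHEABLE_OUTCOMES = {"SUCCESS", "FAILURE"}


@dataclass(frozen=True)
class PriorResult:
    """Hypothetical flattened view of one persisted attack result."""

    scenario_state: str
    outcome: str
    attribution_data: Optional[dict]


def cached_keys(prior_results):
    """Collect (parent_collection, parent_eval_hash) keys that count as cached."""
    keys = set()
    for result in prior_results:
        if result.scenario_state != "COMPLETED" or result.outcome not in CACHEABLE_OUTCOMES:
            continue
        data = result.attribution_data or {}
        name = data.get("parent_collection")
        if name is None:
            continue  # missing attribution: treat as not cached
        keys.add((name, data.get("parent_eval_hash")))
    return keys


def filter_candidates(candidates, prior_results):
    """Keep only (atomic_attack_name, technique_eval_hash) candidates with no cached match."""
    keys = cached_keys(prior_results)
    return [candidate for candidate in candidates if candidate not in keys]


prior = [
    PriorResult("COMPLETED", "SUCCESS", {"parent_collection": "attack_a", "parent_eval_hash": "h1"}),
    PriorResult("COMPLETED", "ERROR", {"parent_collection": "attack_b", "parent_eval_hash": "h2"}),
    PriorResult("IN_PROGRESS", "SUCCESS", {"parent_collection": "attack_c", "parent_eval_hash": "h3"}),
    PriorResult("COMPLETED", "FAILURE", None),
]
candidates = [("attack_a", "h1"), ("attack_a", "h9"), ("attack_b", "h2"), ("attack_c", "h3")]
to_run = filter_candidates(candidates, prior)
```

Only `("attack_a", "h1")` is skipped in this example. The same name with a different eval hash, the ERROR outcome, and the result from an IN_PROGRESS scenario all still run.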

rlundeen2 commented May 19, 2026

Comment thread pyrit/memory/memory_interface.py Outdated

rlundeen2 marked this pull request as ready for review May 19, 2026 20:15

hannahwestra25 reviewed May 19, 2026

Comment thread pyrit/executor/attack/core/attack_strategy.py Outdated

hannahwestra25 reviewed May 19, 2026

Comment thread pyrit/scenario/core/scenario.py Outdated

rlundeen2 mentioned this pull request May 19, 2026

FEAT text adaptive scenario #1760

Merged

hannahwestra25 reviewed May 20, 2026

Comment thread pyrit/executor/attack/core/attack_executor.py Outdated

hannahwestra25 reviewed May 20, 2026

Comment thread pyrit/executor/attack/core/attack_result_attribution.py Outdated

hannahwestra25 reviewed May 20, 2026

Comment thread pyrit/executor/attack/core/attack_result_attribution.py Outdated

hannahwestra25 reviewed May 20, 2026

Comment thread pyrit/memory/alembic/versions/9c8b7a6d5e4f_add_attribution_to_attack_results.py

hannahwestra25 reviewed May 20, 2026

Comment thread pyrit/scenario/core/scenario.py Outdated

hannahwestra25 reviewed May 20, 2026

Comment thread pyrit/executor/attack/core/attack_strategy.py Outdated

rlundeen2 and others added 4 commits May 20, 2026 10:20

FEAT: hash-based scenario resume keys

547dff6

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

MAINT: address PR review nits

ead1d1c

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Merge remote-tracking branch 'origin/main' into users/rlundeen/2026_0…

36e4af3

Rename _hydrate_scenario_attack_results, drop deprecated objective fi…

452a383

hannahwestra25 reviewed May 20, 2026

Comment thread pyrit/executor/attack/core/attack_result_attribution.py Outdated

hannahwestra25 reviewed May 20, 2026

Comment thread pyrit/scenario/core/atomic_attack.py Outdated

hannahwestra25 reviewed May 20, 2026

Comment thread pyrit/scenario/core/atomic_attack.py

rlundeen2 and others added 2 commits May 20, 2026 14:01

scope scenario resume by technique eval hash

5d907b6

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

restore filter_seed_groups_by_objectives as deprecated shim

a0a5b71

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

hannahwestra25 approved these changes May 20, 2026

add tests for resume hash filter, eval-hash disambiguation, and depre…

e75bc0c

rlundeen2 added this pull request to the merge queue May 20, 2026

Merged via the queue into microsoft:main with commit 044b50f May 20, 2026
48 checks passed

rlundeen2 deleted the users/rlundeen/2026_05_18_scenario_resume branch May 20, 2026 22:39

ValbuenaVC mentioned this pull request May 21, 2026

FEAT: Adversarial Benchmark Scenario Refactor #1765

Merged

FEAT: Better Scenario Tracking #1758
rlundeen2 merged 11 commits into microsoft:main from rlundeen2:users/rlundeen/2026_05_18_scenario_resume

2 participants
