FIX: Integration test fixes #1897

Merged: ValbuenaVC merged 9 commits into microsoft:main from ValbuenaVC:integration_test_fixes on Jun 3, 2026

@ValbuenaVC ValbuenaVC commented Jun 2, 2026

Fixes to failing integration tests.

Description

  • test_translation_converter_exponential_backoff_timing — MockPromptTarget crash before retry loop (tests/integration/mocks.py): A recent refactor tightened MessagePiece.labels to non-optional dict[str, Any]. The integration mock's set_system_prompt was passing labels=None directly to MessagePiece(...), causing Pydantic validation to fail before send_prompt_async was ever called. Fixed with a one-line labels=labels or {} guard, matching what the production base class already does.

  • test_initialize_loads_datasets_into_memory — empty AttackTechniqueRegistry (tests/integration/datasets/test_load_default_datasets_integration.py): LoadDefaultDatasets.initialize_async() calls ScenarioRegistry.list_metadata(), which instantiates every registered scenario class including Cyber. Cyber.__init__ builds its strategy enum from AttackTechniqueRegistry, raising RuntimeError when the registry is empty. The test was calling initialize_pyrit_async but not ScenarioTechniqueInitializer, which is the initializer responsible for populating the registry. Added await ScenarioTechniqueInitializer().initialize_async() before the LoadDefaultDatasets call.

  • test_harmbench_metadata_parses_correctly — wrong size assertion (tests/integration/datasets/test_seed_dataset_provider_integration.py): The test asserted metadata.size == {"large"} but _HarmBenchDataset._parse_metadata() returns {"medium"}. Updated the assertion to match the actual metadata.

  • test_execute_notebooks[0_scenarios.ipynb] and related scenario notebooks — empty AttackTechniqueRegistry (doc/code/scenarios/0_scenarios.py, 1_common_scenario_parameters.py, 2_custom_scenario_parameters.py): Same root cause as the LoadDefaultDatasets test above. Each notebook calls initialize_pyrit_async (or initialize_from_config_async) but not ScenarioTechniqueInitializer, so the registry is empty when list_scenarios_async() or scenario constructors are invoked. Added a single await ScenarioTechniqueInitializer().initialize_async() at the top of each notebook's setup cell. Also added missing initialize_pyrit_async to 2_custom_scenario_parameters.py which had no memory setup at all.
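
The labels guard from the first bullet can be illustrated with a minimal, standard-library-only sketch. PieceSketch and build_system_piece are hypothetical stand-ins, not PyRIT's MessagePiece or the mock's set_system_prompt; the only behavior assumed is that the piece rejects labels=None, as the description reports.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class PieceSketch:
    """Hypothetical stand-in for MessagePiece: labels must be a real dict."""

    role: str
    value: str
    labels: dict[str, Any]

    def __post_init__(self) -> None:
        # Mimic strict validation: reject None instead of silently accepting it.
        if not isinstance(self.labels, dict):
            raise TypeError(f"labels must be a dict, got {type(self.labels).__name__}")


def build_system_piece(
    system_prompt: str, labels: Optional[dict[str, Any]] = None
) -> PieceSketch:
    """Hypothetical helper showing the guard; callers may still pass None."""
    # `labels or {}` turns None (or any other falsy value) into an empty dict
    # before the piece is constructed, so validation never sees None.
    return PieceSketch(role="system", value=system_prompt, labels=labels or {})
```

With the guard in place, build_system_piece("...") succeeds with empty labels, while constructing PieceSketch directly with labels=None still raises. One side effect to keep in mind: an empty dict passed by the caller is also replaced by a fresh empty dict, which is harmless here.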

Tests and Documentation

Updates listed above.

Victor Valbuena and others added 2 commits June 2, 2026 13:14
labels must be non-None so that MessagePiece does not raise and interrupt test_retry_timing_integration. The fix applies labels or {} on line 72 of tests/integration/mocks.py.
…ntegration test

ScenarioRegistry.list_metadata() instantiates every registered scenario
class to build metadata. Cyber.__init__ calls _build_cyber_strategy(),
which calls AttackTechniqueRegistry.get_factories_or_raise() — raising
RuntimeError when the registry is empty.

The integration test was missing the ScenarioTechniqueInitializer step
that populates the registry. Add it before LoadDefaultDatasets.initialize_async().

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
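
The ordering problem described in the commit message above can be reproduced with stand-ins. This is a sketch under stated assumptions: every class and function below is hypothetical and only mirrors the described behavior (a registry that raises RuntimeError when empty, an initializer that populates it, and a metadata listing that instantiates each scenario class). It is not PyRIT's implementation.

```python
import asyncio
from typing import Callable


class TechniqueRegistrySketch:
    """Hypothetical stand-in for AttackTechniqueRegistry."""

    _factories: dict[str, Callable[[], str]] = {}

    @classmethod
    def register(cls, name: str, factory: Callable[[], str]) -> None:
        cls._factories[name] = factory

    @classmethod
    def clear(cls) -> None:
        cls._factories.clear()

    @classmethod
    def get_factories_or_raise(cls) -> dict[str, Callable[[], str]]:
        # Fail loudly when nothing has been registered yet.
        if not cls._factories:
            raise RuntimeError("technique registry is empty; run the initializer first")
        return dict(cls._factories)


class TechniqueInitializerSketch:
    """Hypothetical stand-in for ScenarioTechniqueInitializer."""

    async def initialize_async(self) -> None:
        TechniqueRegistrySketch.register("example_technique", lambda: "example")


class CyberSketch:
    """Hypothetical scenario whose constructor reads the registry."""

    def __init__(self) -> None:
        self.strategies = sorted(TechniqueRegistrySketch.get_factories_or_raise())


def list_metadata_sketch() -> list[str]:
    # Building metadata instantiates every scenario class, so an empty
    # registry surfaces here rather than at the point of actual use.
    return [type(scenario).__name__ for scenario in (CyberSketch(),)]


async def main() -> list[str]:
    # Populate the registry first, then list metadata.
    await TechniqueInitializerSketch().initialize_async()
    return list_metadata_sketch()


if __name__ == "__main__":
    print(asyncio.run(main()))
```

Calling list_metadata_sketch() before the initializer raises RuntimeError, which matches the failure the integration test hit. Running the initializer first makes the same call succeed.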
@ValbuenaVC ValbuenaVC marked this pull request as ready for review June 3, 2026 00:35
@ValbuenaVC ValbuenaVC added this pull request to the merge queue Jun 3, 2026
Merged via the queue into microsoft:main with commit 23b862d Jun 3, 2026
52 checks passed
@ValbuenaVC ValbuenaVC deleted the integration_test_fixes branch June 3, 2026 00:54