
FIX: unblock main CI — Test GUI (PyPI), Crescendo parser, CoCoNot empty prompt #1862

Merged
ValbuenaVC merged 2 commits into microsoft:main from romanlutz:romanlutz/ci-failures-on-main on Jun 1, 2026


@romanlutz
Contributor

Three independent CI failures on main (HEAD 5eab2f8c), one fix each. Investigation notes are in the commit message; the short version:

1. docker_build / Test GUI (PyPI): ModuleNotFoundError: No module named 'alembic'

docker/Dockerfile unconditionally runs COPY pyrit/ /app/pyrit/ with WORKDIR=/app, so python -m pyrit.* imported the local source instead of the PyPI-installed wheel. Local source now uses alembic (PR #1631), but PyPI 0.13.0's metadata doesn't list it, so the GUI container crashed in lifespan. Fix: for PYRIT_SOURCE=pypi, rm -rf the local source after install so the installed 0.13.0 wheel actually wins.
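
To illustrate why the copied source shadowed the wheel, here is a minimal, self-contained sketch. It does not touch PyRIT or Docker: it builds two throwaway packages with the same name (`demo_shadow_pkg` is a hypothetical name) and places the "working directory" ahead of the fake site-packages on `sys.path`, which mirrors what `python -m` does with the current directory.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def make_pkg(root: Path, name: str, origin: str) -> None:
    """Create a tiny package under `root` that records where it came from."""
    pkg = root / name
    pkg.mkdir(parents=True)
    (pkg / "__init__.py").write_text(f"ORIGIN = {origin!r}\n")


def which_copy_wins(pkg_name: str = "demo_shadow_pkg") -> str:
    """Import `pkg_name` when both a local copy and an installed copy exist."""
    with tempfile.TemporaryDirectory() as tmp:
        workdir = Path(tmp) / "app"          # stands in for WORKDIR=/app
        site = Path(tmp) / "site-packages"   # stands in for the installed wheel
        make_pkg(workdir, pkg_name, "local source")
        make_pkg(site, pkg_name, "installed wheel")

        saved_path = list(sys.path)
        try:
            # `python -m` puts the working directory first on sys.path,
            # so the local copy is found before site-packages.
            sys.path[:0] = [str(workdir), str(site)]
            importlib.invalidate_caches()
            sys.modules.pop(pkg_name, None)
            return importlib.import_module(pkg_name).ORIGIN
        finally:
            # Restore interpreter state so the demo leaves no trace.
            sys.path[:] = saved_path
            sys.modules.pop(pkg_name, None)


print(which_copy_wins())
```

Removing the local directory, as the Dockerfile change does for the PyPI build, leaves the installed copy as the only match.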

PR #1753 also moved the launcher from pyrit.cli.pyrit_backend to pyrit.backend.pyrit_backend; PyPI 0.13.0 only has the old path. docker/start.sh now tries the new module first and falls back to the legacy module for older PyPI versions. The fallback becomes dead code after the next release ships.
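
The actual fallback lives in docker/start.sh and is shell; the same "try the new module, fall back to the legacy one" logic can be sketched in Python as below. Only the two module paths come from this PR; the helper names (`module_exists`, `pick_backend_module`) are hypothetical and the real script may detect the module differently.

```python
import importlib.util
from typing import Callable, Sequence

NEW_MODULE = "pyrit.backend.pyrit_backend"   # path after PR #1753
LEGACY_MODULE = "pyrit.cli.pyrit_backend"    # path shipped in PyPI 0.13.0


def module_exists(name: str) -> bool:
    """Return True if `name` can be located without running the module itself."""
    try:
        return importlib.util.find_spec(name) is not None
    except ImportError:
        # find_spec imports parent packages; a missing or broken parent
        # is treated here as "module not available".
        return False


def pick_backend_module(
    candidates: Sequence[str] = (NEW_MODULE, LEGACY_MODULE),
    exists: Callable[[str], bool] = module_exists,
) -> str:
    """Return the first candidate module that is importable."""
    for name in candidates:
        if exists(name):
            return name
    raise ModuleNotFoundError(f"none of {list(candidates)} is installed")
```

A launcher could then run `python -m` with the returned name. Once every supported release ships the new path, the second candidate can be dropped.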

2. Integration Tests / 4_sequential_attack.ipynb: Crescendo retries exhausted on camelCase

CrescendoAttack._parse_adversarial_response required snake_case keys, but the adversarial chat returned generatedQuestion / rationaleBehindJailbreak / lastResponseSummary for three retries straight and burned the whole budget. Normalize keys to snake_case before validation; the strict extra-key check is preserved.
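
A minimal sketch of the normalize-then-validate idea follows. It is not the code from `CrescendoAttack._parse_adversarial_response`; the function names are hypothetical, and the expected snake_case key set is an assumption inferred from the camelCase keys quoted above.

```python
import json
import re

# Assumed snake_case equivalents of the camelCase keys seen in the failure.
EXPECTED_KEYS = {
    "generated_question",
    "rationale_behind_jailbreak",
    "last_response_summary",
}

# Zero-width match between a lowercase letter or digit and an uppercase letter.
_CAMEL_BOUNDARY = re.compile(r"(?<=[a-z0-9])(?=[A-Z])")


def to_snake_case(key: str) -> str:
    """Convert camelCase to snake_case; snake_case input passes through unchanged."""
    return _CAMEL_BOUNDARY.sub("_", key).lower()


def parse_adversarial_response(raw: str) -> dict:
    """Parse a JSON object, normalize its keys, then apply the strict key check."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")

    normalized = {to_snake_case(k): v for k, v in data.items()}
    if len(normalized) != len(data):
        raise ValueError("duplicate keys after normalization")

    missing = EXPECTED_KEYS - normalized.keys()
    extra = normalized.keys() - EXPECTED_KEYS
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    if extra:
        # Normalization must not weaken the strict extra-key rejection.
        raise ValueError(f"unexpected keys: {sorted(extra)}")
    return normalized
```

Normalizing before validation means a response that is correct apart from key casing no longer consumes a retry, while a response with an unknown key is still rejected.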

3. End to End Tests / _CoCoNotRefusalDataset: Seed in _CoCoNotRefusalDataset has no value

original.train (wildchats subcategory) contains rows with empty prompt, producing SeedObjective(value='') which trips tests/end_to_end/test_all_datasets.py's assert seed.value invariant. Skip empty/whitespace prompts in the loader with a warning.
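
The loader-side filter can be sketched as below. This is an illustration only: `non_empty_prompts` is a hypothetical helper, rows are assumed to be dict-like with a string (or missing) `prompt` field, and the real loader builds seed objects rather than returning plain strings.

```python
import logging
from typing import Iterable, List, Mapping, Optional

logger = logging.getLogger(__name__)


def non_empty_prompts(rows: Iterable[Mapping[str, Optional[str]]]) -> List[str]:
    """Return prompts from `rows`, skipping empty or whitespace-only values."""
    kept: List[str] = []
    skipped = 0
    for row in rows:
        prompt = row.get("prompt") or ""   # treat a missing or None prompt as empty
        if not prompt.strip():
            skipped += 1
            continue
        kept.append(prompt)
    if skipped:
        # One summary warning instead of one log line per bad row.
        logger.warning("Skipped %d row(s) with an empty prompt", skipped)
    return kept
```

Filtering at load time keeps the `assert seed.value` invariant in the end-to-end test meaningful instead of relaxing the test.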

Out of scope (separate sessions)

The other failing items on this run aren't included here and will be handled in follow-ups:

  • _ComicJailbreakDataset timeout (>300s) and _VLGuardDataset 401 gated-repo
  • airt.cyber scenario exception + Partner Integration Tests (need AzDO logs)

Verification

  • pytest tests/unit/datasets/ tests/unit/executor/attack/multi_turn/ → 865 passed
  • New regression tests added for the CoCoNot empty-prompt skip and Crescendo camelCase normalization (incl. extra-key still rejected after normalization)
  • pre-commit run --files <changed> → all hooks green (ruff, ruff-format, ty)
  • bash -n on docker/start.sh and on the Dockerfile RUN block → clean

…ty prompt

Three independent fixes for failures observed on main HEAD 5eab2f8:

1. docker_build / Test GUI (PyPI) — `ModuleNotFoundError: No module
   named 'alembic'`. `WORKDIR=/app` plus a `COPY pyrit/ /app/pyrit/`
   meant `python -m pyrit.*` imported the local source instead of the
   installed PyPI wheel. Local source pulls in `alembic` (added after
   0.13.0), so the GUI container crashed in lifespan. The Dockerfile now
   removes the local source for `PYRIT_SOURCE=pypi` after install. The
   PyPI 0.13.0 launcher also lives at `pyrit.cli.pyrit_backend`
   (PR microsoft#1753 moved it to `pyrit.backend.pyrit_backend`), so start.sh
   now falls back to the legacy module name when the new one is missing
   — the fallback becomes dead code after the next release.

2. Integration Tests / 4_sequential_attack.ipynb — Crescendo's
   `_parse_adversarial_response` required snake_case keys but the
   adversarial chat returned `generatedQuestion` /
   `rationaleBehindJailbreak` / `lastResponseSummary` for three
   retries straight. Normalize incoming keys to snake_case before
   validation; extra-key rejection is preserved.

3. End to End Tests / test_fetch_dataset[_CoCoNotRefusalDataset] —
   `original.train` (wildchats subcategory) contains rows with empty
   `prompt`, producing `SeedObjective(value='')` that trips the
   `seed.value` invariant. Skip empty/whitespace prompts in the loader
   and log a warning.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@jsong468 jsong468 self-assigned this Jun 1, 2026
Comment thread docker/Dockerfile Outdated
Comment thread tests/unit/executor/attack/multi_turn/test_crescendo.py
Add /app/README.md and /app/LICENSE to the rm in the PYRIT_SOURCE=pypi branch so the cleanup mirrors the COPY block one-to-one. /app/doc is intentionally retained because the later RUN block copies it into /app/notebooks/ for Jupyter mode; documented that in the comment block above the RUN.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@romanlutz romanlutz enabled auto-merge June 1, 2026 21:54
@romanlutz romanlutz added this pull request to the merge queue Jun 1, 2026
@github-merge-queue github-merge-queue Bot removed this pull request from the merge queue due to failed status checks Jun 1, 2026
@ValbuenaVC ValbuenaVC added this pull request to the merge queue Jun 1, 2026
Merged via the queue into microsoft:main with commit 98966c2 Jun 1, 2026
48 checks passed
3 participants