Agent Shin rollout: day-7 enactment sweep + flip AGENT_SHIN_ENABLED to live-by-default#28834

Open
mateo-berri wants to merge 4 commits into combine-28758-28117-28147 from enact-agent-shin-7day
@mateo-berri mateo-berri commented May 26, 2026

Collaborator

Merge this 7 days after #28759. This is the second step of the Agent Shin rollout: it turns the bot on for real.

What happens on merge

  1. One-shot enactment sweep runs over every open external PR/issue. The sweep drives each item through the steady-state triage with `--close=true`:
    • PRs that fixed their description in the 7-day grace get tagged `ready for review` (via `review_gate`).
    • PRs/issues still failing the rubric get the standard 24h grace warning (or are auto-closed if a previous warning is already > 24h old).
    • Internal-author / closed items are skipped.
  2. Existing triage workflows flip from opt-in to live-by-default. `AGENT_SHIN_ENABLED` stops being the "enable" flag and becomes a kill switch: set it to the literal string `"false"` to force everything back to dry-run; any other value (including unset) leaves the bot live.
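The kill-switch semantics in step 2 can be sketched in Python as follows. This is a minimal illustration, not the workflows' actual shell code: the helper name `agent_shin_is_live` is hypothetical, and it assumes the shell default `${AGENT_SHIN_ENABLED:-true}` used by the inverted workflows, under which an unset or empty variable falls back to `true`.

```python
from typing import Mapping


def agent_shin_is_live(env: Mapping[str, str]) -> bool:
    """Hypothetical helper mirroring `${AGENT_SHIN_ENABLED:-true} != "false"`."""
    # The shell `:-` form substitutes the default when the variable is
    # unset *or* empty, so both cases are treated as "true" here.
    value = env.get("AGENT_SHIN_ENABLED") or "true"
    # Only the exact, case-sensitive string "false" forces dry-run.
    return value != "false"
```

Under this reading, values such as `False` or `0` leave the bot live, which is why the rollback instruction insists on the literal string `"false"`.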

What this PR contains

| File | Purpose |
| --- | --- |
| `.github/scripts/triage_rollout_enact.py` | The enactment sweep with `--dry-run` (default) and `--close` modes. |
| `.github/workflows/triage_rollout_enact.yml` | Thin shell that fires `--close` on push to `litellm_internal_staging`. |
| Five existing triage workflows | Inverted: `AGENT_SHIN_ENABLED:-true` and `= "false"` for kill-switch entry. |
| `tests/test_litellm/test_triage_rollout_enact.py` | 22 tests; full suite is 213 passing. |
| `tests/test_litellm/test_github_triage_workflows.py` | Split `DESTRUCTIVE_GATE_ENV` into per-run vs. kill-switch invariants. |
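The kill-switch invariant that the workflow guardrail tests enforce could look roughly like the check below. This is a sketch under stated assumptions: the regexes and the function name `uses_kill_switch_gate` are hypothetical and are not taken from `test_github_triage_workflows.py`; only the shell idioms `${AGENT_SHIN_ENABLED:-true}` and `= "false"` / `!= "false"` come from this PR.

```python
import re

# Hypothetical patterns: the inverted (kill-switch) gate and the old opt-in gate.
KILL_SWITCH_GATE = re.compile(r'\$\{AGENT_SHIN_ENABLED:-true\}"?\s*!?=\s*"false"')
OPT_IN_GATE = re.compile(r'\$\{AGENT_SHIN_ENABLED:-false\}"?\s*=\s*"true"')


def uses_kill_switch_gate(workflow_text: str) -> bool:
    """Return True if the text gates on the kill switch and not on the old opt-in."""
    has_new = KILL_SWITCH_GATE.search(workflow_text) is not None
    has_old = OPT_IN_GATE.search(workflow_text) is not None
    return has_new and not has_old
```

A text-level check like this catches a workflow that was missed during the inversion, but it cannot tell whether the branch taken on `"false"` really is the dry-run one.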

Time-travel dry-run

This is the part the brief specifically asked for. The script's notion of "now" is a single `current_time` value that is computed once and passed into `run()`. In a real run it is the wall clock; in a dry run it defaults to `now + 24h + 1s`, so you preview exactly what the next daily cron will do.

```bash
# What this script would do right now (no GitHub writes):
python3 .github/scripts/triage_rollout_enact.py --repo BerriAI/litellm

# What the next daily cron will do (24h+1s in the future):
python3 .github/scripts/triage_rollout_enact.py --repo BerriAI/litellm \
    --simulate-future-hours 24

# Pin to a specific moment for reproducibility:
python3 .github/scripts/triage_rollout_enact.py --repo BerriAI/litellm \
    --simulate-now '2026-06-02T09:00:00Z'
```
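How the clock could be resolved from these flags is sketched below. The function `resolve_current_time` and its parameter names are hypothetical; the rules it encodes (a real run always uses the wall clock and rejects the simulate flags, a dry run defaults to 24h+1s ahead, `--simulate-now` pins the clock) come from this PR's description. Letting the pinned timestamp win when both simulate flags are given is an assumption of the sketch.

```python
import datetime as dt
from typing import Optional

# Default dry-run offset described in this PR: 24 hours plus one second.
DEFAULT_OFFSET = dt.timedelta(hours=24, seconds=1)


def resolve_current_time(
    *,
    close: bool,
    wall_clock: dt.datetime,
    simulate_future_hours: Optional[float] = None,
    simulate_now: Optional[str] = None,
) -> dt.datetime:
    """Hypothetical resolver for the script's single notion of "now"."""
    if close:
        # A real run must always happen at wall-clock time.
        if simulate_future_hours is not None or simulate_now is not None:
            raise ValueError("--close cannot be combined with simulate flags")
        return wall_clock
    if simulate_now is not None:
        # datetime.fromisoformat() only accepts a trailing "Z" from Python 3.11 on.
        return dt.datetime.fromisoformat(simulate_now.replace("Z", "+00:00"))
    if simulate_future_hours is not None:
        return wall_clock + dt.timedelta(hours=simulate_future_hours)
    return wall_clock + DEFAULT_OFFSET
```

Taking `wall_clock` as a parameter rather than calling `datetime.now()` inside keeps the resolver deterministic and easy to test.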

Time-travel is implemented in exactly two places (so it's easy to audit):

  • For PRs: `review_gate(now=current_time)` — the parameter already existed for the test fixture.
  • For issues: a `_fake_now` context manager wraps each `triage()` call and patches `agent_shin_shared.dt.datetime.now` to return `current_time`. The patch is scoped to the `with` block and is restored on exit (and on exception — there's a test for it).

Dry-run vs. real run: where the if/else lives

Every GitHub mutation goes through `_agent_shin_actions` (shipped in #28759). Each `maybe_*` helper is:

```python
def maybe_close_pr(repo, number, *, dry_run: bool) -> None:
    if dry_run:
        _log(f"[DRY RUN] close PR {repo}#{number}")
        return
    triage_with_llm.close_pr(repo, number)
```

So a dry-run preview differs from the real run in exactly one line per side effect. The workflow drops `--close` for dry-run and adds it for real; nothing else changes.
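The single-branch difference can be checked with a small recorder, in the spirit of the tests described in this PR. The sketch is self-contained rather than a copy of the real helper: `make_maybe_close_pr`, the injected `close_pr` and `log` callables, and PR number 123 are all hypothetical stand-ins for `triage_with_llm.close_pr` and `_log`.

```python
from typing import Callable, List, Tuple


def make_maybe_close_pr(
    close_pr: Callable[[str, int], None],
    log: Callable[[str], None],
) -> Callable[..., None]:
    """Build a dry-run-aware wrapper around an injected close function."""

    def maybe_close_pr(repo: str, number: int, *, dry_run: bool) -> None:
        if dry_run:
            # Dry run: describe the mutation instead of performing it.
            log(f"[DRY RUN] close PR {repo}#{number}")
            return
        close_pr(repo, number)

    return maybe_close_pr


# Recorders standing in for the GitHub call and the logger.
real_calls: List[Tuple[str, int]] = []
log_lines: List[str] = []
maybe_close_pr = make_maybe_close_pr(
    close_pr=lambda repo, number: real_calls.append((repo, number)),
    log=log_lines.append,
)

maybe_close_pr("BerriAI/litellm", 123, dry_run=True)   # only logged
maybe_close_pr("BerriAI/litellm", 123, dry_run=False)  # reaches close_pr
```

After both calls, `log_lines` holds one dry-run entry and `real_calls` holds one real call, so the two modes share every step except the final mutation.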

Test plan

  • `tests/test_litellm/test_triage_rollout_enact.py` — 22 tests, all pass
  • All five triage test files (213 tests total) — all pass
  • `uv run black --check` clean
  • Workflow YAML still parses and the kill-switch guardrail tests still pass

Reviewer notes / kill-switch usage

If after merge you need to roll back: set `AGENT_SHIN_ENABLED` to the exact string `"false"` in repo Settings > Secrets and variables > Actions > Variables. The scheduled workflows (and any manual dispatch) will fall back to dry-run on the next run.

🤖 Generated with Claude Code


Note

High Risk
Merge enables real auto-comments, label changes, and PR/issue closures across open external items; incorrect gating or enactment logic could mass-close contributor work, though dry-run preview and AGENT_SHIN_ENABLED=false provide rollback paths.

Overview
This PR completes the Agent Shin rollout by running a one-shot day-7 enactment sweep and flipping automation from opt-in to live-by-default.

A new script triage_rollout_enact.py (plus triage_rollout_enact.yml) walks open external PRs and issues, evaluates them via existing review_gate / triage (preview only), then applies comments, labels, warnings, and closures through maybe_* helpers. --close performs real GitHub writes; dry-run defaults to simulating +24h so you can preview the next cron. Issue grace timing uses a scoped _fake_now patch; PRs pass now= into review_gate.

AGENT_SHIN_ENABLED is inverted everywhere it gates behavior: unset or any value except the literal "false" means the bot is live (including review_gate automatic triggers and reconsider). Setting "false" is the kill switch back to dry-run. OPENAI_API_KEY exposure in workflows follows the same rule. Per-run close inputs still require = "true" where applicable.

Workflow guardrail tests are updated (PER_RUN_GATE_ENV vs KILL_SWITCH_WORKFLOWS) and test_triage_rollout_enact.py covers dispatch, time travel, and sweep behavior.

Reviewed by Cursor Bugbot for commit 5497394. Bugbot is set up for automated code reviews on this repo. Configure here.

…-by-default

This is the second-step PR of the Agent Shin rollout: it merges 7 days
after the heads-up (#28759 + this branch) and turns the bot on for real.
Two things happen on merge:

1. A one-shot enactment sweep runs over every open external PR/issue,
   driving each through the steady-state triage logic with --close=true.
   PRs that fixed their description in the grace week get tagged
   `ready for review`; PRs/issues still failing the rubric get the
   standard 24h grace warning (or close, if they already had one and
   24h elapsed). The sweep uses the same `_agent_shin_actions` dry-run
   wrappers as the heads-up so a single boolean toggles real vs. log.

2. The five existing triage workflows (triage_pr_with_llm,
   triage_issue_with_llm, close_low_quality_prs, review_gate,
   triage_reconsider) flip from "dry-run unless AGENT_SHIN_ENABLED=true"
   to "live unless AGENT_SHIN_ENABLED=false". The variable becomes a
   kill switch instead of an opt-in. Default semantics: unset means live.

Files
-----
.github/scripts/triage_rollout_enact.py — the enactment sweep.
  * Calls review_gate(close=False, now=current_time) for PRs and
    triage(close=False) (wrapped in `_fake_now(current_time)`) for
    issues, then routes the verdict through the matching maybe_*
    wrapper. Two single-page dispatch tables (_apply_pr_result,
    _apply_issue_result) make it easy to audit which actions map to
    which mutations.
  * Time-travel dry-run: --simulate-future-hours N (default 24+1s
    when --close is not set) shifts the clock forward N hours so you
    can preview what the next daily cron will do. Implemented in
    exactly two places: review_gate's `now=` parameter for PRs, and
    a `_fake_now` context manager that patches
    `agent_shin_shared.dt.datetime.now` for issues. The context
    manager restores the original module on exit (and on exception).
  * --simulate-now ISO_TS pins the clock to a specific timestamp.
  * --close forbids the simulate flags so a real run is always at
    wall-clock time.

.github/workflows/triage_rollout_enact.yml — thin wrapper. Fires
  --close on push to litellm_internal_staging (the enactment merge)
  and offers a workflow_dispatch with dry_run + simulate_future_hours
  inputs for safe re-runs.

.github/workflows/{triage_pr_with_llm,triage_issue_with_llm,
close_low_quality_prs,review_gate,triage_reconsider}.yml — inverted:
  * `${AGENT_SHIN_ENABLED:-true}` (default live)
  * Conditional: `= "false"` enters kill-switch branch
  * OPENAI_API_KEY env: exposed unless `vars.AGENT_SHIN_ENABLED == 'false'`
  Comments updated to call out the new kill-switch semantics.

tests/test_litellm/test_github_triage_workflows.py — split the
  destructive-gate constant into PER_RUN_GATE_ENV (per-input gates
  that must still match `= "true"`) and KILL_SWITCH_WORKFLOWS (all
  five workflows, must match the inverted `= "false"` / `!= "false"`
  pattern). Updates the kill-switch test wording to describe the
  new semantics.

tests/test_litellm/test_triage_rollout_enact.py — 22 tests covering:
  * _fake_now patches and restores (including on exception)
  * Each branch of _apply_pr_result / _apply_issue_result with a
    recorder that captures the maybe_* call sequence and dry_run flag
  * _process_one skip-not-open / skip-internal-author
  * run() end-to-end: dry-run threads dry_run=True through every
    wrapper, real run threads False, current_time threads through to
    the per-item evaluators, --kind / --only-numbers restrict scope.

Local preview commands
----------------------
    # What this script would do right now (no GitHub writes):
    python3 .github/scripts/triage_rollout_enact.py --repo BerriAI/litellm

    # What the next daily cron will do (24h+1s in the future):
    python3 .github/scripts/triage_rollout_enact.py --repo BerriAI/litellm \
        --simulate-future-hours 24

    # Pin to a specific moment:
    python3 .github/scripts/triage_rollout_enact.py --repo BerriAI/litellm \
        --simulate-now '2026-06-02T09:00:00Z'

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@greptile-apps

greptile-apps Bot commented May 26, 2026

Contributor

Greptile Summary

This PR completes the Agent Shin rollout by adding a one-shot day-7 enactment sweep script/workflow and inverting AGENT_SHIN_ENABLED from an opt-in flag (= "true") to a kill-switch (= "false") across all five triage workflows.

  • triage_rollout_enact.py + triage_rollout_enact.yml: New script walks all open external PRs/issues, calls the existing review_gate/triage paths in preview mode, then routes verdicts through the maybe_* dry-run wrappers; a time-travel context manager (_fake_now) patches agent_shin_shared.dt for issue triage and threads current_time directly into review_gate for PR triage.
  • Five existing triage workflows: Shell gates changed from ${AGENT_SHIN_ENABLED:-false} = "true" to ${AGENT_SHIN_ENABLED:-true} != "false"; test_github_triage_workflows.py invariants updated to enforce the new pattern.

Confidence Score: 3/5

The enactment sweep can re-fire as a real run on any future push to litellm_internal_staging that touches the script, leaving the one-shot sweep able to re-close PRs/issues unexpectedly.

The core logic, dry-run wrappers, and kill-switch inversion are correct and well-tested. The push trigger in triage_rollout_enact.yml fires with --close on any future push touching the script, not only the enactment merge; a subsequent bug fix would silently re-run the full sweep. Stale comments in the two triage workflow files are a secondary readability issue with no runtime impact.

.github/workflows/triage_rollout_enact.yml needs a second look on its push trigger semantics; .github/workflows/triage_issue_with_llm.yml and .github/workflows/triage_pr_with_llm.yml have stale comment blocks that should be removed.

Important Files Changed

| Filename | Overview |
| --- | --- |
| `.github/scripts/triage_rollout_enact.py` | New one-shot enactment sweep script; logic and dry-run wrappers are well-structured, but contains dead code (`_FakeDt` class) and an unused constant (`_DEFAULT_FUTURE_SECONDS`) |
| `.github/workflows/triage_rollout_enact.yml` | New one-shot workflow with a push trigger that fires a real enactment sweep on any future push to `litellm_internal_staging` touching the script, not just the enactment merge |
| `.github/workflows/triage_issue_with_llm.yml` | Kill-switch inversion applied correctly, but the old opt-in comment block (lines 69-75) was not removed and directly contradicts the new `!= 'false'` pattern |
| `.github/workflows/triage_pr_with_llm.yml` | Same stale comment issue as `triage_issue_with_llm.yml`; kill-switch logic itself is correct |
| `tests/test_litellm/test_triage_rollout_enact.py` | 22 new unit tests with full stubbing; covers `_fake_now`, dispatch tables, skip cases, and end-to-end sweep loop; no real network calls |
| `tests/test_litellm/test_github_triage_workflows.py` | Correctly splits `DESTRUCTIVE_GATE_ENV` into `PER_RUN_GATE_ENV` and `KILL_SWITCH_WORKFLOWS`; updated `accepted_patterns` match the new kill-switch shell idioms |
| `.github/workflows/triage_reconsider.yml` | Kill-switch inversion applied cleanly; no stale comments |
| `.github/workflows/review_gate.yml` | Kill-switch inversion applied cleanly |
| `.github/workflows/close_low_quality_prs.yml` | Kill-switch inversion applied cleanly |

Comments Outside Diff (2)

  1. .github/workflows/triage_issue_with_llm.yml, line 69-78 (link)

    P2 The first comment block (lines 69–75) is stale: it was written to justify the old = "true" opt-in gate and explicitly warns that != "false" would enable closures on typos. The very next line now uses != "false", so the surviving comment directly contradicts the implementation and would mislead anyone auditing the security model. The same stale block is present in triage_pr_with_llm.yml lines 82–88.

  2. .github/workflows/triage_pr_with_llm.yml, line 82-91 (link)

    P2 Same stale comment as in triage_issue_with_llm.yml: the block originally justified the old = "true" opt-in gate and explicitly warns that != "false" would enable closures on typos — but line 92 now uses != "false". The old warning survives and directly contradicts the new implementation.

Reviews (1): Last reviewed commit: "feat(triage): day-7 enactment sweep + fl..." | Re-trigger Greptile

Comment on lines +15 to +20
    on:
      push:
        branches:
          - litellm_internal_staging
        paths:
          - ".github/scripts/triage_rollout_enact.py"

P1 Re-fire risk on future script edits

The push trigger fires — with --close in the real-run branch — on any push to litellm_internal_staging that touches triage_rollout_enact.py, not just the one-shot enactment merge. If a subsequent bug fix or refactor to the script is ever pushed to that branch, a second real enactment sweep will run and close PRs/issues that have already been processed. Consider removing the push trigger entirely after the enactment merge, or adding a sentinel file (e.g., a .enacted marker) so the workflow is a no-op on re-trigger.
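The sentinel idea suggested above could be sketched as follows. Everything here is hypothetical: the helper names, the marker path, and the choice of a file as the sentinel are illustrative and not part of this PR. On an ephemeral CI runner the marker would also have to be persisted somewhere durable (committed back, or stored as a repository variable) for the guard to survive between runs.

```python
from pathlib import Path


def should_enact(marker: Path) -> bool:
    """Return True only if no previous real sweep has left a marker."""
    return not marker.exists()


def record_enactment(marker: Path, commit_sha: str) -> None:
    """Write the marker after a successful real sweep, noting which commit ran it."""
    marker.parent.mkdir(parents=True, exist_ok=True)
    marker.write_text(commit_sha + "\n", encoding="utf-8")
```

A real run would call `should_enact` before the sweep and `record_enactment` only after it completes, so a later push that touches the script becomes a no-op.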

Comment thread .github/scripts/triage_rollout_enact.py Outdated
Comment on lines +113 to +123
    class _FakeDt:
        """Drop-in replacement for ``datetime`` with a frozen ``now``."""

        timezone = real_dt.timezone
        datetime = real_dt.datetime

        @staticmethod
        def datetime_now(tz: dt.tzinfo | None = None) -> dt.datetime:
            return when if tz is None else when.astimezone(tz)

    # We only need to override `dt.datetime.now`. Easiest path: install a

P2 The _FakeDt class is dead code: it is defined inside _fake_now but never assigned or used — _DtShim is what actually gets installed on agent_shin_shared.dt. Leaving it in place misleads readers into thinking it plays a role in the time-travel patch.

Suggested change
-    class _FakeDt:
-        """Drop-in replacement for ``datetime`` with a frozen ``now``."""
-        timezone = real_dt.timezone
-        datetime = real_dt.datetime
-        @staticmethod
-        def datetime_now(tz: dt.tzinfo | None = None) -> dt.datetime:
-            return when if tz is None else when.astimezone(tz)
-    # We only need to override `dt.datetime.now`. Easiest path: install a
+    # We only need to override `dt.datetime.now`. Easiest path: install a

Comment thread .github/scripts/triage_rollout_enact.py Outdated
Comment on lines +96 to +97
DEFAULT_SIMULATE_HOURS = 24
_DEFAULT_FUTURE_SECONDS = DEFAULT_SIMULATE_HOURS * 3600 + 1

P2 _DEFAULT_FUTURE_SECONDS is defined but never referenced anywhere in the module — main() computes the offset inline as DEFAULT_SIMULATE_HOURS + 1 / 3600. The named constant adds no value and its "SECONDS" suffix misleads readers given the surrounding code works in hours.

Suggested change
-DEFAULT_SIMULATE_HOURS = 24
-_DEFAULT_FUTURE_SECONDS = DEFAULT_SIMULATE_HOURS * 3600 + 1
+DEFAULT_SIMULATE_HOURS = 24

Comment thread .github/scripts/triage_rollout_enact.py Outdated
- _fake_now now patches triage_with_llm.dt (the alias actually used by
  the issue grace check in _seconds_since_latest_marker_comment) instead
  of agent_shin_shared.dt, which was a no-op for issues.
- Drop the dead _FakeDt inner class that was never installed.
- Drop unused _DEFAULT_FUTURE_SECONDS constant.
- Update tests to assert against triage_with_llm.dt.

Co-authored-by: Yassin Kortam <yassin@berri.ai>
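Why patching the wrong module's `dt` alias is a no-op can be shown with two stand-in modules. The module names `shared_mod` and `consumer_mod` and the helpers below are hypothetical; they only model the situation described in the commit above, where the grace check reads the clock through one module's own `dt` alias.

```python
import datetime
import types

# Two stand-in modules, each with its own `dt` alias for the datetime module.
shared = types.ModuleType("shared_mod")
shared.dt = datetime
consumer = types.ModuleType("consumer_mod")
consumer.dt = datetime


def consumer_now() -> datetime.datetime:
    """Read the clock the way the consumer would: through its own alias."""
    return consumer.dt.datetime.now(consumer.dt.timezone.utc)


def make_shim(when: datetime.datetime) -> types.SimpleNamespace:
    """Build a minimal `dt` replacement whose datetime.now() is frozen."""

    class FrozenDatetime(datetime.datetime):
        @classmethod
        def now(cls, tz=None):
            return when if tz is None else when.astimezone(tz)

    return types.SimpleNamespace(datetime=FrozenDatetime, timezone=datetime.timezone)


frozen = datetime.datetime(2030, 1, 1, tzinfo=datetime.timezone.utc)

shared.dt = make_shim(frozen)       # wrong target: consumer never looks here
unaffected = consumer_now()

consumer.dt = make_shim(frozen)     # right target: the alias actually read
affected = consumer_now()
```

Rebinding `shared.dt` changes only that module's attribute, so `unaffected` still comes from the wall clock, while `affected` equals the frozen timestamp.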
@CLAassistant

Copy link
Copy Markdown

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
1 out of 2 committers have signed the CLA.

✅ mateo-berri
❌ cursoragent

Comment thread .github/scripts/triage_rollout_enact.py
…Error

Co-authored-by: Yassin Kortam <yassin@berri.ai>

@cursor cursor Bot left a comment

Cursor Bugbot has reviewed your changes using high effort and found 1 potential issue.

Bugbot Autofix prepared a fix for the issue found in the latest run.

  • ✅ Fixed: Unused imports of PR-specific comment formatters
    • Removed the unused format_grace_warning_pr_comment and format_pr_close_comment imports from the triage_with_llm import block.
Preview (5497394a19)
diff --git a/.github/scripts/triage_rollout_enact.py b/.github/scripts/triage_rollout_enact.py
new file mode 100644
--- /dev/null
+++ b/.github/scripts/triage_rollout_enact.py
@@ -1,0 +1,506 @@
+#!/usr/bin/env python3
+"""One-shot day-7 enactment sweep for the Agent Shin rollout.
+
+This runs once, on the merge commit of the enactment PR, exactly 7 days after
+the heads-up sweep. It walks every open external PR/issue and lets Agent
+Shin's steady-state logic decide what to do for each:
+
+  * PR passing the rubric  -> tag `ready for review` (via `review_gate`)
+  * PR failing the rubric, has the heads-up marker, past 24h since the warning
+                            -> close + post the standard close comment
+  * PR failing the rubric, no heads-up marker yet (created this week)
+                            -> post the 24h grace warning (steady-state path)
+  * Issue passing           -> no-op
+  * Issue failing past grace -> close + comment
+  * Issue failing in grace   -> warn (or already-warned skip)
+  * Internal-author / closed -> skip
+
+The script doesn't replicate the rubric logic — it calls into the existing
+``review_gate`` and ``triage`` paths in dry-run mode, then routes their
+verdicts through the ``maybe_*`` wrappers so a single ``dry_run`` boolean
+toggles between logging and real GitHub mutations.
+
+Time-travel dry-run
+-------------------
+``--simulate-future-hours N`` (default 24h+1s when --dry-run is set and no
+explicit value is given) shifts the script's notion of "now" forward by N
+hours. This lets you preview what the *next* scheduled run will do: any PR
+currently in the grace window will tip into "past grace" after 24h, and the
+preview shows those would-close decisions before they actually fire.
+
+Time-travel is implemented in exactly two places. A single ``current_time``
+value is passed into ``run()`` and threaded through ``review_gate(now=)``
+for PRs. For issues (whose grace check goes through
+``_seconds_since_latest_marker_comment``), we patch ``triage_with_llm``'s
+``dt.datetime.now`` under a context manager for the duration of each
+``triage()`` call. Both surfaces are small and easy to audit.
+
+CLI examples
+------------
+
+::
+
+    # Pure preview at current time (no GitHub writes):
+    python3 .github/scripts/triage_rollout_enact.py --repo BerriAI/litellm
+
+    # Preview what the next daily cron will do (24h+1s in the future):
+    python3 .github/scripts/triage_rollout_enact.py --repo BerriAI/litellm \\
+        --simulate-future-hours 24
+
+    # Real run (what the workflow does on merge):
+    python3 .github/scripts/triage_rollout_enact.py --repo BerriAI/litellm --close
+"""
+
+from __future__ import annotations
+
+import argparse
+import contextlib
+import datetime as dt
+import json
+import os
+import sys
+from pathlib import Path
+from typing import Any, Iterator
+
+_SCRIPTS_DIR = Path(__file__).resolve().parent
+if str(_SCRIPTS_DIR) not in sys.path:
+    sys.path.insert(0, str(_SCRIPTS_DIR))
+
+import triage_with_llm  # noqa: E402
+from _agent_shin_actions import (  # noqa: E402
+    maybe_add_label,
+    maybe_close_issue,
+    maybe_close_pr,
+    maybe_post_comment,
+    maybe_remove_label,
+)
+from triage_with_llm import (  # noqa: E402
+    DEFAULT_MODEL,
+    READY_FOR_REVIEW_LABEL,
+    fetch_issue,
+    fetch_pr,
+    format_grace_warning_issue_comment,
+    format_issue_close_comment,
+    gh,
+    is_internal_contributor,
+    review_gate,
+    triage,
+)
+
+# Default time-travel offset: the next daily cron runs 24h after this script's
+# real run, so 24h+1s gives a "what fires tomorrow" preview without any edge
+# cases at the boundary itself.
+DEFAULT_SIMULATE_HOURS = 24
+
+
+@contextlib.contextmanager
+def _fake_now(when: dt.datetime) -> Iterator[None]:
+    """Patch ``dt.datetime.now`` in triage_with_llm so issue triage's grace
+    check resolves against ``when`` rather than wall-clock time.
+
+    Patches ``triage_with_llm.dt`` (the module's local alias) rather than
+    the global ``datetime`` so we don't leak the override into unrelated code.
+    The issue grace check reads the wall clock inside
+    ``_seconds_since_latest_marker_comment`` via this module's ``dt`` alias,
+    so this is the only surface that needs to be frozen. The patch is scoped
+    to the ``with`` block — once it exits, the original ``dt`` module is
+    restored, so a per-item call to ``triage()`` is the only code that ever
+    sees the fake clock.
+    """
+    real_dt = triage_with_llm.dt
+
+    # We only need to override `dt.datetime.now`. Easiest path: install a
+    # tiny shim that proxies to the real `datetime` module for everything
+    # except `.now()`.
+    class _DtShim:
+        timezone = real_dt.timezone
+
+        class datetime(real_dt.datetime):  # noqa: N801 - mirror stdlib name
+            @classmethod
+            def now(cls, tz: dt.tzinfo | None = None) -> dt.datetime:
+                return when if tz is None else when.astimezone(tz)
+
+        # Pass everything else through to the real module.
+        def __getattr__(self, name: str) -> Any:  # pragma: no cover - shim
+            return getattr(real_dt, name)
+
+    triage_with_llm.dt = _DtShim()
+    try:
+        yield
+    finally:
+        triage_with_llm.dt = real_dt
+
+
+def _list_open_numbers(repo: str, kind: str) -> list[int]:
+    cmd = "pr" if kind == "pr" else "issue"
+    raw = gh(
+        cmd,
+        "list",
+        "--repo",
+        repo,
+        "--state",
+        "open",
+        "--limit",
+        "1000",
+        "--json",
+        "number",
+    )
+    return [item["number"] for item in json.loads(raw)]
+
+
+# ---------------------------------------------------------------------------
+# Per-PR / per-issue dispatch.
+#
+# Each helper takes the verdict-shaped result from review_gate / triage, then
+# routes it to the matching maybe_* wrapper. The wrappers each carry the
+# `dry_run` boolean, so a dry-run preview hits exactly the same code path as
+# the real run except for the final GitHub API call.
+
+
+def _apply_pr_result(*, repo: str, number: int, result: dict, dry_run: bool) -> dict:
+    """Translate a ``review_gate`` result into the matching GitHub mutation
+    (or a dry-run log line). Returns a dict whose ``result`` key names the
+    outcome (e.g. ``"labeled-ready"``, ``"closed"``, or ``"noop"``)."""
+    action = result.get("action") or "unknown"
+    comment = result.get("comment")
+    base = {"kind": "pr", "number": number, "review_gate_action": action}
+
+    # `review_gate(close=False)` returns `would-*` strings for every
+    # transition; `review_gate(close=True)` returns the already-applied
+    # counterparts. The dispatcher below treats both forms identically so the
+    # enactment script can be driven in either mode (we always run it in
+    # close=False mode to capture the would-* preview, then re-apply the
+    # mutations through the dry-run wrappers).
+    if action in ("noop-passing", "skip-not-open", "skip-internal-author"):
+        return {**base, "result": "noop"}
+    if action in ("skip-no-llm-key", "skip-llm-error"):
+        return {
+            **base,
+            "result": "noop-llm-unavailable",
+            "error": result.get("error"),
+        }
+    if action in ("would-label-ready", "labeled-ready"):
+        assert comment, "review_gate must supply a comment for label-ready"
+        maybe_post_comment(repo, number, comment, dry_run=dry_run)
+        maybe_add_label(repo, number, READY_FOR_REVIEW_LABEL, dry_run=dry_run)
+        return {**base, "result": "labeled-ready"}
+    if action in ("would-remove-label", "label-removed-regressed"):
+        assert comment
+        maybe_remove_label(repo, number, READY_FOR_REVIEW_LABEL, dry_run=dry_run)
+        maybe_post_comment(repo, number, comment, dry_run=dry_run)
+        return {**base, "result": "label-removed-regressed"}
+    if action in ("would-close", "closed"):
+        assert comment
+        maybe_post_comment(repo, number, comment, dry_run=dry_run)
+        maybe_close_pr(repo, number, dry_run=dry_run)
+        return {**base, "result": "closed"}
+    if action in ("would-notify-within-grace", "within-grace-notified"):
+        assert comment
+        maybe_post_comment(repo, number, comment, dry_run=dry_run)
+        return {**base, "result": "warned-within-grace"}
+    if action in ("within-grace-already-notified", "regressed-already-notified"):
+        return {**base, "result": "noop-already-notified"}
+    # Anything else falls through as a no-op so an unexpected verdict from
+    # review_gate (e.g. a future action string) doesn't cause a partial write.
+    return {**base, "result": "noop-unknown-action"}
+
+
+def _apply_issue_result(*, repo: str, number: int, result: dict, dry_run: bool) -> dict:
+    """Translate a ``triage`` (kind='issue') result into the matching
+    mutation. Mirrors `_apply_pr_result` for the issue half of the flow."""
+    action = result.get("action") or "unknown"
+    verdict = result.get("verdict") or {}
+    base = {"kind": "issue", "number": number, "triage_action": action}
+
+    if action in (
+        "pass-llm",
+        "pass-linked-issue",
+        "skip-not-open",
+        "skip-internal-author",
+    ):
+        return {**base, "result": "noop"}
+    if action in ("skip-no-llm-key", "skip-llm-error"):
+        return {
+            **base,
+            "result": "noop-llm-unavailable",
+            "error": result.get("error"),
+        }
+    if action in ("would-warn-grace", "warned-grace"):
+        body = format_grace_warning_issue_comment(verdict)
+        maybe_post_comment(repo, number, body, dry_run=dry_run)
+        return {**base, "result": "warned-within-grace"}
+    if action in ("skip-in-grace-period",):
+        return {**base, "result": "noop-already-warned"}
+    if action in ("would-close", "closed"):
+        body = format_issue_close_comment(verdict)
+        maybe_post_comment(repo, number, body, dry_run=dry_run)
+        maybe_close_issue(repo, number, dry_run=dry_run)
+        return {**base, "result": "closed"}
+    return {**base, "result": "noop-unknown-action"}
+
+
+def _evaluate_pr(
+    *,
+    repo: str,
+    number: int,
+    model: str,
+    current_time: dt.datetime,
+    judge: Any = None,
+) -> dict:
+    """Run ``review_gate`` in preview mode against ``current_time``.
+
+    Always uses ``close=False`` so the underlying review_gate never mutates
+    GitHub directly — the enactment script is the single source of mutations
+    and routes everything through the dry-run wrappers.
+    """
+    return review_gate(
+        repo=repo,
+        number=number,
+        close=False,
+        model=model,
+        judge=judge,
+        now=current_time,
+    )
+
+
+def _evaluate_issue(
+    *,
+    repo: str,
+    number: int,
+    model: str,
+    current_time: dt.datetime,
+    judge: Any = None,
+) -> dict:
+    """Run ``triage(kind='issue')`` in preview mode against ``current_time``.
+
+    ``triage`` doesn't accept a ``now`` parameter, so the time-travel patch is
+    applied here (the only place issues touch the wall clock is the
+    grace-warning age check inside ``seconds_since_latest_marker_comment``).
+    """
+    with _fake_now(current_time):
+        return triage(
+            repo=repo,
+            kind="issue",
+            number=number,
+            close=False,
+            model=model,
+            judge=judge,
+        )
+
+
+def _process_one(
+    *,
+    repo: str,
+    kind: str,
+    number: int,
+    model: str,
+    dry_run: bool,
+    current_time: dt.datetime,
+    judge: Any = None,
+) -> dict:
+    """Evaluate one PR/issue and apply the resulting mutation via the
+    maybe_* wrappers. Skip-cases (not-open, internal author, no key) short-
+    circuit before any LLM call."""
+    fetcher = fetch_pr if kind == "pr" else fetch_issue
+    item = fetcher(repo, number)
+
+    if (item.get("state") or "") != "open":
+        return {"kind": kind, "number": number, "result": "skip-not-open"}
+    if is_internal_contributor(item):
+        return {"kind": kind, "number": number, "result": "skip-internal-author"}
+
+    if kind == "pr":
+        result = _evaluate_pr(
+            repo=repo,
+            number=number,
+            model=model,
+            current_time=current_time,
+            judge=judge,
+        )
+        return _apply_pr_result(
+            repo=repo, number=number, result=result, dry_run=dry_run
+        )
+    result = _evaluate_issue(
+        repo=repo,
+        number=number,
+        model=model,
+        current_time=current_time,
+        judge=judge,
+    )
+    return _apply_issue_result(repo=repo, number=number, result=result, dry_run=dry_run)
+
+
+def _print_summary(results: list[dict], *, current_time: dt.datetime) -> None:
+    counts: dict[str, int] = {}
+    for r in results:
+        counts[r.get("result") or "unknown"] = (
+            counts.get(r.get("result") or "unknown", 0) + 1
+        )
+    print(f"\n=== enactment summary (clock={current_time.isoformat()}) ===")
+    for action in sorted(counts):
+        print(f"  {action:35s} {counts[action]}")
+    print(f"  total                                {len(results)}")
+
+
+def run(
+    *,
+    repo: str,
+    close: bool,
+    model: str,
+    current_time: dt.datetime,
+    kinds: tuple[str, ...] = ("pr", "issue"),
+    judge: Any = None,
+    only_numbers: dict[str, list[int]] | None = None,
+) -> list[dict]:
+    """Sweep ``repo`` and apply the enactment verdicts. Returns per-item results."""
+    dry_run = not close
+    mode_label = "DRY RUN" if dry_run else "REAL RUN"
+    print(
+        f"[{mode_label}] enactment sweep over {repo} at clock={current_time.isoformat()}"
+    )
+
+    results: list[dict] = []
+    for kind in kinds:
+        numbers = list((only_numbers or {}).get(kind, [])) or _list_open_numbers(
+            repo, kind
+        )
+        print(f"\n--- {kind}s: {len(numbers)} open ---")
+        for n in numbers:
+            try:
+                result = _process_one(
+                    repo=repo,
+                    kind=kind,
+                    number=n,
+                    model=model,
+                    dry_run=dry_run,
+                    current_time=current_time,
+                    judge=judge,
+                )
+            except Exception as exc:  # noqa: BLE001 - per-item errors don't abort
+                result = {
+                    "kind": kind,
+                    "number": n,
+                    "result": "error",
+                    "error": str(exc),
+                }
+                print(f"!! {kind}#{n}: {exc}", file=sys.stderr)
+            print(f"  {kind}#{n}: {result.get('result')}")
+            results.append(result)
+    _print_summary(results, current_time=current_time)
+    return results
+
+
+def main() -> int:
+    parser = argparse.ArgumentParser(description=__doc__)
+    parser.add_argument("--repo", required=True, help="owner/repo")
+    parser.add_argument(
+        "--close",
+        action="store_true",
+        help=(
+            "Actually post comments and close PRs/issues. Without this flag "
+            "the script runs in dry-run mode and only logs what it would do."
+        ),
+    )
+    parser.add_argument(
+        "--simulate-future-hours",
+        type=float,
+        default=None,
+        help=(
+            "Dry-run only: pretend the wall clock is N hours in the future "
+            f"(default when --close is NOT set: {DEFAULT_SIMULATE_HOURS}h+1s, "
+            "so you preview exactly what the next daily run will do). Set to "
+            "0 to preview at the current clock instead."
+        ),
+    )
+    parser.add_argument(
+        "--simulate-now",
+        type=str,
+        default=None,
+        help=(
+            "Dry-run only: pin the wall clock to this ISO-8601 timestamp "
+            "(e.g. '2026-06-02T09:00:00Z'). Overrides --simulate-future-hours."
+        ),
+    )
+    parser.add_argument(
+        "--model",
+        default=os.environ.get("TRIAGE_MODEL") or DEFAULT_MODEL,
+        help=f"Model for the rubric LLM judge (default: {DEFAULT_MODEL}).",
+    )
+    parser.add_argument(
+        "--kind",
+        choices=("pr", "issue", "both"),
+        default="both",
+        help="Restrict the sweep to PRs or issues only (default: both).",
+    )
+    parser.add_argument(
+        "--only-pr",
+        type=int,
+        action="append",
+        default=[],
+        help="Limit the PR sweep to these PR numbers (repeat for several).",
+    )
+    parser.add_argument(
+        "--only-issue",
+        type=int,
+        action="append",
+        default=[],
+        help="Limit the issue sweep to these issue numbers (repeat for several).",
+    )
+    args = parser.parse_args()
+
+    if args.close and (
+        args.simulate_future_hours is not None or args.simulate_now is not None
+    ):
+        parser.error(
+            "--simulate-future-hours / --simulate-now are only valid in dry-run "
+            "(omit --close to preview a future clock)."
+        )
+
+    if args.close and not os.environ.get("OPENAI_API_KEY"):
+        parser.error("OPENAI_API_KEY must be set for --close (real-run) mode.")
+
+    # Resolve the script's notion of "now".
+    if args.simulate_now is not None:
+        current_time = dt.datetime.fromisoformat(
+            args.simulate_now.replace("Z", "+00:00")
+        )
+        if current_time.tzinfo is None:
+            parser.error(
+                "--simulate-now must include a timezone offset "
+                "(e.g. '2026-06-02T09:00:00Z' or '2026-06-02T09:00:00+00:00')."
+            )
+    else:
+        offset = (
+            args.simulate_future_hours
+            if args.simulate_future_hours is not None
+            else (0 if args.close else DEFAULT_SIMULATE_HOURS + 1 / 3600)
+        )
+        current_time = dt.datetime.now(dt.timezone.utc) + dt.timedelta(hours=offset)
+
+    kinds: tuple[str, ...]
+    if args.kind == "pr":
+        kinds = ("pr",)
+    elif args.kind == "issue":
+        kinds = ("issue",)
+    else:
+        kinds = ("pr", "issue")
+
+    only: dict[str, list[int]] = {}
+    if args.only_pr:
+        only["pr"] = args.only_pr
+    if args.only_issue:
+        only["issue"] = args.only_issue
+
+    run(
+        repo=args.repo,
+        close=args.close,
+        model=args.model,
+        current_time=current_time,
+        kinds=kinds,
+        only_numbers=only or None,
+    )
+    return 0
+
+
+if __name__ == "__main__":
+    sys.exit(main())
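
Reviewer note: the clock resolution in `main()` above can be read as one pure function. The sketch below is a hedged restatement, not code from the PR. `resolve_clock` and its injectable `now` parameter are hypothetical names introduced for illustration, and `DEFAULT_SIMULATE_HOURS = 24` is an assumption inferred from the PR description ("now + 24h + 1s"); the constant's definition is not visible in this part of the diff.

```python
import datetime as dt

# Assumed value: the PR description says dry-run defaults to "now + 24h + 1s".
DEFAULT_SIMULATE_HOURS = 24


def resolve_clock(*, close, simulate_now=None, simulate_future_hours=None, now=None):
    """Return the timezone-aware "now" the sweep would use.

    Hypothetical helper mirroring the branch order in main():
    a pinned timestamp wins, then an explicit hour offset, then the default.
    """
    if close and (simulate_now is not None or simulate_future_hours is not None):
        raise ValueError("simulation flags are only valid in dry-run")

    if simulate_now is not None:
        # fromisoformat() on older Pythons rejects a trailing "Z", hence the swap.
        pinned = dt.datetime.fromisoformat(simulate_now.replace("Z", "+00:00"))
        if pinned.tzinfo is None:
            raise ValueError("--simulate-now must include a timezone offset")
        return pinned

    if now is None:
        now = dt.datetime.now(dt.timezone.utc)
    if simulate_future_hours is not None:
        offset = simulate_future_hours
    else:
        # Real runs use the wall clock; dry-runs peek just past the next daily run.
        offset = 0 if close else DEFAULT_SIMULATE_HOURS + 1 / 3600
    return now + dt.timedelta(hours=offset)
```

Passing `now` explicitly keeps the function deterministic in tests. The extra second in the dry-run default pushes the simulated clock strictly past a 24h grace boundary, so a warning posted "exactly now" would already count as expired in the preview.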

diff --git a/.github/workflows/close_low_quality_prs.yml b/.github/workflows/close_low_quality_prs.yml
--- a/.github/workflows/close_low_quality_prs.yml
+++ b/.github/workflows/close_low_quality_prs.yml
@@ -64,10 +64,10 @@
       - name: Run low-quality PR closer
         env:
           GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
-          # Scheduled runs are ALWAYS dry-run, even when AGENT_SHIN_ENABLED is
-          # "true", so the team can QA the closer's verdicts in step summaries
-          # before any contributor sees a PR closed. Real closures only happen
-          # on manual workflow_dispatch with close=true (and the variable set).
+          # Scheduled runs are ALWAYS dry-run, even after the rollout, so the
+          # team can QA the closer's verdicts in step summaries before any
+          # contributor sees a PR closed. Real closures only happen on manual
+          # workflow_dispatch with close=true (and the kill switch off).
           CLOSE_FLAG: ${{ github.event.inputs.close || 'false' }}
           AGENT_SHIN_ENABLED: ${{ vars.AGENT_SHIN_ENABLED }}
           MIN_AGE_DAYS: ${{ github.event.inputs.min_age_days || '0' }}
@@ -81,12 +81,15 @@
             --min-score "${MIN_SCORE}"
             --limit "${LIMIT}"
           )
-          if [ "${AGENT_SHIN_ENABLED:-false}" != "true" ]; then
-            echo "::notice::AGENT_SHIN_ENABLED is not 'true' -> forcing dry-run regardless of close input."
+          # Kill switch: AGENT_SHIN_ENABLED="false" forces dry-run. Default
+          # (unset / any other value) is "live", matching the post-enactment
+          # rollout state.
+          if [ "${AGENT_SHIN_ENABLED:-true}" = "false" ]; then
+            echo "::notice::Agent Shin kill switch is ON (AGENT_SHIN_ENABLED='false'). Forcing dry-run."
           elif [ "${GITHUB_EVENT_NAME:-}" = "workflow_dispatch" ] && [ "${CLOSE_FLAG}" = "true" ]; then
             ARGS+=(--close)
             echo "::notice::Running in close-on-fail mode."
           else
-            echo "::notice::AGENT_SHIN_ENABLED is true but this trigger is dry-run (scheduled event or close=false)."
+            echo "::notice::Kill switch off but this trigger is dry-run (scheduled event or close=false)."
           fi
           python3 .github/scripts/close_low_quality_prs.py "${ARGS[@]}"

diff --git a/.github/workflows/review_gate.yml b/.github/workflows/review_gate.yml
--- a/.github/workflows/review_gate.yml
+++ b/.github/workflows/review_gate.yml
@@ -79,7 +79,9 @@
           # enabled or a collaborator triggers it manually, so an external user
           # can't force paid LLM calls by churning a fork PR while the bot is
           # still in dry-run.
-          OPENAI_API_KEY: ${{ (vars.AGENT_SHIN_ENABLED == 'true' || github.event_name == 'workflow_dispatch') && secrets.OPENAI_API_KEY || '' }}
+          # Kill-switch semantics: only suppress the LLM key when the variable
+          # is literally "false". Unset / any other value -> bot is live.
+          OPENAI_API_KEY: ${{ (vars.AGENT_SHIN_ENABLED != 'false' || github.event_name == 'workflow_dispatch') && secrets.OPENAI_API_KEY || '' }}
           OPENAI_BASE_URL: ${{ vars.OPENAI_BASE_URL }}
           TRIAGE_MODEL: ${{ vars.TRIAGE_MODEL }}
           AGENT_SHIN_ENABLED: ${{ vars.AGENT_SHIN_ENABLED }}
@@ -92,20 +94,20 @@
           set -euo pipefail
           COMMON=(--review-gate --grace-days "${GRACE_DAYS}" --min-greptile-score "${MIN_GREPTILE_SCORE}")
 
-          # Fail-safe gating, identical philosophy to the Greptile closer:
-          #   - AGENT_SHIN_ENABLED must be the EXACT string "true" to act at all.
-          #   - A manual dispatch can still preview with close=false.
-          #   - Automatic triggers (PR events, schedule) act once enabled — that
-          #     is the whole point of the gate (re-tag / un-tag automatically).
+          # Kill-switch gating:
+          #   - AGENT_SHIN_ENABLED="false" (the exact string) forces dry-run.
+          #   - Any other value, including unset, leaves the bot live.
+          #   - A manual dispatch can still preview with close=false even
+          #     when the bot is live.
           DO_CLOSE="false"
-          if [ "${AGENT_SHIN_ENABLED:-false}" != "true" ]; then
-            echo "::notice::AGENT_SHIN_ENABLED is not 'true' -> dry-run (no labels/comments/closes)."
+          if [ "${AGENT_SHIN_ENABLED:-true}" = "false" ]; then
+            echo "::notice::Agent Shin kill switch is ON (AGENT_SHIN_ENABLED='false'). Forcing dry-run."
           elif [ "${GITHUB_EVENT_NAME:-}" = "workflow_dispatch" ] && [ "${CLOSE_FLAG:-false}" = "true" ]; then
             DO_CLOSE="true"
             echo "::notice::Manual run -> acting for real."
           elif [ "${GITHUB_EVENT_NAME:-}" != "workflow_dispatch" ]; then
             DO_CLOSE="true"
-            echo "::notice::Enabled automatic trigger (${GITHUB_EVENT_NAME:-}) -> acting for real."
+            echo "::notice::Automatic trigger (${GITHUB_EVENT_NAME:-}) -> acting for real (kill switch off)."
           else
             echo "::notice::Manual dispatch with close=false -> dry-run."
           fi
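
Reviewer note: the four-branch decision in the `review_gate.yml` shell step above can be modelled as a small Python function, which makes the branch order easier to check. This is a sketch under assumptions: `review_gate_mode` and the reason strings are hypothetical, and only the comparisons are taken from the diff.

```python
def review_gate_mode(agent_shin_enabled, event_name, close_flag="false"):
    """Return (do_close, reason), following the shell step's branch order.

    Hypothetical model; None or "" stands for an unset repo variable,
    which the shell expansion ${VAR:-default} replaces with the default.
    """
    enabled = agent_shin_enabled or "true"  # ${AGENT_SHIN_ENABLED:-true}
    flag = close_flag or "false"            # ${CLOSE_FLAG:-false}

    if enabled == "false":
        return False, "kill switch on"
    if event_name == "workflow_dispatch" and flag == "true":
        return True, "manual run"
    if event_name != "workflow_dispatch":
        return True, "automatic trigger"
    return False, "manual preview"
```

Under this model the kill switch takes precedence over every trigger, and a manual dispatch with `close=false` is the only way to get a dry-run while the bot is live.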

diff --git a/.github/workflows/triage_issue_with_llm.yml b/.github/workflows/triage_issue_with_llm.yml
--- a/.github/workflows/triage_issue_with_llm.yml
+++ b/.github/workflows/triage_issue_with_llm.yml
@@ -2,9 +2,9 @@
 
 # LLM-as-judge triage for external GitHub issues.
 #
-# DRY-RUN BY DEFAULT. See .github/workflows/triage_pr_with_llm.yml for the
-# enablement procedure — same repo variable (`AGENT_SHIN_ENABLED=true`)
-# unlocks the PR and issue triage flows together.
+# LIVE BY DEFAULT. See .github/workflows/triage_pr_with_llm.yml for the
+# kill-switch procedure — same repo variable (`AGENT_SHIN_ENABLED="false"`)
+# forces the PR and issue triage flows back into dry-run together.
 
 on:
   issues:
@@ -15,7 +15,7 @@
         description: "Issue number to triage manually."
         required: true
       close:
-        description: "If true and AGENT_SHIN_ENABLED=true, actually close on fail."
+        description: "If true (and AGENT_SHIN_ENABLED != 'false'), actually close on fail."
         required: false
         default: "false"
         type: choice
@@ -55,7 +55,9 @@
           # The Python script calls the LLM whenever this var is set
           # (regardless of `--close`); stripping `--close` doesn't suppress
           # the API call, only the destructive side effects.
-          OPENAI_API_KEY: ${{ (vars.AGENT_SHIN_ENABLED == 'true' || github.event_name == 'workflow_dispatch') && secrets.OPENAI_API_KEY || '' }}
+          # Kill-switch semantics: only suppress the LLM key when the variable
+          # is literally "false". Unset / any other value -> bot is live.
+          OPENAI_API_KEY: ${{ (vars.AGENT_SHIN_ENABLED != 'false' || github.event_name == 'workflow_dispatch') && secrets.OPENAI_API_KEY || '' }}
           OPENAI_BASE_URL: ${{ vars.OPENAI_BASE_URL }}
           TRIAGE_MODEL: ${{ vars.TRIAGE_MODEL }}
           AGENT_SHIN_ENABLED: ${{ vars.AGENT_SHIN_ENABLED }}
@@ -71,13 +73,16 @@
           # string, and a `!= "false"` check would treat "True", "yes",
           # "1", "TRUE", typos, and accidental whitespace as enabling
           # closure. Mirror the Greptile closer's `= "true"` pattern.
-          if [ "${AGENT_SHIN_ENABLED:-false}" = "true" ] && [ "${DISPATCH_CLOSE:-false}" = "true" ]; then
+          # Kill switch: AGENT_SHIN_ENABLED="false" forces dry-run even when
+          # the dispatch input asks for close. Default (unset / any other
+          # value) is "live", matching the post-enactment rollout state.
+          if [ "${AGENT_SHIN_ENABLED:-true}" != "false" ] && [ "${DISPATCH_CLOSE:-false}" = "true" ]; then
             ARGS+=(--close)
-            echo "::notice::Agent Shin is ENABLED and running in close-on-fail mode."
-          elif [ "${AGENT_SHIN_ENABLED:-false}" = "true" ]; then
-            echo "::notice::Agent Shin is ENABLED but this trigger is dry-run (workflow_dispatch close != 'true')."
+            echo "::notice::Agent Shin is LIVE — running in close-on-fail mode."
+          elif [ "${AGENT_SHIN_ENABLED:-true}" != "false" ]; then
+            echo "::notice::Agent Shin is LIVE but this trigger is dry-run (workflow_dispatch close != 'true')."
           else
-            echo "::notice::Agent Shin is in DRY-RUN mode (AGENT_SHIN_ENABLED is not 'true'). No comments will be posted; no issues will be closed."
+            echo "::notice::Agent Shin kill switch is ON (AGENT_SHIN_ENABLED='false'). Forcing dry-run."
           fi
           # Automatic `issues` events stay dry-run regardless until the team
           # explicitly invokes workflow_dispatch with close=true.

diff --git a/.github/workflows/triage_pr_with_llm.yml b/.github/workflows/triage_pr_with_llm.yml
--- a/.github/workflows/triage_pr_with_llm.yml
+++ b/.github/workflows/triage_pr_with_llm.yml
@@ -2,15 +2,15 @@
 
 # LLM-as-judge triage for external pull requests.
 #
-# DRY-RUN BY DEFAULT. Closures and public comments are gated on the repo
-# variable `AGENT_SHIN_ENABLED` being set to the string `"true"`. Until then,
-# every run only writes its verdict to the workflow step summary so the team
-# can QA the judge's decisions before flipping it on.
+# LIVE BY DEFAULT. Closures and public comments fire whenever the workflow is
+# triggered and the dispatch input asks for `close=true`. The repo variable
+# `AGENT_SHIN_ENABLED` acts as a kill switch — set it to the exact string
+# `"false"` (Settings > Secrets and variables > Actions > Variables) to force
+# every run back to dry-run. Any other value, including unset, leaves the bot
+# enabled.
 #
-# To enable for real:
-#   1. Add a repo secret `OPENAI_API_KEY` (or compatible).
-#   2. Set repo variable `AGENT_SHIN_ENABLED` to `true`
-#      (Settings > Secrets and variables > Actions > Variables).
+# Required setup:
+#   - Repo secret `OPENAI_API_KEY` (or compatible) for the LLM judge.
 #
 # We use `pull_request_target` so the workflow has access to repo secrets
 # and runs against PRs from forks. We never check out fork code — only read
@@ -25,7 +25,7 @@
         description: "PR number to triage manually."
         required: true
       close:
-        description: "If true and AGENT_SHIN_ENABLED=true, actually close on fail."
+        description: "If true (and AGENT_SHIN_ENABLED != 'false'), actually close on fail."
         required: false
         default: "false"
         type: choice
@@ -66,7 +66,11 @@
           # The Python script calls the LLM whenever this var is set
           # (regardless of `--close`); stripping `--close` doesn't suppress
           # the API call, only the destructive side effects.
-          OPENAI_API_KEY: ${{ (vars.AGENT_SHIN_ENABLED == 'true' || github.event_name == 'workflow_dispatch') && secrets.OPENAI_API_KEY || '' }}
+          # Kill-switch semantics: only suppress the LLM key when the variable
+          # is literally "false". Unset / any other value -> bot is live, key
+          # is exposed. Manual dispatch always gets the key so a collaborator
+          # can force-run even with the kill switch on.
+          OPENAI_API_KEY: ${{ (vars.AGENT_SHIN_ENABLED != 'false' || github.event_name == 'workflow_dispatch') && secrets.OPENAI_API_KEY || '' }}
           OPENAI_BASE_URL: ${{ vars.OPENAI_BASE_URL }}
           TRIAGE_MODEL: ${{ vars.TRIAGE_MODEL }}
           AGENT_SHIN_ENABLED: ${{ vars.AGENT_SHIN_ENABLED }}
@@ -82,13 +86,16 @@
           # string, and a `!= "false"` check would treat "True", "yes",
           # "1", "TRUE", typos, and accidental whitespace as enabling
           # closure. Mirror the Greptile closer's `= "true"` pattern.
-          if [ "${AGENT_SHIN_ENABLED:-false}" = "true" ] && [ "${DISPATCH_CLOSE:-false}" = "true" ]; then
+          # Kill switch: AGENT_SHIN_ENABLED="false" forces dry-run even when
+          # the dispatch input asks for close. The default (unset / any other
+          # value) is "live", matching the post-enactment rollout state.
+          if [ "${AGENT_SHIN_ENABLED:-true}" != "false" ] && [ "${DISPATCH_CLOSE:-false}" = "true" ]; then
             ARGS+=(--close)
-            echo "::notice::Agent Shin is ENABLED and running in close-on-fail mode."
-          elif [ "${AGENT_SHIN_ENABLED:-false}" = "true" ]; then
-            echo "::notice::Agent Shin is ENABLED but this trigger is dry-run (workflow_dispatch close != 'true' or scheduled event)."
+            echo "::notice::Agent Shin is LIVE — running in close-on-fail mode."
+          elif [ "${AGENT_SHIN_ENABLED:-true}" != "false" ]; then
+            echo "::notice::Agent Shin is LIVE but this trigger is dry-run (workflow_dispatch close != 'true' or scheduled event)."
           else
-            echo "::notice::Agent Shin is in DRY-RUN mode (AGENT_SHIN_ENABLED is not 'true'). No comments will be posted; no PRs will be closed."
+            echo "::notice::Agent Shin kill switch is ON (AGENT_SHIN_ENABLED='false'). Forcing dry-run."
           fi
           # On the scheduled/automatic pull_request_target trigger we default to
           # dry-run regardless, so the team can review verdicts in the step

diff --git a/.github/workflows/triage_reconsider.yml b/.github/workflows/triage_reconsider.yml
--- a/.github/workflows/triage_reconsider.yml
+++ b/.github/workflows/triage_reconsider.yml
@@ -15,11 +15,11 @@
 # (which loses the original PR's history). The bot, on the other hand,
 # has write access via GH_TOKEN and can reopen on their behalf.
 #
-# DRY-RUN BY DEFAULT — gated on `vars.AGENT_SHIN_ENABLED == 'true'` just
-# like the other Agent Shin workflows. The workflow also gates on the
-# commenter being either the PR/issue author or an internal collaborator
-# (OWNER/MEMBER/COLLABORATOR) so random commenters cannot DOS the LLM
-# judge or force a reopen.
+# LIVE BY DEFAULT — disabled only when `vars.AGENT_SHIN_ENABLED == 'false'`
+# (the kill switch shared with the other Agent Shin workflows). The workflow
+# also gates on the commenter being either the PR/issue author or an internal
+# collaborator (OWNER/MEMBER/COLLABORATOR) so random commenters cannot DOS the
+# LLM judge or force a reopen.
 
 on:
   issue_comment:
@@ -108,23 +108,20 @@
             ARGS=(--repo "${{ github.repository }}" --issue "${NUMBER}" --reconsider)
           fi
           # Reconsider's destructive actions (post comment + reopen) are
-          # gated on `--close`, mirroring the regular triage workflows.
-          # When AGENT_SHIN_ENABLED is not the EXACT string "true", we
-          # still run the script so its verdict + would-X action lands in
-          # the step summary for QA — but without `--close`, the script
-          # returns `would-reopen` / `would-reconsider-still-failing`
-          # instead of touching GitHub state.
+          # gated on `--close`. The kill switch is the shared
+          # AGENT_SHIN_ENABLED variable: setting it to the literal string
+          # "false" forces reconsider back to dry-run (the script still
+          # runs so the would-X verdict lands in the step summary for QA,
+          # but without `--close` no GitHub state changes).
           #
-          # Use the positive `= "true"` gate (not `!= "true" -> exit`) so
-          # the workflow guardrails in
-          # tests/test_litellm/test_github_triage_workflows.py see the
-          # canonical fail-safe enable pattern. Unknown values like
-          # "True", "yes", "1", or typos fall through to the dry-run
-          # branch, which is the safe default.
-          if [ "${AGENT_SHIN_ENABLED:-false}" = "true" ]; then
+          # Negative `!= "false"` against the `:-true` default keeps the
+          # kill-switch semantics symmetric with the other triage
+          # workflows — unknown values (typos, "True", "1") leave the bot
+          # live, matching the post-enactment live-by-default policy.
+          if [ "${AGENT_SHIN_ENABLED:-true}" != "false" ]; then
             ARGS+=(--close)
-            echo "::notice::Agent Shin reconsider ENABLED — running real triage (close=true)."
+            echo "::notice::Agent Shin reconsider is LIVE — running real triage (close=true)."
           else
-            echo "::notice::AGENT_SHIN_ENABLED is not 'true' -> reconsider stays in dry-run (no comment, no reopen)."
+            echo "::notice::Agent Shin kill switch is ON (AGENT_SHIN_ENABLED='false'). Reconsider stays in dry-run."
           fi
           python3 .github/scripts/triage_with_llm.py "${ARGS[@]}"
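
Reviewer note: the gate inversion changes which side unknown values fall on. The sketch below compares the old and new shell tests in Python. The function names are hypothetical; the two comparisons are copied from the removed and added lines of the `triage_reconsider.yml` hunk above.

```python
def shell_default(value, default):
    """Mimic the shell expansion ${VAR:-default}: unset or empty falls back."""
    return default if value is None or value == "" else value


def old_gate_is_live(value):
    # Pre-rollout test: [ "${AGENT_SHIN_ENABLED:-false}" = "true" ]
    return shell_default(value, "false") == "true"


def new_gate_is_live(value):
    # Post-rollout test: [ "${AGENT_SHIN_ENABLED:-true}" != "false" ]
    return shell_default(value, "true") != "false"


if __name__ == "__main__":
    # Print a small truth table for a few representative values.
    for value in (None, "", "true", "false", "False", " false", "1"):
        print(f"{value!r:10} old={old_gate_is_live(value)!s:5} new={new_gate_is_live(value)}")
```

Only the exact lowercase string `false` trips the new kill switch. Values such as `False` or ` false` (with stray whitespace) leave the bot live, so whoever flips the repo variable during an incident needs to type it exactly.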

diff --git a/.github/workflows/triage_rollout_enact.yml b/.github/workflows/triage_rollout_enact.yml
new file mode 100644
--- /dev/null
+++ b/.github/workflows/triage_rollout_enact.yml
@@ -1,0 +1,84 @@
+name: Agent Shin — rollout enactment (one-shot)
+
+# Fires the day-7 enactment sweep: closes any open external PR/issue that's
+# still failing the rubric 7 days after the heads-up (because the contributor
+# didn't update the description), and applies steady-state actions
+# (ready-for-review tag, in-grace warnings) to everything else. From the
+# merge of this workflow onward, the existing daily/cron triage workflows
+# go live for real (the same PR inverts AGENT_SHIN_ENABLED into a kill switch).
+#
+# Thin shell over `.github/scripts/triage_rollout_enact.py`. The dry-run
+# preview path is the SAME code with --close stripped, so a local preview
+# (with --simulate-future-hours 24 to peek at tomorrow's cron run) is a
+# high-fidelity preview of what the workflow will actually do.
... diff truncated: showing 800 of 1362 lines

Reviewed by Cursor Bugbot for commit 164dbb7.
