Remove CI scanner token budget limit #35818
Raise the CI failure scanner effective-token budget from the default 25M to 50M for both main and net11 scanners so scheduled scans can complete after exceeding the previous cap. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
🚀 Dogfood this PR with:
curl -fsSL https://raw.githubusercontent.com/dotnet/maui/main/eng/scripts/get-maui-pr.sh | bash -s -- 35818
Or
iex "& { $(irm https://raw.githubusercontent.com/dotnet/maui/main/eng/scripts/get-maui-pr.ps1) } 35818"
PureWeen left a comment
PR is clean as a focused configuration change. 3 independent reviewers analyzed it via adversarial consensus; methodology and verdicts below.
Methodology
3 independent reviewers (different model families) reviewed in parallel. 2 findings reached 2/3 consensus and are surfaced inline as non-blocking 💡 suggestions. 2 additional findings raised by a single reviewer went through a dispute round and were rejected by both other reviewers with concrete reasoning (see below).
Verified clean
- `50M` is valid gh-aw syntax. `gh aw compile` v0.77.5 resolved it to `50000000` in all three required lock-file locations (main AWF `apiProxy.maxEffectiveTokens`, `GH_AW_MAX_EFFECTIVE_TOKENS` env var, threat-detection AWF config). Both lock files show consistent values everywhere.
- `frontmatter_hash` updated as expected; `body_hash` unchanged in both lock files, which confirms that only frontmatter was edited and the lock files were not hand-touched.
- Heredoc identifier regeneration (`GH_AW_PROMPT_*`, `GH_AW_SAFE_OUTPUTS_CONFIG_*`, `GH_AW_MCP_CONFIG_*`) is normal compiler artifact behavior on any frontmatter change, not a defect.
- The two scanners (main, net11.0) received symmetric treatment, matching their otherwise-symmetric configuration. They differ only in `checkout.ref`, concurrency group, and label prefix.
- Right knob raised: the AWF `apiProxy.maxEffectiveTokens` is what gates LLM calls. The author's audit numbers (25,137,681 and 25,088,941 effective tokens) sit right at the prior 25M cap, confirming the firewall was the binding limit before this change.
- No new permissions granted, no new capabilities added; safe-output caps (`create-issue max: 5`, label-locked to `[ci-scan]`) unchanged. AWF firewall remains the binding guardrail at a higher ceiling.
- Only these two workflows use `max-effective-tokens` in the repo; no other consumers affected.
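The consistency check in the first bullet can be sketched in a few lines of Python. This is only an illustration: the suffix table (`K` = 10^3, `M` = 10^6) is an assumed convention rather than gh-aw's actual parser, and the dictionary keys are illustrative labels for the three lock-file locations named above.

```python
# Hypothetical helper: expand a shorthand budget such as "50M" into an integer.
# The suffix table is an assumption for illustration, not gh-aw's real parser.
SUFFIXES = {"K": 10**3, "M": 10**6}


def expand_budget(value: str) -> int:
    """Expand '50M' to 50000000; plain integers (including -1) pass through."""
    text = value.strip().upper()
    if text and text[-1] in SUFFIXES:
        return int(text[:-1]) * SUFFIXES[text[-1]]
    return int(text)


def all_locations_agree(expected: int, compiled: dict[str, int]) -> bool:
    """True when every compiled location carries the expected value."""
    return all(value == expected for value in compiled.values())


# Illustrative labels for the three locations named in the review.
compiled_values = {
    "apiProxy.maxEffectiveTokens": 50_000_000,
    "GH_AW_MAX_EFFECTIVE_TOKENS": 50_000_000,
    "threat-detection AWF config": 50_000_000,
}

print(expand_budget("50M"))
print(all_locations_agree(expand_budget("50M"), compiled_values))
```

A check like this would flag a hand-edited lock file in which one of the three values drifted from the frontmatter setting.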
Surfaced inline (2/3 consensus, 💡 non-blocking)
See inline comment on ci-status-main.md (applies equally to ci-status-net11.md):
- Consider adding `max-daily-effective-tokens` as a 24h backstop.
- Follow-up opportunity: pre-digest CI logs in deterministic `steps:` to reduce baseline agent token burn.
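The case for a daily backstop rests on simple arithmetic. The sketch below assumes the figures discussed in this review (a run every 12 hours, two scanners, and a 50M per-run cap) and counts scheduled runs only; the function and constant names are illustrative.

```python
RUNS_PER_DAY = 24 // 12      # schedule: every 12h
SCANNERS = 2                 # main and net11.0
PER_RUN_CAP = 50_000_000     # max-effective-tokens: 50M


def daily_worst_case(runs_per_day: int, workflows: int, per_run_cap: int) -> int:
    """Upper bound on effective tokens per day if every run hits its cap."""
    return runs_per_day * workflows * per_run_cap


per_workflow = daily_worst_case(RUNS_PER_DAY, 1, PER_RUN_CAP)
across_both = daily_worst_case(RUNS_PER_DAY, SCANNERS, PER_RUN_CAP)

print(f"per workflow: {per_workflow:,}")  # 100,000,000
print(f"across both:  {across_both:,}")   # 200,000,000
```

Under these assumptions a per-workflow daily cap of 100M equals two full scheduled runs, so it would mainly bound extra manual `workflow_dispatch` runs rather than the schedule itself.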
Discarded after dispute round (1/3 → 0/2 in follow-up)
- ⚠️ "`timeout-minutes: 60` will be hit before 50M tokens are consumed": Rejected. The author's audit data (25.1M / 25.0M effective tokens "before failing") shows the prior runs ended at the token cap, not at the timeout. If 60 min had been the binding cutoff at 25M, the runs would have stopped at an arbitrary token count, not precisely at the 25M ceiling. At realistic effective-token throughput for sonnet-4.6 with tool round-trips, 50M lands well inside 60 minutes; `timeout-minutes` and `max-effective-tokens` are designed as independent belt-and-suspenders guardrails.
- ⚠️ "Missing `min-integrity: approved` widens prompt-injection window": Rejected. The scanner ingests its CI failure data via direct `curl` from bash against `dev.azure.com` / `helix.dot.net` / `*.blob.core.windows.net`, a path that does NOT traverse the GitHub MCP integrity filter that `min-integrity` would gate. Triggers (`schedule: every 12h`, `workflow_dispatch`) carry no external untrusted payload, and the lock files show `determine-automatic-lockdown` already wires runtime integrity guards. Doubling the token budget doesn't meaningfully widen the prompt-injection surface (the agent could already ingest 25M of untrusted log content; 50M is incremental headroom). Defense-in-depth at most, out of scope here.
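The "ended at the token cap" argument can be checked numerically. This short Python sketch uses the two audit totals quoted in the PR description and computes how far past the prior 25M cap each run finished; the helper name is illustrative.

```python
# Effective-token totals reported by `gh aw audit` in the PR description.
PRIOR_CAP = 25_000_000
audited_runs = {
    "CI Failure Scanner": 25_137_681,
    "CI Failure Scanner (net11.0)": 25_088_941,
}


def overshoot_percent(used: int, cap: int) -> float:
    """How far past the cap a run ended, as a percentage of the cap."""
    return (used - cap) / cap * 100


for name, used in audited_runs.items():
    print(f"{name}: {overshoot_percent(used, PRIOR_CAP):.2f}% over the cap")
```

Both runs finish within about 0.6% of the cap (roughly 0.55% and 0.36%), which is consistent with the token limit, rather than the 60-minute timeout, being what stopped them.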
Test coverage
N/A — configuration-only change. Author documented gh aw compile, gh aw validate, and git diff --check ran clean locally.
Prior reviews
None. Only an automated dogfood-pr bot comment exists; not duplicated.
report-as-issue: false
timeout-minutes: 60
max-effective-tokens: 50M
💡 Cost / Performance: Two non-blocking suggestions to pair with this bump (same applies to ci-status-net11.md):
- Add a daily backstop. Consider also setting `max-daily-effective-tokens: 100M` (or similar) as a 24h-per-workflow ceiling. With `schedule: every 12h` × 2 branches × 50M/run, the unbounded daily total could reach ~200M effective tokens/day across both scanners before manual intervention; for example, a runaway loop would burn the full 50M every 12h with no daily cap to stop it. The per-run guardrail you preserved here is the right call; a daily cap would complement it.
- Worth a follow-up: pre-digest CI logs in `steps:`. Prior runs hit ~25M effective tokens before failing, which is large for a CI failure scan. The likely root cause is the agent doing iterative `curl`/`grep`/`cat` round-trips against full AzDO/Helix log payloads inside the agent loop. A future change could move bulk log fetch + filter into deterministic pre-agent `steps:` (e.g., download and pre-filter `[FAIL]`/error lines into a slim artifact under `/tmp/gh-aw/agent/`) so the agent reads pre-digested input instead of streaming raw logs. The 50M bump is a fine short-term unblock; root-causing is out of scope for this PR.

Flagged by 2/3 reviewers.
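As a rough illustration of the pre-digest idea, the Python sketch below keeps only failure-looking lines from a raw log before anything reaches the agent. The regular expression, the line cap, and the sample log are all assumptions made up for this example; they are not taken from the scanner workflows or from real AzDO/Helix output.

```python
import re

# Assumed failure markers; the review mentions "[FAIL]" and error lines.
FAILURE_PATTERN = re.compile(r"\[FAIL\]|\berror\b", re.IGNORECASE)


def prefilter_log(raw_log: str, max_lines: int = 200) -> list[str]:
    """Keep only lines that look like failures, capped at max_lines."""
    kept = [line for line in raw_log.splitlines() if FAILURE_PATTERN.search(line)]
    return kept[:max_lines]


# Made-up sample log, for illustration only.
sample = """\
[PASS] Button_Click_Works
[FAIL] Label_Renders_Text
  error CS1002: ; expected
Build succeeded with warnings
"""

for line in prefilter_log(sample):
    print(line)
```

Running a filter like this in a deterministic step and handing the agent only the slim result is one way to cut the token cost of repeated raw-log round-trips; a real implementation would likely also keep some surrounding context lines.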
Disable the effective-token cap for both CI failure scanner workflows by setting max-effective-tokens to -1 and regenerating the gh-aw lock files. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Quick follow-up after comparing against the upstream scanner in
Upstream comparison
Confirmed by grepping the compiled
What this reinforces
💡 Root-cause-vs-symptom suggestion (from main review) has direct upstream precedent. Looking at
One small additional 💡
Consider matching upstream's
Nothing here changes the bottom line: this PR is still a clean, internally consistent configuration bump and the lock-file regeneration is correct.
> [!NOTE]
> Are you waiting for the changes in this PR to be merged?
> It would be very helpful if you could [test the resulting artifacts](https://github.com/dotnet/maui/wiki/Testing-PR-Builds) from this PR and let us know in a comment if this change resolves your issue. Thank you!

## Overview

Fixes dotnet#35812 by disabling the effective-token cap for both CI failure scanner gh-aw workflows.

The failed scheduled scanner runs were using the default compiled firewall budget of 25M effective tokens and exhausted it before the scanner could finish or emit safe outputs. Since these scanners need to inspect multiple MAUI CI pipelines and Helix logs in one scheduled sweep, this PR sets `max-effective-tokens: -1` for both the `main` and `net11.0` scanners instead of applying another fixed cap.

## Local token check

- `gh aw audit 27177841477 --json`: `CI Failure Scanner` consumed 25,137,681 effective tokens before failing.
- `gh aw audit 27177841662 --json`: `CI Failure Scanner (net11.0)` consumed 25,088,941 effective tokens before failing.
- `max-effective-tokens: -1` compiles successfully with gh-aw v0.77.5 and removes the compiled firewall `maxEffectiveTokens` cap.

## Validation

- `gh aw compile ci-status-main ci-status-net11 --no-check-update --approve`
- `gh aw validate ci-status-main ci-status-net11 --no-check-update`
- `git diff --check`

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
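To make the effect of the `-1` setting concrete, here is a minimal Python sketch of how a sentinel value can disable a budget check. It is a hypothetical model for illustration only, not the AWF firewall's implementation; the function and constant names are assumptions.

```python
UNLIMITED = -1  # sentinel matching the PR's `max-effective-tokens: -1`


def budget_exhausted(used: int, cap: int) -> bool:
    """Hypothetical check: a cap of -1 means no limit is enforced."""
    if cap == UNLIMITED:
        return False
    return used >= cap


# The audited run exceeds the old 25M default but not an unlimited budget.
print(budget_exhausted(25_137_681, 25_000_000))  # True
print(budget_exhausted(25_137_681, UNLIMITED))   # False
```

With the cap removed, `timeout-minutes: 60` is the remaining per-run limit visible in this conversation.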