Remove CI scanner token budget limit #35818

Merged
PureWeen merged 2 commits into main from fix/ci-scanner-token-budget-35812 on Jun 9, 2026
kubaflo commented Jun 9, 2026

Contributor

Note

Are you waiting for the changes in this PR to be merged?
It would be very helpful if you could test the resulting artifacts from this PR and let us know in a comment if this change resolves your issue. Thank you!

Overview

Fixes #35812 by disabling the effective-token cap for both CI failure scanner gh-aw workflows.

The failed scheduled scanner runs were using the default compiled firewall budget of 25M effective tokens and exhausted it before the scanner could finish or emit safe outputs. Since these scanners need to inspect multiple MAUI CI pipelines and Helix logs in one scheduled sweep, this PR sets max-effective-tokens: -1 for both the main and net11.0 scanners instead of applying another fixed cap.

Local token check

  • gh aw audit 27177841477 --json: CI Failure Scanner consumed 25,137,681 effective tokens before failing.
  • gh aw audit 27177841662 --json: CI Failure Scanner (net11.0) consumed 25,088,941 effective tokens before failing.
  • max-effective-tokens: -1 compiles successfully with gh-aw v0.77.5 and removes the compiled firewall maxEffectiveTokens cap.
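
The two audit figures can be checked against the 25M default directly. A minimal sketch in Python, using only the numbers quoted above; the names `DEFAULT_CAP`, `consumed`, and `overshoot` are illustrative and not part of gh-aw:

```python
# Effective-token totals reported by `gh aw audit` for the two failed runs.
DEFAULT_CAP = 25_000_000  # default compiled firewall budget (25M)

consumed = {
    "CI Failure Scanner": 25_137_681,
    "CI Failure Scanner (net11.0)": 25_088_941,
}


def overshoot(tokens: int, cap: int = DEFAULT_CAP) -> int:
    """Return how many effective tokens a run used beyond the cap."""
    return tokens - cap


for name, tokens in consumed.items():
    over = overshoot(tokens)
    print(f"{name}: {over:,} over the cap ({over / DEFAULT_CAP:.2%})")
```

Both runs stopped within about 0.6% of the cap, which is consistent with the firewall budget, rather than some other limit, ending them.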

Validation

  • gh aw compile ci-status-main ci-status-net11 --no-check-update --approve
  • gh aw validate ci-status-main ci-status-net11 --no-check-update
  • git diff --check

Raise the CI failure scanner effective-token budget from the default 25M to 50M for both main and net11 scanners so scheduled scans can complete after exceeding the previous cap.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

github-actions Bot commented Jun 9, 2026

Contributor

🚀 Dogfood this PR with:

⚠️ WARNING: Do not do this without first carefully reviewing the code of this PR to satisfy yourself it is safe.

curl -fsSL https://raw.githubusercontent.com/dotnet/maui/main/eng/scripts/get-maui-pr.sh | bash -s -- 35818

Or

  • Run remotely in PowerShell:
iex "& { $(irm https://raw.githubusercontent.com/dotnet/maui/main/eng/scripts/get-maui-pr.ps1) } 35818"

github-actions Bot added the area-infrastructure (CI, Maestro / Coherency, upstream dependencies/versions) label on Jun 9, 2026

PureWeen left a comment

Member

PR is clean as a focused configuration change. 3 independent reviewers analyzed it via adversarial consensus; methodology and verdicts below.

Methodology

3 independent reviewers (different model families) reviewed in parallel. 2 findings reached 2/3 consensus and are surfaced inline as non-blocking 💡 suggestions. 2 additional findings raised by a single reviewer went through a dispute round and were rejected by both other reviewers with concrete reasoning (see below).

Verified clean

  • 50M is valid gh-aw syntax. gh aw compile v0.77.5 resolved it to 50000000 in all three required lock-file locations (main AWF apiProxy.maxEffectiveTokens, GH_AW_MAX_EFFECTIVE_TOKENS env var, threat-detection AWF config). Both lock files show consistent values everywhere.
  • frontmatter_hash updated as expected; body_hash unchanged in both lock files — confirms only frontmatter was edited and the lock files were not hand-touched.
  • Heredoc identifier regeneration (GH_AW_PROMPT_*, GH_AW_SAFE_OUTPUTS_CONFIG_*, GH_AW_MCP_CONFIG_*) is normal compiler artifact behavior on any frontmatter change, not a defect.
  • The two scanners (main, net11.0) received symmetric treatment, matching their otherwise-symmetric configuration. They differ only in checkout.ref, concurrency group, and label prefix.
  • Right knob raised: the AWF apiProxy.maxEffectiveTokens is what gates LLM calls. The author's audit numbers (25,137,681 and 25,088,941 effective tokens) sit right at the prior 25M cap, confirming the firewall was the binding limit before this change.
  • No new permissions granted, no new capabilities added; safe-output caps (create-issue max: 5, label-locked to [ci-scan]) unchanged. AWF firewall remains the binding guardrail at a higher ceiling.
  • Only these two workflows use max-effective-tokens in the repo; no other consumers affected.

Surfaced inline (2/3 consensus, 💡 non-blocking)

See inline comment on ci-status-main.md (applies equally to ci-status-net11.md):

  1. Consider adding max-daily-effective-tokens as a 24h backstop.
  2. Follow-up opportunity: pre-digest CI logs in deterministic steps: to reduce baseline agent token burn.

Discarded after dispute round (1/3 → 0/2 in follow-up)

  • ⚠️ "timeout-minutes: 60 will be hit before 50M tokens are consumed": Rejected. The author's audit data (25.14M / 25.09M effective tokens "before failing") shows the prior runs ended at the token cap, not at the timeout. If 60 min had been the binding cutoff at 25M, the runs would have stopped at an arbitrary token count, not right at the 25M ceiling. At realistic effective-token throughput for sonnet-4.6 with tool round-trips, 50M lands well inside 60 minutes; timeout-minutes and max-effective-tokens are designed as independent belt-and-suspenders guardrails.
  • ⚠️ "Missing min-integrity: approved widens prompt-injection window": Rejected. The scanner ingests its CI failure data via direct curl from bash against dev.azure.com / helix.dot.net / *.blob.core.windows.net, a path that does NOT traverse the GitHub MCP integrity filter that min-integrity would gate. Triggers (schedule: every 12h, workflow_dispatch) carry no external untrusted payload, and the lock files show determine-automatic-lockdown already wires runtime integrity guards. Doubling the token budget doesn't meaningfully widen the prompt-injection surface (the agent could already ingest 25M of untrusted log content; 50M is incremental headroom). Defense-in-depth at most, out of scope here.

Test coverage

N/A — configuration-only change. Author documented gh aw compile, gh aw validate, and git diff --check ran clean locally.

Prior reviews

None. Only an automated dogfood-pr bot comment exists; not duplicated.

Comment thread .github/workflows/ci-status-main.md Outdated
report-as-issue: false

timeout-minutes: 60
max-effective-tokens: 50M

Member

💡 Cost / Performance — Two non-blocking suggestions to pair with this bump (same applies to ci-status-net11.md):

  1. Add a daily backstop. Consider also setting max-daily-effective-tokens: 100M (or similar) as a 24h-per-workflow ceiling. With schedule: every 12h × 2 branches × 50M/run, the unbounded daily total could reach ~200M effective tokens/day across both scanners before manual intervention — e.g., a runaway loop would burn the full 50M every 12h with no daily cap to stop it. The per-run guardrail you preserved here is the right call; a daily cap would complement it.

  2. Worth a follow-up: pre-digest CI logs in steps:. Prior runs hit ~25M effective tokens before failing, which is large for a CI failure scan. Likely root cause is the agent doing iterative curl/grep/cat round-trips against full AzDO/Helix log payloads inside the agent loop. A future change could move bulk log fetch + filter into deterministic pre-agent steps: (e.g., download and pre-filter [FAIL]/error lines into a slim artifact under /tmp/gh-aw/agent/) so the agent reads pre-digested input instead of streaming raw logs. The 50M bump is a fine short-term unblock — root-cause is out of scope for this PR.
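
The ~200M/day figure in item 1 follows from the schedule and the per-run cap. A small Python sketch of that arithmetic, assuming every scheduled run exhausts its cap; the function and constant names are illustrative only:

```python
# Worst-case daily effective-token spend, using the figures from item 1 above.
HOURS_PER_DAY = 24
SCHEDULE_INTERVAL_HOURS = 12  # schedule: every 12h
PER_RUN_CAP = 50_000_000      # max-effective-tokens: 50M
SCANNERS = 2                  # main and net11.0


def worst_case_daily(per_run_cap: int, interval_hours: int, scanners: int = 1) -> int:
    """Upper bound on scheduled spend per day if every run exhausts its cap."""
    runs_per_day = HOURS_PER_DAY // interval_hours
    return per_run_cap * runs_per_day * scanners


per_workflow = worst_case_daily(PER_RUN_CAP, SCHEDULE_INTERVAL_HOURS)
combined = worst_case_daily(PER_RUN_CAP, SCHEDULE_INTERVAL_HOURS, SCANNERS)
print(f"per workflow:  {per_workflow:,}")  # 100,000,000
print(f"both scanners: {combined:,}")      # 200,000,000
```

This counts scheduled runs only; manual workflow_dispatch runs would add to the total. Note that the per-workflow worst case (100M) equals the suggested daily cap, so a 100M daily limit would mainly constrain runs beyond the two scheduled ones.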

Flagged by 2/3 reviewers.
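
The pre-digest idea in item 2 could look roughly like the following Python sketch. It is not taken from the scanner workflow: the function name, the line limit, and the sample log lines are hypothetical, and the match rule ("[FAIL]" or the word "error") is only the heuristic named in the suggestion. Real AzDO or Helix logs may need different patterns.

```python
import re
from typing import Iterable, List

# Assumed markers: "[FAIL]" and the standalone word "error" (any case).
FAILURE_PATTERN = re.compile(r"\[FAIL\]|\berror\b", re.IGNORECASE)


def prefilter_log(lines: Iterable[str], max_lines: int = 500) -> List[str]:
    """Keep only lines that look like failures, stopping after max_lines matches."""
    kept: List[str] = []
    for line in lines:
        if FAILURE_PATTERN.search(line):
            kept.append(line.rstrip("\n"))
            if len(kept) >= max_lines:
                break
    return kept


# Made-up sample log, for illustration only.
sample = [
    "Starting test run\n",
    "[PASS] LayoutTests.Basic\n",
    "[FAIL] ButtonTests.Click\n",
    "error CS1002: ; expected\n",
    "Finished in 12s\n",
]
print(prefilter_log(sample))
```

Run in a deterministic pre-agent step, a filter of this kind would hand the agent a few matching lines instead of the full log payload.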

Comment thread .github/workflows/ci-status-net11.md Outdated
Disable the effective-token cap for both CI failure scanner workflows by setting max-effective-tokens to -1 and regenerating the gh-aw lock files.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
PureWeen commented Jun 9, 2026

Member

Quick follow-up after comparing against the upstream scanner in dotnet/runtime (where this workflow pattern originated). The picture there reinforces and slightly extends the 💡 suggestions from the main review.

Upstream comparison

| Workflow | timeout-minutes | max-effective-tokens |
| --- | --- | --- |
| dotnet/runtime .github/workflows/ci-failure-scan.md | 90 | (default 25M) |
| dotnet/runtime .github/workflows/ci-failure-scan-feedback.md | 60 | (default 25M) |
| dotnet/maui ci-status-main.md (this PR) | 60 | 50M |
| dotnet/maui ci-status-net11.md (this PR) | 60 | 50M |

Confirmed by grepping the compiled ci-failure-scan.lock.yml — runtime has no GH_AW_MAX_EFFECTIVE_TOKENS override anywhere, so they're still on the compiled-in default 25M. None of this blocks the PR; sharing because it's useful context.
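
A check like the grep described above can be sketched in Python. The key names come from this review (GH_AW_MAX_EFFECTIVE_TOKENS, maxEffectiveTokens); the helper name and both lock-file excerpts are hypothetical and are not copied from either repository.

```python
from typing import List

# Keys this review associates with the compiled token cap.
OVERRIDE_KEYS = ("GH_AW_MAX_EFFECTIVE_TOKENS", "maxEffectiveTokens")


def find_token_overrides(lock_text: str) -> List[str]:
    """Return the stripped lines of a lock file that mention a token-cap key."""
    return [
        line.strip()
        for line in lock_text.splitlines()
        if any(key in line for key in OVERRIDE_KEYS)
    ]


# Hypothetical excerpts; the real lock-file layout may differ.
with_override = 'env:\n  GH_AW_MAX_EFFECTIVE_TOKENS: "50000000"\n'
without_override = "env:\n  SOME_OTHER_SETTING: 1\n"

print(find_token_overrides(with_override))
print(find_token_overrides(without_override))
```

An empty result corresponds to the dotnet/runtime case described above, where no override is present and the compiled-in default applies.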

What this reinforces

💡 Root-cause-vs-symptom suggestion (from main review) has direct upstream precedent. Looking at dotnet/runtime commit history on ci-failure-scan.md over the last ~2 weeks, every change is scanner-prompt optimization rather than cap raising — e.g. emission-rule strengthening, dedup search, occurrence thresholds, do-not-disable detection, KBE match-count gates. Runtime apparently fits its scan under 25M precisely because they've invested in the prompt, not the cap. If MAUI ever needs another raise, that's a strong signal it's time to look at the scanner's data-ingestion shape rather than doubling again.

One small additional 💡

Consider matching upstream's timeout-minutes: 90. The dispute-round logic still holds for the change in this PR (the audit data showed the prior 25M runs were token-cap-bound, not timeout-bound, so 60 min was correct at 25M). But at a 50M cap, if MAUI's scanner ever grows toward the new ceiling — through more pipelines being scanned, deeper log fetches, or genuine load — 60 min could become the binding cutoff before the firewall trips. Upstream sized timeout-minutes: 90 for the same scanner pattern, and matching that would keep timeout-minutes and max-effective-tokens in the same relative posture (timeout > expected wall-clock for cap exhaustion). Non-blocking; happy to leave for a follow-up if you'd rather ship this PR as-is.
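
One way to reason about which guardrail binds first is to compute the average rate a run would need to sustain to exhaust the cap before the timeout. A Python sketch with an illustrative helper name; the scanner's actual throughput is not given in this thread, so the output is only a threshold to compare against measured runs:

```python
def min_rate_to_exhaust(cap_tokens: int, timeout_minutes: int) -> float:
    """Average effective tokens/minute needed to hit the cap before timing out."""
    return cap_tokens / timeout_minutes


CAP = 50_000_000  # the 50M per-run cap discussed in this review

for timeout in (60, 90):
    rate = min_rate_to_exhaust(CAP, timeout)
    print(f"timeout-minutes: {timeout} -> {rate:,.0f} effective tokens/min")
```

A run averaging less than the printed rate reaches the timeout first; a run averaging more trips the token cap first. Raising the timeout from 60 to 90 lowers that threshold, so the cap is more likely to remain the binding limit.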

Nothing here changes the bottom line — this PR is still a clean, internally consistent configuration bump and the lock-file regeneration is correct.

kubaflo changed the title from "Increase CI scanner token budget" to "Remove CI scanner token budget limit" on Jun 9, 2026
PureWeen merged commit d736479 into main on Jun 9, 2026
4 of 5 checks passed
PureWeen deleted the fix/ci-scanner-token-budget-35812 branch on June 9, 2026 18:54
github-actions Bot added this to the .NET 10.0 SR8 milestone on Jun 9, 2026
Dhivya-SF4094 pushed a commit to Dhivya-SF4094/maui that referenced this pull request Jun 15, 2026

Development

Successfully merging this pull request may close these issues.

[aw] CI Failure Scanner failed

3 participants