[aw-failures] Make setup-gh-aw extension install idempotent — gh extension install fails when gh-aw is already installed #33239

@github-actions

Problem statement

The Setup gh-aw extension step in the shared setup-gh-aw action runs gh extension install github/gh-aw unconditionally. When the runner image (or a previous step) has already installed the extension, the gh CLI fails with:

Installing gh-aw extension...
there is already an installed extension that provides the "aw" command
##[error]Process completed with exit code 1.

The downstream MCP-server bootstrap step then cannot locate ${RUNNER_TEMP}/gh-aw/gh-aw and emits:

::error::Failed to find gh-aw binary for MCP server
exit 1

The entire agent job exits 1, taking the workflow with it.

Affected workflows and runs

  • Static Analysis Report — run §26079272509 (2026-05-19 05:59 UTC, scheduled, branch main, conclusion failure). Tracked by [aw] Static Analysis Report failed #33229.
  • Any workflow, on main or any other branch, that uses the shared setup-gh-aw action on a runner where gh-aw is already installed is exposed. This is a likely cause of additional failure conclusions over the next 24h as the runner image change propagates.

Evidence

Failing step log excerpt (run 26079272509)

2026-05-19T06:03:38.21Z   echo "::error::Failed to find gh-aw binary for MCP server"
2026-05-19T06:03:38.21Z   exit 1
2026-05-19T06:03:38.26Z Installing gh-aw extension...
2026-05-19T06:03:38.28Z there is already an installed extension that provides the "aw" command
2026-05-19T06:03:38.28Z ##[error]Process completed with exit code 1.

Audit summary

{
  "workflow_name": "Static Analysis Report",
  "conclusion": "failure",
  "duration": "4.5m",
  "turns": 37,
  "errors": [
    { "file": "agent", "type": "step_failure", "message": "##[error]Process completed with exit code 1." }
  ],
  "mcp_failures": null,
  "missing_tools": null
}

Probable root cause

The install step is not idempotent. gh extension install exits non-zero whenever the extension is already present, and the surrounding shell uses set -e, so a single non-zero exit terminates the job.
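The failure mode can be reproduced without the gh CLI. The sketch below is illustrative only: it runs a step body under `bash -e`, the same errexit behaviour described above, and uses a subshell that exits 1 as a stand-in for the failing `gh extension install`. The `step_body` variable and the "Locating gh-aw binary..." line are hypothetical and do not come from the real action.

```shell
#!/usr/bin/env bash
# Hypothetical step body. The subshell stands in for `gh extension install`
# failing on a runner where the extension is already present.
step_body='
echo "Installing gh-aw extension..."
( echo "there is already an installed extension that provides the \"aw\" command" >&2; exit 1 )
echo "Locating gh-aw binary..."
'

# Run the body in a child bash with errexit enabled and capture its status.
step_status=0
step_output=$(bash -e -c "$step_body" 2>&1) || step_status=$?

printf '%s\n' "$step_output"
echo "step exit code: ${step_status}"
```

Under errexit the last `echo` in the body never runs, so any later command in the same step that depends on the install is skipped as well.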

Proposed remediation

Make the install idempotent. Any of the following would work; the first is the smallest diff:

  1. Detect + upgrade. In actions/setup/setup-gh-aw (or wherever Setup gh-aw extension is defined), replace the install command with:
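    # gh extension list separates columns with whitespace and shows the repo as owner/name,
    # so match gh-aw after a slash or whitespace and before whitespace or end of line.
    if gh extension list | grep -qE '(^|[[:space:]]|/)gh-aw([[:space:]]|$)'; then
      gh extension upgrade gh-aw || true   # upgrade takes the extension name, not owner/repo
    else
      gh extension install github/gh-aw
    fi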
  2. Force-reinstall. Use gh extension install --force github/gh-aw (single command, but masks accidental same-name conflicts).
  3. Remove first. gh extension remove gh-aw 2>/dev/null || true; gh extension install github/gh-aw.

Fix (1) is recommended because it keeps the version current without re-downloading on every run and surfaces real install errors.

Success criteria

  • A new scheduled run of Static Analysis Report succeeds on a runner where gh-aw is already pre-installed.
  • No other workflows regress on runners with a fresh image (where the extension is not pre-installed).
  • The setup-gh-aw action exits 0 in both states in a synthetic test (CI matrix with and without gh extension install github/gh-aw pre-step).
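The third criterion can be approximated locally before wiring up a CI matrix. The following is a minimal sketch, not the action's real test suite: it replaces `gh` with a stub shell function and checks that a detect-and-upgrade guard exits 0 in both runner states. The stub, the `GH_AW_PREINSTALLED` variable, the `setup_gh_aw` function name, and the sample `gh extension list` row (including the `v0.0.0` version) are all assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Stub `gh` for this test only (hypothetical, not the real CLI).
# GH_AW_PREINSTALLED=1 simulates a runner image that already ships the extension.
gh() {
  case "$1 $2" in
    "extension list")
      [ "${GH_AW_PREINSTALLED}" = "1" ] && printf 'gh aw\tgithub/gh-aw\tv0.0.0\n'
      return 0 ;;
    "extension install")
      if [ "${GH_AW_PREINSTALLED}" = "1" ]; then
        echo 'there is already an installed extension that provides the "aw" command' >&2
        return 1
      fi
      GH_AW_PREINSTALLED=1
      return 0 ;;
    "extension upgrade")
      return 0 ;;
    *)
      return 2 ;;
  esac
}

# Guard under test: upgrade when the extension is listed, install otherwise.
setup_gh_aw() {
  if gh extension list | grep -qE '(^|[[:space:]]|/)gh-aw([[:space:]]|$)'; then
    gh extension upgrade gh-aw || true
  else
    gh extension install github/gh-aw
  fi
}

# State 1: fresh image, extension absent.
GH_AW_PREINSTALLED=0
fresh_status=0
setup_gh_aw || fresh_status=$?

# State 2: extension already present.
GH_AW_PREINSTALLED=1
preinstalled_status=0
setup_gh_aw || preinstalled_status=$?

echo "fresh=${fresh_status} preinstalled=${preinstalled_status}"
```

A stub like this only exercises the branching logic. It cannot confirm what the real `gh extension list` prints on a given runner image, which is why the matrix run against real runners is still needed.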

Generated by 🔍 [aw] Failure Investigator (6h) · ● 21M · expires on May 26, 2026, 8:13 AM UTC

New occurrence (Failure Investigator, 2026-05-19 13:47 UTC)

The non-idempotent install bug reproduced again in the last 6 hours, confirming the fix in PR #33240 has not yet landed and the bug is still actively breaking scheduled workflows.

Affected run

Field       Value
Workflow    Daily Copilot Token Usage Audit
Run         §26096836610
Trigger     schedule
Branch      main
Engine      copilot (claude-sonnet-4.5, firewall v0.25.49)
Conclusion  failure after 6.4m
Turns       0 (agent never started)

Failing step log excerpt (line 5488 of `agent/5_agent.txt`)

2026-05-19T12:28:13.27Z Installing gh-aw extension...
2026-05-19T12:28:13.30Z there is already an installed extension that provides the "aw" command
2026-05-19T12:28:13.30Z ##[error]Process completed with exit code 1.

The surrounding shell snippet currently shipped in the compiled workflow:

if gh extension list | grep -qE '(^|[[:space:]]|/)gh-aw($|[[:space:]]|$)'; then
  echo "gh-aw extension already installed, upgrading..."
  gh extension upgrade gh-aw || true
else
  echo "Installing gh-aw extension..."
  gh extension install github/gh-aw
fi

The else branch executed, so the gh extension list output did not match the regex. gh extension install nevertheless rejected the install, because the aw command was already provided by another extension entry. This matches the previously documented root cause and the rationale for switching the guard to gh aw --version in PR #33240.
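A guard that probes the command itself sidesteps the mismatch between the list output and the regex. The sketch below shows the general shape of such a guard, assuming only what this comment states about PR #33240 (that it switches the check to `gh aw --version`). The function name `ensure_gh_aw` and the log messages are hypothetical, and the actual patch may differ.

```shell
#!/usr/bin/env bash
# Sketch: ask gh whether the "aw" command works instead of parsing
# `gh extension list`. Only install when the probe fails.
ensure_gh_aw() {
  if gh aw --version >/dev/null 2>&1; then
    echo "gh aw already available, skipping install"
  else
    echo "Installing gh-aw extension..."
    gh extension install github/gh-aw
  fi
}
```

Because the probe succeeds whenever any installed extension provides `aw`, this form cannot hit the "already an installed extension" error. The trade-off is that it does not verify which repository the existing extension came from.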

Recommendation

Land PR #33240 (or an equivalent patch to pkg/workflow/mcp_setup_generator.go) and recompile the affected .lock.yml workflows. Until then, expect additional scheduled workflows on main to fail with the same fingerprint.
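To scope the recompile, it may help to list which compiled workflows still carry the old guard. This is a rough sketch under two assumptions that should be checked against the actual patch: that compiled workflows live in `.github/workflows` with a `.lock.yml` suffix, and that a patched file contains the literal string `gh aw --version` while an unpatched one does not. The function name `find_stale_locks` is hypothetical.

```shell
#!/usr/bin/env bash
# Print every *.lock.yml that installs gh-aw but lacks the `gh aw --version` probe.
# The directory defaults to .github/workflows (an assumption about repo layout).
find_stale_locks() {
  local dir="${1:-.github/workflows}"
  local f
  for f in "$dir"/*.lock.yml; do
    [ -e "$f" ] || continue   # skip the unexpanded glob when nothing matches
    if grep -q 'gh extension install github/gh-aw' "$f" \
       && ! grep -q 'gh aw --version' "$f"; then
      printf '%s\n' "$f"
    fi
  done
}
```

Running `find_stale_locks` from the repository root after the patch lands would list the files that still need recompiling. An empty result suggests every compiled workflow has picked up the new guard.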

Reported by [aw] Failure Investigator (6h) — workflow run id 26101283110

Generated by 🔍 [aw] Failure Investigator (6h) · ● 10M
