Performance Summary
Agents analyzed: 16 (from 31 total runs sampled, past 2 days)
Total tokens (sample): ~165M (includes Codex high-parallelism runs)
Total cost (today): ~$5.94 | yesterday: ~$6.14
Average quality score: 86/100 (↓ 3 from 89)
Average effectiveness score: 87/100 (↓ 1 from 88)
Top performers: The Great Escapi, Contribution Check, Daily Safe Outputs Conformance Checker
Needs attention: AI Moderator (missing tool regression), Chroma Issue Indexer (extreme token usage), Semantic Function Refactoring (elevated cost)
Critical Findings
❌ P0 Ongoing: Lockdown Token Failures (3+ weeks)
4 workflows remain locked out — Issue Monster, PR Triage Agent, Daily Issues Report, Org Health Report. All fix paths closed (#17414, #17807 both rejected as "not_planned"). Manual repo admin intervention required. These failures continue to skew ecosystem quality metrics.
⚠️ AI Moderator GitHub MCP Missing Tool — Regression Detected
1 of 3 runs today (run §22453521501) reported the GitHub MCP tool (read issue/comment content) as missing. This matches the Docker MCP intermittency pattern last seen 2026-02-24, which was believed resolved by switching to mode: remote. With mode: remote now also showing intermittency, the root cause may be upstream GitHub MCP availability rather than anything Docker-specific. The other 2 runs succeeded but had very low turn counts (1–2 turns), which may indicate noop runs rather than full processing.
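The low-turn observation above could be turned into an automatic check. The following is a minimal sketch, assuming a hypothetical `AgentRun` record and a hypothetical threshold of 2 turns; neither reflects the real run schema or any existing alerting code.

```python
from dataclasses import dataclass, field

# Hypothetical threshold: the report treats 1-2 turn runs as possible noops.
NOOP_TURN_THRESHOLD = 2


@dataclass
class AgentRun:
    """Hypothetical record of one agent run; not the real run schema."""
    run_id: str
    succeeded: bool
    turns: int
    missing_tools: list[str] = field(default_factory=list)


def classify_run(run: AgentRun) -> str:
    """Return 'missing_tool', 'failed', 'suspected_noop', or 'ok'."""
    if run.missing_tools:
        # A missing tool is reported first, even if the run itself "succeeded".
        return "missing_tool"
    if not run.succeeded:
        return "failed"
    if run.turns <= NOOP_TURN_THRESHOLD:
        # Succeeded, but with so few turns that it may have done no real work.
        return "suspected_noop"
    return "ok"


def runs_needing_attention(runs: list[AgentRun]) -> list[tuple[str, str]]:
    """List (run_id, classification) for every run that is not 'ok'."""
    return [(r.run_id, classify_run(r)) for r in runs if classify_run(r) != "ok"]
```

Checking missing tools before the turn count means a silent tool failure is never mislabelled as a plain noop. The threshold would need tuning against runs that legitimately finish in one or two turns.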
⚠️ Chroma Issue Indexer — Extreme Token Usage
Today's run consumed 3.6M tokens in 10.5 minutes with 102 blocked firewall requests — the highest blocked count of any workflow today. If the issue index is growing, this trend will worsen. The 47% firewall block rate across the ecosystem (439/926 requests blocked) is driven primarily by this workflow and Semantic Function Refactoring.
View Detailed Quality Analysis
Agent Quality Scores (Today)
| Agent | Engine | Quality | Duration | Tokens | Cost | Notes |
|---|---|---|---|---|---|---|
| The Great Escapi | copilot | 94/100 | 3.5m | 74k | — | Ultra-efficient |
| Contribution Check | copilot | 93/100 | 2.8m | 181k | — | Fast, clean |
| Daily Safe Outputs Conformance Checker | claude | 92/100 | 3.1m | 134k | $0.33 | Efficient |
| Auto-Triage Issues | copilot | 90/100 | 3.5m | 136k | — | Success |
| Agent Container Smoke Test | copilot | 90/100 | 4.4m | 174k | — | Clean |
| Smoke Copilot | copilot | 90/100 | 6.7m | — | — | 49 turns, passing |
| Smoke Claude | claude | 87/100 | 12.9m | 991k | $1.47 | 42 turns, long |
| Lockfile Statistics Analysis Agent | claude | 87/100 | 5.0m | 456k | $0.82 | 14 turns, normal |
| AI Moderator (×3) | codex | 82/100 | 7.5–8.9m | 210–372k | — | 1/3 missing tool |
| Scout | claude | 80/100 | 4.9m | 613k | $0.81 | 19 turns |
| Smoke Codex | codex | 80/100 | 6.8m | 32M | — | 17 turns, Codex tokens |
| Slide Deck Maintainer | copilot | 78/100 | 6.7m | 1.5M | — | High tokens |
| Changeset Generator | codex | 75/100 | 8.2m | 123M | — | Codex parallelism |
| Semantic Function Refactoring | claude | 72/100 | 9.1m | 295k | $3.97 | High cost, 12 turns |
| Chroma Issue Indexer | copilot | 68/100 | 10.5m | 3.6M | — | Extreme tokens |
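One way to see why Semantic Function Refactoring is flagged for cost is to normalise cost by token volume for the runs that report both figures. This is a rough sketch using the rounded values from the table above. It assumes the reported token count is a fair proxy for what was billed, which the report does not confirm.

```python
# (cost in USD, tokens) copied from the table above for runs that report both.
RUNS = {
    "Daily Safe Outputs Conformance Checker": (0.33, 134_000),
    "Smoke Claude": (1.47, 991_000),
    "Lockfile Statistics Analysis Agent": (0.82, 456_000),
    "Scout": (0.81, 613_000),
    "Semantic Function Refactoring": (3.97, 295_000),
}


def cost_per_million_tokens(cost_usd: float, tokens: int) -> float:
    """Cost in USD per one million tokens."""
    if tokens <= 0:
        raise ValueError("tokens must be positive")
    return cost_usd / tokens * 1_000_000


def rank_by_cost_per_million(runs: dict) -> list:
    """Return (name, USD per 1M tokens) pairs, most expensive first."""
    rated = {
        name: cost_per_million_tokens(cost, tokens)
        for name, (cost, tokens) in runs.items()
    }
    return sorted(rated.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for name, rate in rank_by_cost_per_million(RUNS):
        print(f"{name}: ${rate:.2f} per 1M tokens")
```

On these figures Semantic Function Refactoring comes out at roughly $13.46 per 1M tokens, against about $1.32 to $2.46 for the other four runs, so its cost is high relative to its token volume and not just in absolute terms.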
Cancelled Runs Analysis
14 runs were cancelled in a batch (runs 22450833xxx–22450834xxx). This is expected behavior from a Release workflow trigger — these represent staggered workflow starts that were cancelled before the new release artifacts were ready. Not a quality issue.
View Effectiveness Metrics
Task Completion Rates (Sampled Agent Runs)
High completion (>80%): 13/15 agent workflows (87%)
Partial/Degraded: AI Moderator (1/3 runs degraded), Chroma Issue Indexer (functional but inefficient)
Firewall Request Analysis
Total 926 requests across all workflows: 487 allowed (53%), 439 blocked (47%).
Top blocked workflows:
Semantic Function Refactoring: 72 blocked, consistent with the "-" domain pattern
Changeset Generator: 61 blocked, Codex parallelism reaching out broadly
Slide Deck Maintainer: 43 blocked, under investigation
Smoke Codex: 38 blocked, expected for engine behavior
The "-" domain appearing in the blocked list is a known Serena MCP local socket artifact (see issue #18388).
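The ecosystem block rate and each workflow's share of blocked requests can be recomputed from the counts reported in this issue. The sketch below only redoes that arithmetic; the function names are illustrative and not part of any existing tooling.

```python
# Totals and per-workflow blocked counts as reported in this issue.
TOTAL_REQUESTS = 926
BLOCKED_REQUESTS = 439

BLOCKED_BY_WORKFLOW = {
    "Chroma Issue Indexer": 102,
    "Semantic Function Refactoring": 72,
    "Changeset Generator": 61,
    "Slide Deck Maintainer": 43,
    "Smoke Codex": 38,
}


def percent(part: int, whole: int) -> int:
    """Whole-number percentage of part relative to whole."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return round(100 * part / whole)


def blocked_shares(blocked_by_workflow: dict, total_blocked: int) -> dict:
    """Share of all blocked requests attributable to each workflow."""
    return {
        name: percent(count, total_blocked)
        for name, count in blocked_by_workflow.items()
    }


if __name__ == "__main__":
    print(f"Block rate: {percent(BLOCKED_REQUESTS, TOTAL_REQUESTS)}%")
    for name, share in blocked_shares(BLOCKED_BY_WORKFLOW, BLOCKED_REQUESTS).items():
        print(f"{name}: {share}% of blocked requests")
```

On these counts, Chroma Issue Indexer and Semantic Function Refactoring together account for about 40% of blocked requests (174 of 439), and the five listed workflows for about 72% (316 of 439).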
View Behavioral Patterns
Productive Patterns ✅
Release → Smoke cancellation → Re-run: Expected orchestration behavior, not a failure
Daily Safe Outputs Conformance Checker: Continues to be highly efficient (3 turns, $0.33)
The Great Escapi: Maintaining minimal footprint, high reliability across 2+ weeks
Problematic Patterns ⚠️
AI Moderator GitHub MCP intermittency: 3rd occurrence of the missing-tool issue. Switching to mode: remote was supposed to fix this (2026-02-24), but 1 of 3 runs today was missing GitHub MCP again. The failures are silent: the moderation trigger runs but does nothing. Impact: roughly 33% of moderation events missed (1 of 3 sampled runs).
Chroma Issue Indexer token growth: 3.6M tokens is abnormally high for an issue indexer. If the issue backlog is growing, usage will continue to scale up linearly. No tracking issue has been created yet.
Codex extreme token counts: Changeset Generator (123M) and Smoke Codex (32M) reflect the Codex engine's parallel-context behavior. These are not quality issues, but they skew overall token metrics significantly.
Ecosystem Coverage Assessment
✅ Security: The Great Escapi active and efficient
✅ Code quality: Smoke tests (Copilot/Claude/Codex) passing on main
This was intended to be a discussion, but discussions could not be created due to permissions issues. This issue was created as a fallback.
Discussion creation may fail if the specified category is not announcement-capable. Consider using the "Announcements" category or another announcement-capable category in your workflow configuration.
Recommendations
High Priority
Investigate AI Moderator GitHub MCP reliability — 3rd incident in a week
mode: remote is not a reliable fix
mode: local if remote unavailable, or alert on noop runs
Chroma Issue Indexer token usage investigation: 3.6M tokens is a new high
Medium Priority
Semantic Function Refactoring cost — Slight improvement ($3.97) but still high
Lockdown P0 escalation — All programmatic fix paths closed ([P1] Lockdown mode failing: GH_AW_GITHUB_TOKEN not configured — 5 workflows affected #17414, [q] fix(workflows): remove explicit lockdown:true to stop recurring failures #17807 both "not_planned")
Low Priority
Trends (7-day)
Actions Taken This Run
agent-performance-latest.md in shared repo memory
shared-alerts.md with AI Moderator regression and Chroma concern