Fix aw-portfolio-yield agent job by switching to the OTel queries skill #31470
Co-authored-by: mnkiefer <8320933+mnkiefer@users.noreply.github.com>
Pull request overview
This PR fixes the failing aw-portfolio-yield agent workflow by switching its shared import from the legacy OTel observability module (which provisions MCP gateway OTLP wiring) to the newer “OTel queries” guidance/skill, so the workflow retains telemetry interpretation instructions without generating broken gateway config.
Changes:
- Replaced `shared/otel-observability.md` with `shared/otel-queries.md` in the `aw-portfolio-yield` workflow source.
- Updated the workflow prompt contract to reference the OTel queries skill instead of an `otel` MCP server.
- Recompiled `aw-portfolio-yield.lock.yml` so the generated workflow no longer includes OTLP env/secrets, OTel MCP server config, or MCP gateway OpenTelemetry wiring.
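A quick way to confirm the last point is to scan the recompiled lock file for leftover OTLP references. The following is only a minimal sketch: the helper name `find_otlp_leftovers` is hypothetical, and the marker strings are the two secret names mentioned in the review (`OTLP_ENDPOINT`, `OTLP_TOKEN`), not a complete list of everything the legacy import generated.

```python
# Hypothetical helper: report lines of a compiled lock file that still
# mention OTLP secrets. The marker names are taken from the review text;
# the real lock file may contain other OTel-related keys not covered here.
OTLP_MARKERS = ("OTLP_ENDPOINT", "OTLP_TOKEN")


def find_otlp_leftovers(lock_text: str) -> list[tuple[int, str]]:
    """Return (line_number, marker) pairs for every marker found."""
    hits = []
    for lineno, line in enumerate(lock_text.splitlines(), start=1):
        for marker in OTLP_MARKERS:
            if marker in line:
                hits.append((lineno, marker))
    return hits


if __name__ == "__main__":
    # Path as named in this PR; adjust if the repository layout differs.
    with open(".github/workflows/aw-portfolio-yield.lock.yml", encoding="utf-8") as fh:
        leftovers = find_otlp_leftovers(fh.read())
    for lineno, marker in leftovers:
        print(f"line {lineno}: still references {marker}")
    raise SystemExit(1 if leftovers else 0)
```

An empty result would be consistent with the recompile having removed the OTLP wiring, though it does not prove the gateway config is otherwise valid.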
Show a summary per file
| File | Description |
|---|---|
| .github/workflows/aw-portfolio-yield.md | Switches the shared import to OTel queries and updates agent instructions accordingly. |
| .github/workflows/aw-portfolio-yield.lock.yml | Compiled workflow output reflecting the new import (removes OTLP/MCP OTel wiring and updates generated workflow details). |
Copilot's findings
- Files reviewed: 2/2 changed files
- Comments generated: 2
@copilot Recompile workflow
Ran |
Skills-Based Review 🧠
Applied /diagnose as this is a bug fix PR. The fix is directionally correct — switching from shared/otel-observability.md to shared/otel-queries.md cleans up the OTel dependency and removes the OTLP_ENDPOINT/OTLP_TOKEN secrets that were previously required.
Key Themes
- `GH_AW_VERSION: dev` in production lock file: The recompilation was done with a local development build (`dev`) rather than a tagged release. This means the live workflow will pick up a non-deterministic dev build at runtime, which is a production correctness concern.
- Broad diff scope: The lock file diff includes several unrelated improvements (better node detection, `AWF_REFLECT_ENABLED`, `chmod a+rX`, `maxdepth 4→5`, removal of the observability summary step). These are compiler improvements being baked in as a side effect of the recompile, which is fine, but `GH_AW_VERSION: dev` stands out as an artifact that should not be in a release lock file.
- Root cause not documented (per `/diagnose` Phase 6: state which hypothesis was correct in the commit/PR): The PR description is a task checklist rather than an explanation of why the lock file was stale or what originally caused the agent job failure. Future debuggers won't know whether this was a missed `make recompile`, a compiler rollback, or something else.
Positive Highlights
- ✅ Secrets removed cleanly: `OTLP_ENDPOINT` and `OTLP_TOKEN` no longer appear in the lock file manifest, so the surface area for credential exposure is reduced.
- ✅ Better node-not-found error handling in the compiled agent bootstrap: the new `exit 127` path with a helpful message is an improvement over silently falling back to a bare `node` string.
- ✅ The `shared/otel-queries.md` skill is a more focused, purpose-built import for this workflow's needs.
Verdict
Requesting changes specifically for the GH_AW_VERSION: dev value in the lock file — this should be a tagged version before merging to keep production workflows reproducible.
🧠 Reviewed using Matt Pocock's skills by Matt Pocock Skills Reviewer · ● 10.1M
@@ -767,7 +735,7 @@ jobs:
GH_AW_PHASE: agent
[/diagnose] GH_AW_VERSION is set to dev in the compiled lock file. This will cause the production workflow to pull a development build of gh-aw at runtime rather than a pinned release.
This appears to be an artifact of recompiling with a local development build instead of a tagged release. Before merging, recompile with a tagged version (e.g. v0.71.5 or the latest release) so the lock file reflects a stable, reproducible runtime.
Running a production workflow against dev makes the execution environment non-deterministic across runs.
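A guard against this kind of artifact could be scripted as a pre-merge check. The sketch below is illustrative only: it assumes the version appears in the lock file as a plain `GH_AW_VERSION: <value>` line (as quoted in this comment) and that tagged releases look like `v0.71.5`; the function name `untagged_versions` is hypothetical and not part of gh-aw.

```python
import re

# Assumption: the version is written as a YAML-style "GH_AW_VERSION: value"
# line, optionally quoted. Other encodings of the value are not handled.
VERSION_LINE = re.compile(
    r"^\s*GH_AW_VERSION:\s*[\"']?([^\"'\s#]+)[\"']?", re.MULTILINE
)
# Assumption: a tagged release has the shape vMAJOR.MINOR.PATCH.
TAGGED = re.compile(r"^v\d+\.\d+\.\d+$")


def untagged_versions(lock_text: str) -> list[str]:
    """Return every GH_AW_VERSION value that is not a tagged release."""
    return [v for v in VERSION_LINE.findall(lock_text) if not TAGGED.match(v)]


if __name__ == "__main__":
    # Path as named in this PR; adjust if the repository layout differs.
    with open(".github/workflows/aw-portfolio-yield.lock.yml", encoding="utf-8") as fh:
        bad = untagged_versions(fh.read())
    if bad:
        print(f"lock file pins non-release versions: {bad}")
        raise SystemExit(1)
```

Run against a lock file compiled from a local build, a check like this would flag `dev` and fail, while a file pinned to a tagged release would pass.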
aw-portfolio-yield with the repository toolchain and minimize unintended diff changes