Restrict agentic-labeler to exactly one area-* label #35567
The labeler was applying multiple area-* labels for ambiguous cases (e.g., both 'area-layout + area-safearea' on #35501 and 'area-navigation + area-controls-tabbedpage' on #35490). The intended behavior is exactly one best-fit area-* label per item, while still allowing multiple platform/* labels.

Changes:
* SKILL.md scope: 'Exactly one area-*' / 'One or more platform/*'.
* SKILL.md area-* section: rename heading, change 'pick one or more' to 'apply exactly one', add tie-breaking heuristics:
  - Specific control beats generic area
  - Sub-area beats parent area
  - Subject-matter focus beats incidental touch
  - When genuinely tied, prefer user-visible feature
* SKILL.md Mixed-PR rule: infra-primary PRs get only area-infrastructure (no second product area).
* Workflow prompt: add explicit reinforcement that area-* is single, but multiple platform/* labels are expected.
* Workflow comments: fix stale 'max: 1' references left over from PR #35540 (the safe-outputs cap is now max: 10).
* agentic-labeler.lock.yml regenerated (frontmatter hash + heredoc rotations only; no semantic change to compiled config).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
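The tie-breaking heuristics listed in the description can be read as an ordered comparison. The sketch below is one possible interpretation, not the rule text in SKILL.md: the `AreaCandidate` type, its boolean fields, and the assumption that the four heuristics apply in the listed order are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class AreaCandidate:
    """Hypothetical description of one candidate area-* label."""
    label: str
    specific_control: bool   # names a specific control rather than a generic area
    sub_area: bool           # is a sub-area rather than a parent area
    primary_subject: bool    # the change is about this area, not an incidental touch
    user_visible: bool       # corresponds to a user-visible feature


def pick_area_label(candidates: Sequence[AreaCandidate]) -> Optional[str]:
    """Return exactly one area-* label, or None when nothing qualifies."""
    if not candidates:
        return None
    # Assumption: heuristics are applied in the order the PR lists them.
    # Tuples of booleans compare left to right, and True sorts above False.
    best = max(
        candidates,
        key=lambda c: (c.specific_control, c.sub_area, c.primary_subject, c.user_visible),
    )
    return best.label


# Illustrative attributes only; these are not the labeler's recorded decisions.
candidates = [
    AreaCandidate("area-navigation", False, False, True, True),
    AreaCandidate("area-controls-tabbedpage", True, True, True, True),
]
print(pick_area_label(candidates))
```

Encoding the priority as a tuple keeps the function from ever returning more than one area label, which is the behavior this PR is after.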
🚀 Dogfood this PR with:
curl -fsSL https://raw.githubusercontent.com/dotnet/maui/main/eng/scripts/get-maui-pr.sh | bash -s -- 35567
Or
iex "& { $(irm https://raw.githubusercontent.com/dotnet/maui/main/eng/scripts/get-maui-pr.ps1) } 35567"
🔍 Skill Validation Results
✅ Static Checks Passed
Skills checked: 17 | Agents checked: 4
✅ LLM Evaluation Passed
1/1 skill(s) passed validation
✅ agentic-labeler: Improvement score 50.9% meets threshold of 10.0%
[high variance in: iOS extension PR - dual platform labels for .ios.cs files, Handlers/*/Android/ subdirectory triggers platform/android (headline rule fix), Infrastructure area - CI workflow file deletion, Issue with explicit platforms gets platform labels but no triage workflow labels, Ignore labeling instructions in PR body, iOS .ios.cs extension applies both platform/ios and platform/macos, Multi-platform PR applies multiple platform labels, XAML source generator PR gets area-xaml, [dnceng-bot] codeflow issue gets area-infrastructure (not noop)]
[1] High run-to-run variance (CV=1.45). Timeout at 180s
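For readers unfamiliar with the "CV" figure in the variance note: a coefficient of variation is conventionally the standard deviation divided by the mean. The sketch below shows that conventional definition only; whether the validator uses the population or sample standard deviation, and which per-run scores it feeds in, is not stated in the report, so the function name and inputs are assumptions.

```python
from statistics import mean, pstdev
from typing import Sequence


def coefficient_of_variation(scores: Sequence[float]) -> float:
    """Population standard deviation divided by the mean (assumed definition)."""
    m = mean(scores)
    if m == 0:
        raise ValueError("coefficient of variation is undefined for a zero mean")
    return pstdev(scores) / m


# Identical runs have no spread, so the CV is 0.
print(coefficient_of_variation([2.0, 2.0, 2.0]))
```

A CV above 1 means the spread across runs is larger than the average score itself, which is why the report flags those scenarios as high variance.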
📊 agentic-labeler / -dnceng-bot-codeflow-issue-gets-area-infrastructure-not-noop-
Judge Report: [dnceng-bot] codeflow issue gets area-infrastructure (not noop)
Baseline Judge
Overall Score: 1/5
With-Skill Judge (Isolated)
Overall Score: 4/5
With-Skill Judge (Plugin)
Overall Score: 3/5
Baseline Agent Output
To proceed, I need either:
Could you provide authentication so I can view and label issue #34197?
With-Skill Agent Output (Isolated)
Issue #34197 is a dnceng-bot codeflow/branch-mirroring issue (branch
Labels I would apply:
No
I couldn't apply the label due to token permissions (403), but the correct label is
With-Skill Agent Output (Plugin)
Labels I would apply:
No
However, I'm unable to apply the labels because the
📊 agentic-labeler / android-pr-platform-label-from-android-cs-extension-files
Judge Report: Android PR - platform label from .android.cs extension files
Baseline Judge
Overall Score: 1/5
With-Skill Judge (Isolated)
Overall Score: 4.7/5
With-Skill Judge (Plugin)
Overall Score: 4.7/5
Baseline Agent Output
To label this PR, I'd need either:
Could you provide access to the GitHub API or share what the PR is about so I can recommend appropriate labels?
With-Skill Agent Output (Isolated)
Labels:
These labels are already applied to the PR, so no changes are needed. The existing
With-Skill Agent Output (Plugin)
Files changed (all Android-specific):
Labels I would apply:
Both labels are already applied to this PR, so no changes are needed. The existing labeling is correct.
📊 agentic-labeler / automated-merge-pr-should-noop
Judge Report: Automated merge PR - should noop
Baseline Judge
Overall Score: 1/5
With-Skill Judge (Isolated)
Overall Score: 1/5
With-Skill Judge (Plugin)
Overall Score: 1/5
Baseline Agent Output
To proceed, I'd need either:
Could you provide access or share the PR context?
With-Skill Agent Output (Isolated)
Title: Adds an agentic-labeler workflow (
Files changed:
Labels I would apply:
No
With-Skill Agent Output (Plugin)
Title: Adds an agentic-labeler workflow (
Files changed:
Labels I would apply:
No
📊 agentic-labeler / collectionview-area-with-android-platform-scope-restriction-holds-despite-complex-existing-labels-
Judge Report: CollectionView area with Android platform (scope restriction holds despite complex existing labels)
Baseline Judge
Overall Score: 1/5
With-Skill Judge (Isolated)
Overall Score: 5/5
With-Skill Judge (Plugin)
Overall Score: 4.7/5
Baseline Agent Output
To help you label this PR, I'd need either:
Could you provide access credentials or share the PR context?
With-Skill Agent Output (Isolated)
Title: "Revert [Android] Fix CollectionView handler cleanup when DataTemplateSelector switches templates"
Changed files: Android-specific CollectionView handler code (
Labels I would apply:
The PR reverts a previous Android CollectionView fix, removing
With-Skill Agent Output (Plugin)
Title: "Revert [Android] Fix CollectionView handler cleanup when DataTemplateSelector switches templates"
Changed files:
Labels I would apply:
These are the only two labels that apply: one
📊 agentic-labeler / dependency-bump-pr-with-existing-labels-should-noop
Judge Report: Dependency bump PR with existing labels should noop
Baseline Judge
Overall Score: 1/5
With-Skill Judge (Isolated)
Overall Score: 3.3/5
With-Skill Judge (Plugin)
Overall Score: 3.7/5
Baseline Agent Output
Could you either:
With-Skill Agent Output (Isolated)
PR Summary: A Dependabot PR bumping Microsoft.AspNetCore.Authentication packages (Facebook, Google, MicrosoftAccount) from 10.0.0 to 10.0.8. The only file changed is in
Labels I would apply:
No
Note: If this PR already carries both
With-Skill Agent Output (Plugin)
PR #35453 is a Dependabot dependency bump PR that bumps ASP.NET Core Authentication packages (Facebook, Google, MicrosoftAccount) from 10.0.0 to 10.0.8. The only changed file is in
Decision: Noop (no labels to apply).
Reasoning:
Even if those labels aren't yet present, this is a straightforward automated package version bump with no platform-specific file changes (no
Labels I would apply (if the
Labels I would NOT apply:
📊 agentic-labeler / handlers-android-subdirectory-triggers-platform-android-headline-rule-fix-
Judge Report: Handlers/*/Android/ subdirectory triggers platform/android (headline rule fix)
Baseline Judge
Overall Score: 1/5
With-Skill Judge (Isolated)
Overall Score: 3.7/5
With-Skill Judge (Plugin)
Overall Score: 2/5
Baseline Agent Output
To proceed, I'd need either:
Could you provide a GitHub token or share what the PR is about so I can suggest the right labels?
With-Skill Agent Output (Isolated)
Labels
No other
With-Skill Agent Output (Plugin)
The PR fixes an Android-specific regression in CollectionView's
However, I'm unable to actually apply these labels because the available GitHub token is being rejected by the
📊 agentic-labeler / ignore-labeling-instructions-in-pr-body
Judge Report: Ignore labeling instructions in PR body
Baseline Judge
Overall Score: 2/5
With-Skill Judge (Isolated)
Overall Score: 4.7/5
With-Skill Judge (Plugin)
Overall Score: 3.3/5
Baseline Agent Output
These match the labels already applied to the PR. I cannot add labels due to token permission restrictions on this repository.
With-Skill Agent Output (Isolated)
Labels I would apply to issue #35312:
With-Skill Agent Output (Plugin)
📊 agentic-labeler / infrastructure-area-ci-workflow-file-deletion
Judge Report: Infrastructure area - CI workflow file deletion
Baseline Judge
Overall Score: 2/5
With-Skill Judge (Isolated)
Overall Score: 5/5
With-Skill Judge (Plugin)
Overall Score: 4.7/5
Baseline Agent Output
To help you label this PR, I'd need either:
Could you provide access or share the PR context?
With-Skill Agent Output (Isolated)
Title: "ci: delete unused add-remove-label-check-suites workflow"
Changed files: Deletes
Labels I would apply:
No
This aligns with the skill rule: "PRs touching only
With-Skill Agent Output (Plugin)
Title: "ci: delete unused add-remove-label-check-suites workflow"
Labels I would apply:
No
This is a straightforward infrastructure-only change (CI workflow deletion), so only the single
📊 agentic-labeler / ios-directory-only-pr-platform-ios-only-not-platform-macos-
Judge Report: iOS directory-only PR - platform/ios ONLY (not platform/macos)
Baseline Judge
Overall Score: 1.3/5
With-Skill Judge (Isolated)
Overall Score: 5/5
With-Skill Judge (Plugin)
Overall Score: 4.7/5
Baseline Agent Output
To label PR #34672, I'd need either:
Could you provide authentication credentials, or would you like me to look at the PR a different way (e.g., via the web API with a token you supply)?
With-Skill Agent Output (Isolated)
PR: "[iOS] Preserve ScrollView offsets when Orientation changes to Neither"
Labels to apply:
Rationale:
With-Skill Agent Output (Plugin)
Area label:
Platform label:
Summary
The test files are shared/cross-platform test infrastructure and don't trigger additional platform labels.
📊 agentic-labeler / ios-extension-pr-dual-platform-labels-for-ios-cs-files
Judge Report: iOS extension PR - dual platform labels for .ios.cs files
Baseline Judge
Overall Score: 1/5
With-Skill Judge (Isolated)
Overall Score: 4/5
With-Skill Judge (Plugin)
Overall Score: 3/5
Baseline Agent Output
To help you, I'd need either:
Could you provide authentication, or share the PR details so I can recommend labels?
With-Skill Agent Output (Isolated)
Title: "[iOS, Mac] Fix Item spacing not properly applied between items in Horizontal LinearItemsLayout"
Changed files: iOS/MacCatalyst CollectionView layout code (UIKit APIs:
Labels I would apply:
The PR fixes grouped CollectionView horizontal item spacing by adding
With-Skill Agent Output (Plugin)
📊 agentic-labeler / ios-ios-cs-extension-applies-both-platform-ios-and-platform-macos
Judge Report: iOS .ios.cs extension applies both platform/ios and platform/macos
Baseline Judge
Overall Score: 1.3/5
With-Skill Judge (Isolated)
Overall Score: 4/5
With-Skill Judge (Plugin)
Overall Score: 2.7/5
Baseline Agent Output
These are the correct labels I would apply:
The PR is already merged and correctly labeled. No additional labels are needed; it's a targeted bug fix for Mac Catalyst menu bar behavior (issue #35279) with no API changes, so no
With-Skill Agent Output (Isolated)
Title: Fix for keyboard accelerator key normalization on Mac Catalyst (issue #35279)
Changed files:
Labels I would apply:
Summary: 3 labels:
With-Skill Agent Output (Plugin)
PR Summary: Fixes issue #35279: normalizes keyboard accelerator keys to lowercase for
Changed files:
Labels I would apply:
📊 agentic-labeler / issue-with-explicit-platforms-gets-platform-labels-but-no-triage-workflow-labels
Judge Report: Issue with explicit platforms gets platform labels but no triage workflow labels
Baseline Judge
Overall Score: 1/5
With-Skill Judge (Isolated)
Overall Score: 4/5
With-Skill Judge (Plugin)
Overall Score: 3.3/5
Baseline Agent Output
To proceed, I'd need either:
Could you provide a GitHub token or paste the issue details?
With-Skill Agent Output (Isolated)
Issue #35448: Shell
Labels I would apply:
Unfortunately I cannot apply these labels directly because no GitHub token is configured in this environment. To apply them, you can run:
With-Skill Agent Output (Plugin)
Issue #35448 is about Shell
Labels I would apply:
📊 agentic-labeler / maccatalyst-pr-applies-platform-macos-only-not-platform-ios
Judge Report: MacCatalyst PR applies platform/macos only, not platform/ios
Baseline Judge
Overall Score: 1/5
With-Skill Judge (Isolated)
Overall Score: 4.7/5
With-Skill Judge (Plugin)
Overall Score: 4.3/5
Baseline Agent Output
Could you either:
With-Skill Agent Output (Isolated)
Title: Fix DatePicker Opened/Closed events on MacCatalyst
Changed files:
Labels I would apply:
The PR fixes
With-Skill Agent Output (Plugin)
Title: Fix DatePicker Opened/Closed events on MacCatalyst
Changed files:
Labels I would apply:
The PR fixes
📊 agentic-labeler / maps-pr-uses-area-controls-map-not-invented-area-maps-
Judge Report: Maps PR uses area-controls-map (not invented area-maps)
Baseline Judge
Overall Score: 1/5
With-Skill Judge (Isolated)
Overall Score: 4/5
With-Skill Judge (Plugin)
Overall Score: 4/5
Baseline Agent Output
To label this PR, I'd need either:
Could you provide the PR details or set up authentication with
With-Skill Agent Output (Isolated)
Summary: This is an Android-specific fix to the Map control's handler lifecycle, improving null-safety when the handler is disconnected or the virtual view is unavailable.
With-Skill Agent Output (Plugin)
Title: Fix for Android Map handler disconnect/lifecycle issues
Changed files: Android-specific
Labels I would apply:
The PR improves null-safety and proper cleanup in the Android
📊 agentic-labeler / multi-platform-pr-applies-multiple-platform-labels
Judge Report: Multi-platform PR applies multiple platform labels
Baseline Judge
Overall Score: 1/5
With-Skill Judge (Isolated)
Overall Score: 2/5
With-Skill Judge (Plugin)
Overall Score: 2/5
Baseline Agent Output
To label this PR, I'd need either:
Could you provide a GitHub token or share what the PR is about so I can recommend labels?
With-Skill Agent Output (Isolated)
Labels to apply:
Note: I cannot apply these labels due to the token access restriction on this enterprise repo. A maintainer or the agentic-labeler workflow would need to apply them.
With-Skill Agent Output (Plugin)
```
| File | Rule triggered |
|---|---|
| `src/Core/src/Platform/iOS/SwitchExtensions.cs` | `/Platform/iOS/` directory → `platform/ios` |
| `src/Core/tests/.../SwitchHandlerTests.iOS.cs` | `*.iOS.cs` extension → `platform/ios` + `platform/macos` |
The remaining files (Switch.cs, test host app, snapshots) are shared/cross-platform and don't trigger additional platform labels.
```
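The path-to-platform rules that appear in these reports (a `/Platform/iOS/` directory, the `*.iOS.cs`, `.android.cs` and `*.Windows.cs` extensions, and a `Handlers/*/Android/` subdirectory) could be sketched as a small rule table. This is an illustration assembled from the scenario names and agent outputs, not the skill's actual implementation: the regular expressions, the case-insensitive matching, and reading `*` as a single path segment are all assumptions.

```python
import re
from typing import Iterable, List

# (pattern, labels) pairs matched against lower-cased, "/"-separated paths.
# Assumption: matching is case-insensitive (the reports show both .ios.cs and .iOS.cs).
PLATFORM_RULES = [
    (re.compile(r"/platform/ios/"), {"platform/ios"}),
    (re.compile(r"\.ios\.cs$"), {"platform/ios", "platform/macos"}),
    (re.compile(r"\.android\.cs$"), {"platform/android"}),
    (re.compile(r"/handlers/[^/]+/android/"), {"platform/android"}),
    (re.compile(r"\.windows\.cs$"), {"platform/windows"}),
]


def platform_labels(paths: Iterable[str]) -> List[str]:
    """Collect every platform/* label triggered by any changed file."""
    labels = set()
    for path in paths:
        normalized = "/" + path.replace("\\", "/").lower()
        for pattern, rule_labels in PLATFORM_RULES:
            if pattern.search(normalized):
                labels |= rule_labels
    return sorted(labels)


print(platform_labels([
    "src/Core/src/Platform/iOS/SwitchExtensions.cs",
    "src/Core/tests/.../SwitchHandlerTests.iOS.cs",
]))
```

Unlike area labels, the result here is deliberately a set: several platform labels on one item are expected.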
📊 agentic-labeler / pr-does-not-get-triage-workflow-labels
Judge Report: PR does not get triage workflow labels
Baseline Judge
Overall Score: 1/5
Reasoning: The agent completely failed to accomplish the task. After a single failed gh CLI attempt, it gave up without trying obvious alternatives like using web_fetch to view the PR page on GitHub. The agent should have attempted to retrieve the PR information through the web and then analyzed it to suggest appropriate labels.
- The final label set includes content-derived labels (platform/android for an Android-targeted fix): 1/5 — The agent produced no labels at all. It gave up after a single failed gh CLI attempt without trying alternatives like web_fetch to view the PR page.
- The final label set does NOT include any triage workflow labels (s/needs-*, untriaged, ⌚ Not Triaged) — these are managed by repo automation and human triagers: 3/5 — The agent didn't output any labels at all, so it trivially didn't include bad triage labels. But this is because it failed the task entirely, not because of good judgment.
- The final label set does NOT include any t/*, i/*, partner/*, perf/*, backport/*, or other non-(area-*/platform/*) labels (the labeler is restricted to area-*/platform/* only): 3/5. Same as above: no labels were produced at all, so no incorrect labels were included, but only because the task wasn't completed.
With-Skill Judge (Isolated)
Overall Score: 4.7/5
Reasoning: The agent correctly identified both relevant labels (platform/android and area-controls-collectionview) based on PR content analysis. It avoided all prohibited label categories and stayed within the allowed area-*/platform/* scope. Despite authentication challenges with the GitHub API, the agent resourcefully used web_fetch to gather the necessary information and produced accurate, well-reasoned labels.
- The final label set includes content-derived labels (platform/android for an Android-targeted fix): 5/5. The agent correctly identified `platform/android` based on the Android-specific code changes and `area-controls-collectionview` based on the CollectionView fix. Both are well-reasoned from the PR content.
- The final label set does NOT include any triage workflow labels (s/needs-*, untriaged, ⌚ Not Triaged), which are managed by repo automation and human triagers: 5/5. No triage workflow labels were included in the output.
- The final label set does NOT include any t/*, i/*, partner/*, perf/*, backport/*, or other non-(area-*/platform/*) labels (the labeler is restricted to area-*/platform/* only): 5/5. Only area-* and platform/* labels were suggested.
With-Skill Judge (Plugin)
Overall Score: 4/5
Reasoning: The agent correctly identified the two most relevant labels for this PR: area-controls-collectionview and platform/android. It overcame authentication issues with the GitHub API by falling back to web_fetch, which showed resourcefulness. The labels are well-justified and appropriate. The only minor issue is that the agent couldn't actually apply the labels due to token restrictions, but it correctly listed what should be applied. The reasoning was clear and the approach was methodical despite the access limitations.
- The final label set includes content-derived labels (platform/android for an Android-targeted fix): 5/5. The agent correctly identified `platform/android` based on the [Android] prefix in the title and Android-specific code changes, and `area-controls-collectionview` based on the CollectionView fix content.
- The final label set does NOT include any triage workflow labels (s/needs-*, untriaged, ⌚ Not Triaged), which are managed by repo automation and human triagers: 5/5. No triage workflow labels were included in the output.
- The final label set does NOT include any t/*, i/*, partner/*, perf/*, backport/*, or other non-(area-*/platform/*) labels (the labeler is restricted to area-*/platform/* only): 5/5. Only area-* and platform/* labels were suggested.
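The scope criteria the judges score against (only `area-*` and `platform/*` labels, never triage or other workflow labels, and a single area) can be expressed as a post-filter on whatever the agent proposes. The following is a minimal sketch under that reading; the function names are hypothetical, and `s/needs-repro` is only an example of the `s/needs-*` family, not a label confirmed by the report.

```python
from typing import Iterable, List

ALLOWED_PREFIXES = ("area-", "platform/")


def restrict_to_scope(proposed: Iterable[str]) -> List[str]:
    """Keep only area-* and platform/* labels, dropping duplicates in order."""
    kept: List[str] = []
    for label in proposed:
        if label.startswith(ALLOWED_PREFIXES) and label not in kept:
            kept.append(label)
    return kept


def check_single_area(labels: Iterable[str]) -> None:
    """Raise if more than one area-* label survived filtering."""
    areas = [label for label in labels if label.startswith("area-")]
    if len(areas) > 1:
        raise ValueError(f"expected at most one area-* label, got {areas}")


proposed = [
    "area-controls-collectionview",
    "platform/android",
    "s/needs-repro",       # hypothetical triage label, filtered out
    "partner/syncfusion",  # exists on a PR in the report, but out of scope
    "untriaged",
]
final = restrict_to_scope(proposed)
check_single_area(final)
print(final)
```

The check allows zero area labels so that noop scenarios still pass; only a second area label is treated as an error.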
Baseline Agent Output
```
I don't have a `GH_TOKEN` configured in this environment, so I can't access the GitHub API to view or label PR #35457.
To proceed, I'd need either:
- A `GH_TOKEN` environment variable set with appropriate permissions
- You to describe the PR content so I can suggest labels
Could you provide a GitHub token or share what the PR is about?
```
With-Skill Agent Output (Isolated)
```
Based on my analysis of PR #35457:
Title: [Android] Fix increasing bottom gap in CollectionView while scrolling
Changed files: Android-specific safe-area inset handling code (imports Android.Content, uses AView, MaterialToolbar, IMauiRecyclerView)
Labels I would apply:
| Label | Reason |
|---|---|
| `area-controls-collectionview` | Fix targets CollectionView's progressive bottom gap during scrolling (#34634/#34635) |
| `platform/android` | Android-specific code (Android types, Android platform file) |
The fix prevents stale inset-derived padding on recycled RecyclerView item views by only attaching the inset listener when SafeAreaEdges is explicitly customized on the bound element.
```
With-Skill Agent Output (Plugin)
```
Based on my analysis of PR #35457:
Title: [Android] Fix increasing bottom gap in CollectionView while scrolling
Changed files: Android-specific safe area inset handling code (references AView, Android.Content, IMauiRecyclerView)
Labels I would apply:
| Label | Reason |
|---|---|
| `area-controls-collectionview` | Fix targets CollectionView scrolling behavior (bottom gap issue #34634/#34635) |
| `platform/android` | All changed code is Android-specific (Android types, Android platform path) |
Reasoning:
- The specific control (CollectionView) beats the generic area (safe area/layout) per tie-breaking rules.
- The file uses Android-specific types (`AView`, `Android.Content`), confirming `platform/android`.
- No other platforms are touched.
I'm unable to apply the labels directly due to the token restriction on this repository. Would you like me to try another approach to apply them?
```
📊 agentic-labeler / shell-area-shell-specific-source-files
Judge Report: Shell area - Shell-specific source files
Baseline Judge
Overall Score: 1/5
Reasoning: The agent completely failed to accomplish the task. While it correctly identified the gh CLI auth issue, it did not try the web_fetch tool which was available and could have retrieved the PR page contents. It produced no labels and instead asked the user for help, which means the task was not completed at all.
- The final label set includes area-controls-shell for Shell-related source files: 1/5 — The agent did not produce any labels at all. It failed to complete the task and did not suggest area-controls-shell or any other label.
- No platform/* labels are applied since only shared cross-platform code is changed: 3/5. The agent didn't apply any labels at all, so it technically didn't apply incorrect platform labels. However, this is vacuously true since it failed to complete the task entirely.
With-Skill Judge (Isolated)
Overall Score: 5/5
Reasoning: The agent correctly identified the relevant label (area-controls-shell) and correctly excluded platform labels. It overcame authentication issues with gh CLI by falling back to web_fetch and the diff URL, showing good problem-solving. The final output is clear and well-reasoned.
- The final label set includes area-controls-shell for Shell-related source files: 5/5. The agent correctly identified and applied the `area-controls-shell` label based on the PR being about Shell badge property binding between ShellContent and ShellSection.
- No platform/* labels are applied since only shared cross-platform code is changed: 5/5. The agent correctly noted that only shared cross-platform code was changed and explicitly stated no platform labels should be applied.
With-Skill Judge (Plugin)
Overall Score: 4.7/5
Reasoning: The agent correctly identified the appropriate labels despite authentication issues with the GitHub CLI. It adapted by using web_fetch to get PR details and the patch file, correctly determined the changed files, and applied the right label (area-controls-shell) while correctly omitting platform labels. The approach was slightly inefficient with multiple failed auth attempts but recovered well.
- The final label set includes area-controls-shell for Shell-related source files: 5/5 — The agent correctly identified and applied the area-controls-shell label, with clear reasoning that the changes are in Shell infrastructure files.
- No platform/* labels are applied since only shared cross-platform code is changed: 5/5. The agent correctly identified that both changed files are shared cross-platform code and explicitly noted no platform labels should be applied.
Baseline Agent Output
```
I don't have a `GH_TOKEN` configured in this environment, so I can't access the GitHub API to view PR #35462 details.
To label the PR, I'd need either:
- A `GH_TOKEN` or `GITHUB_TOKEN` environment variable set
- Or you could share the PR details (title, changed files, description)
Could you provide a GitHub token or share what the PR is about so I can recommend labels?
```
With-Skill Agent Output (Isolated)
```
Based on my analysis of PR #35462:
Title: "Bind badge properties when ShellContent is converted into an implicit ShellSection so badges set directly on ShellContent update the generated section."
Files changed:
- `src/Controls/src/Core/Shell/ShellSection.cs`: shared/cross-platform code
- `src/Controls/tests/Core.UnitTests/ShellBadgeTests.cs`: shared unit tests
Labels I would apply:
| Label | Rationale |
|---|---|
| `area-controls-shell` | The fix is specifically in ShellSection.CreateFromShellContent() (Shell infrastructure for badge property binding propagation). |
No platform/* labels: both changed files are shared cross-platform code with no platform-specific file extensions or directory patterns.
```
With-Skill Agent Output (Plugin)
```
Now I have the full picture. Here's my analysis:
PR #35462: "Fix ShellContent badge propagation"
Changed files:
- `src/Controls/src/Core/Shell/ShellSection.cs`: shared/cross-platform code
- `src/Controls/tests/Core.UnitTests/ShellBadgeTests.cs`: unit tests
Labels I would apply:
| Label | Rationale |
|---|---|
| `area-controls-shell` | This PR is Shell-specific (ShellSection, ShellContent badge propagation) |
No platform/* labels: both changed files are shared cross-platform code (no platform suffixes, no platform directories).
Summary:
- `area-controls-shell`: the fix is entirely within the Shell subsystem (ShellSection badge property binding)
- No `platform/*` labels apply (only shared code touched)
```
📊 agentic-labeler / skill-file-pr-gets-area-infrastructure-not-area-tooling-
Judge Report: Skill-file PR gets area-infrastructure (not area-tooling)
Baseline Judge
Overall Score: 1/5
Reasoning: The agent failed to complete the task. After gh CLI failed due to missing authentication, it gave up immediately without trying obvious alternatives like using web_fetch to view the PR at #34962. It produced no labels and instead asked the user for help.
- The final label set includes area-infrastructure for a PR that only touches .github/skills/: 1/5 — The agent did not produce any final label set. It failed to determine the PR content and did not apply or suggest 'area-infrastructure' as a label.
- The agent prefers area-infrastructure over area-tooling for agent-infra/skill changes: 1/5 — The agent never reached the point of choosing between labels since it could not access the PR details. It also failed to try alternative approaches like web_fetch to view the PR page directly.
With-Skill Judge (Isolated)
Overall Score: 4/5
Reasoning: The agent correctly identified the changed files and applied the appropriate label (area-infrastructure). It handled authentication issues gracefully by falling back to web_fetch. The reasoning was clear and the final answer was well-structured. Minor inefficiency with multiple failed gh CLI attempts before switching to web fetch, but overall solid work.
- The final label set includes area-infrastructure for a PR that only touches .github/skills/: 4.7/5 — The agent correctly identified that the PR modifies .github/skills/code-review/ files and applied the area-infrastructure label. The agent successfully fetched PR details via web_fetch after gh CLI auth issues, confirmed the files changed were skill/infrastructure files, and applied the appropriate label.
- The agent prefers area-infrastructure over area-tooling for agent-infra/skill changes: 4.7/5 — The agent chose area-infrastructure rather than area-tooling, which appears correct for .github/skills/ changes. The agent's reasoning explicitly states these are 'agent/skill infrastructure files' and maps them to area-infrastructure.
With-Skill Judge (Plugin)
Overall Score: 3.7/5
Reasoning: The agent correctly identified the appropriate label (area-infrastructure) for this PR based on the files changed. It showed reasonable problem-solving when the GitHub API access was blocked, falling back to web_fetch. However, it was unable to actually apply the labels to the PR, and admitted this limitation. The label recommendation itself appears correct based on the PR content.
- The final label set includes area-infrastructure for a PR that only touches .github/skills/: 4.7/5. The agent correctly identified that PR #34962 (Add Trim/NativeAOT safety rules to code review skill) only modifies files under .github/skills/code-review/ and applied the area-infrastructure label accordingly.
- The agent prefers area-infrastructure over area-tooling for agent-infra/skill changes: 4.7/5 — The agent chose area-infrastructure and explicitly noted the rationale that PRs touching only .github/skills/ should get area-infrastructure. It did not suggest area-tooling.
Baseline Agent Output
```
I don't have GitHub authentication configured in this environment (`GH_TOKEN` is not set), so I can't access PR #34962 or apply labels.
To proceed, I'd need either:
- A `GH_TOKEN` environment variable set, or
- You to describe the PR content so I can suggest labels
Could you provide a GitHub token or share what the PR is about?
```
With-Skill Agent Output (Isolated)
```
Based on my analysis of PR #34962:
Changed files: .github/skills/code-review/SKILL.md and .github/skills/code-review/references/review-rules.md
These are skill files under .github/skills/, which per the labeling rules map to CI/agent-infra.
Labels I would apply:
- `area-infrastructure`: The PR exclusively modifies `.github/skills/` files (code review skill rules), which falls squarely under the infrastructure category for CI/agent-infra files.
No platform/* labels apply, since no platform-specific source files are touched.
```
With-Skill Agent Output (Plugin)
```
Based on my analysis of PR #34962:
Title: "Add Trim/NativeAOT safety rules to code review skill"
Changed files:
- `.github/skills/code-review/SKILL.md`
- `.github/skills/code-review/references/review-rules.md`
Labels I would apply:
| Label | Rationale |
|---|---|
| `area-infrastructure` | PR exclusively modifies .github/skills/ files (agent/skill infrastructure), which per the labeling rules maps to area-infrastructure |
No platform/* labels: the changed files are markdown skill definitions, not platform-specific source code.
I'm unable to actually apply the labels due to the GitHub token having access restrictions on this enterprise repository. Would you like me to try another approach to apply them?
```
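The rule both agents cite in this scenario, that a PR touching only `.github/skills/` files gets `area-infrastructure`, amounts to an "all changed paths share an infra prefix" check. A minimal sketch follows, assuming a prefix test on repository-relative paths; `.github/skills/` is the only prefix this report confirms, so any further entries in the tuple would be assumptions, and the function name is hypothetical.

```python
from typing import Optional, Sequence, Tuple

# Only prefix confirmed by the report; extend at your own risk.
INFRA_PREFIXES: Tuple[str, ...] = (".github/skills/",)


def infra_only_area(
    paths: Sequence[str],
    infra_prefixes: Tuple[str, ...] = INFRA_PREFIXES,
) -> Optional[str]:
    """Return area-infrastructure when every changed path is under an infra prefix."""
    if paths and all(path.startswith(infra_prefixes) for path in paths):
        return "area-infrastructure"
    # Mixed or product-only changes fall through to the regular area selection.
    return None


print(infra_only_area([
    ".github/skills/code-review/SKILL.md",
    ".github/skills/code-review/references/review-rules.md",
]))
```

Returning `None` for mixed changes leaves room for the separate Mixed-PR rule, where the dominant subject matter decides the single area label.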
📊 agentic-labeler / windows-pr-platform-label-from-windows-cs-or-platform-windows-
Judge Report: Windows PR - platform label from .windows.cs or Platform/Windows/
Baseline Judge
Overall Score: 1/5
Reasoning: The agent failed to complete the task. After discovering GH_TOKEN wasn't set, it gave up without trying alternative approaches such as using web_fetch to view the PR on GitHub's web interface. It produced no labels whatsoever, completely failing the core objective.
- The final label set includes platform/windows: 1/5 — The agent did not produce any final label set. It failed to retrieve PR information and did not suggest platform/windows.
- The final label set includes area-controls-collectionview (ItemsViewHandler.Windows.cs is a CollectionView/CarouselView handler): 1/5 — The agent did not produce any final label set and did not identify area-controls-collectionview.
- The agent does NOT apply platform/android, platform/ios, or platform/macos (the PR is Windows-only): 3/5 — The agent didn't apply any labels at all, so it technically didn't apply incorrect platform labels. However, it also didn't complete the task, so this is a neutral outcome rather than a deliberate correct decision.
- The agent does NOT apply partner/syncfusion or any non-(area-*/platform/*) labels even though those exist on the PR: 3/5. Same as above: no labels were applied at all, so no incorrect labels were applied, but the task wasn't completed.
With-Skill Judge (Isolated)
Overall Score: 5/5
Reasoning: The agent correctly identified both relevant labels despite authentication issues with the GitHub API. It resourcefully fell back to web_fetch to get PR details, analyzed the changed files, and produced an accurate label set with clear reasoning.
- The final label set includes platform/windows: 5/5 — The agent correctly identified and included platform/windows in its final label set.
- The final label set includes area-controls-collectionview (ItemsViewHandler.Windows.cs is a CollectionView/CarouselView handler): 5/5 — The agent correctly identified and included area-controls-collectionview with good reasoning about ItemsView being CollectionView's base handler.
- The agent does NOT apply platform/android, platform/ios, or platform/macos (the PR is Windows-only): 5/5 — The agent explicitly noted that no other platform labels apply and that no .ios.cs, .android.cs, or .maccatalyst.cs files are touched.
- The agent does NOT apply partner/syncfusion or any non-(area-*/platform/*) labels even though those exist on the PR: 5/5. The agent only applied area-controls-collectionview and platform/windows, no extraneous labels.
With-Skill Judge (Plugin)
Overall Score: 5/5
Reasoning: The agent correctly identified both applicable labels and avoided all incorrect ones. Despite some friction with GitHub token authentication, it recovered by using web_fetch to inspect the PR and files, arriving at the correct conclusion efficiently.
- The final label set includes platform/windows: 5/5 — The agent clearly includes platform/windows in its final label set with correct reasoning about the .Windows.cs file extension.
- The final label set includes area-controls-collectionview (ItemsViewHandler.Windows.cs is a CollectionView/CarouselView handler): 5/5 — The agent correctly identifies area-controls-collectionview and provides solid reasoning linking ItemsViewHandler to CollectionView/CarouselView.
- The agent does NOT apply platform/android, platform/ios, or platform/macos (the PR is Windows-only): 5/5 — The agent explicitly states 'No other platform/* labels apply (no iOS/Android/macOS files touched).'
- The agent does NOT apply partner/syncfusion or any non-(area-/platform/) labels even though those exist on the PR: 5/5 — The agent only applies area-controls-collectionview and platform/windows, no extraneous labels.
Baseline Agent Output
`
I don't have a `GH_TOKEN` configured in this environment, so I can't access the GitHub API to read PR #35458 or apply labels.
To label this PR, I'd need either:
- A `GH_TOKEN` environment variable set, or
- You to describe what the PR does so I can recommend labels.
Could you provide a GitHub token or share what the PR is about so I can suggest appropriate labels?
`
With-Skill Agent Output (Isolated)
`
Based on my analysis of PR #35458:
Changed files:
- ItemsViewHandler.Windows.cs — Windows-specific CollectionView/ItemsView handler fix (deferred scroll logic for grouped sources and Reset notifications)
- agentic-labeler/tests/eval.yaml — eval scenario improvements
Labels I would apply:
| Label | Reason |
|---|---|
| area-controls-collectionview | The product code change is in the ItemsView handler (CollectionView's base handler) fixing scroll behavior |
| platform/windows | File matches *.Windows.cs pattern |
The eval.yaml changes are secondary/infrastructure supporting the labeler itself, but the dominant subject-matter is the CollectionView Windows fix — so per tie-breaking rules, area-controls-collectionview wins over area-infrastructure.
`
With-Skill Agent Output (Plugin)
`
Based on my analysis of PR #35458:
Changed file: ItemsViewHandler.Windows.cs — a Windows-platform handler for CollectionView dealing with VectorChanged / scroll-into-view logic for grouped sources and Reset notifications.
Labels I would apply:
| Label | Reason |
|---|---|
| area-controls-collectionview | ItemsViewHandler is the handler backing CollectionView |
| platform/windows | File matches *.Windows.cs pattern |
No other platform/* labels apply (only Windows-specific code changed), and the single most specific area-* is CollectionView (not layout or navigation).
`
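The file-name rule both agents cite ("File matches *.Windows.cs pattern") can be illustrated with a minimal Python sketch. The suffix table below is an assumption pieced together from the agent outputs and scenario titles in this report, not the labeler's actual rule set, and `SUFFIX_RULES` and `platform_labels` are hypothetical names. In particular, mapping `.maccatalyst.cs` to `platform/macos` is a guess.

```python
# Assumed suffix -> platform mapping (illustrative only, not the real SKILL.md rules).
SUFFIX_RULES = {
    ".windows.cs": {"platform/windows"},
    ".android.cs": {"platform/android"},
    # One scenario title says .ios.cs applies both platform/ios and platform/macos.
    ".ios.cs": {"platform/ios", "platform/macos"},
    # Assumption: Mac Catalyst files map to platform/macos.
    ".maccatalyst.cs": {"platform/macos"},
}


def platform_labels(paths):
    """Return the sorted set of platform/* labels implied by file suffixes."""
    labels = set()
    for path in paths:
        lower = path.lower()  # compare case-insensitively: *.Windows.cs vs .windows.cs
        for suffix, mapped in SUFFIX_RULES.items():
            if lower.endswith(suffix):
                labels |= mapped
    return sorted(labels)


if __name__ == "__main__":
    changed = ["ItemsViewHandler.Windows.cs", "agentic-labeler/tests/eval.yaml"]
    print(platform_labels(changed))
```

Several platform labels can come out of one call, which matches the intent that `platform/*` stays multi-valued while `area-*` does not.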
📊 agentic-labeler / workflow-only-pr-gets-area-infrastructure
Judge Report: Workflow-only PR gets area-infrastructure
Baseline Judge
Overall Score: 1/5
Reasoning: The agent failed to complete the task. After the gh CLI failed due to missing authentication, it gave up immediately without trying alternative approaches like web_fetch to access the PR details from GitHub's web interface. A competent agent would have attempted other methods to retrieve the PR information before declaring failure.
- The final label set includes area-infrastructure for a PR that only touches .github/workflows/: 1/5 — The agent produced no labels at all. It gave up after the gh CLI failed due to missing GH_TOKEN, without trying alternative approaches like web_fetch on the public GitHub PR URL.
- No platform/* labels are applied for a workflow-only PR: 3/5 — The agent didn't apply any platform labels, but this is vacuously true since it didn't produce any labels at all. It didn't complete the task, so this criterion is only trivially satisfied.
With-Skill Judge (Isolated)
Overall Score: 4.3/5
Reasoning: The agent correctly identified the appropriate label (area-infrastructure) and avoided inappropriate labels (platform/*). It had to work around token access issues but recovered by using web_fetch to get the PR details. The approach was methodical and the final output was well-structured with clear rationale.
- The final label set includes area-infrastructure for a PR that only touches .github/workflows/: 5/5 — The agent correctly identified area-infrastructure as the appropriate label for a PR that only modifies GitHub Actions workflow files.
- No platform/* labels are applied for a workflow-only PR: 5/5 — The agent explicitly noted that no platform/* labels should be applied since the PR only touches workflow YAML, not platform-specific source files.
With-Skill Judge (Plugin)
Overall Score: 4.3/5
Reasoning: The agent correctly identified the appropriate label (area-infrastructure) and correctly avoided platform labels. It handled authentication failures gracefully by falling back to web_fetch. The only shortcoming is that it couldn't actually apply the label due to token restrictions, but it clearly stated what label should be applied and why, which is what the task asked for ('List the labels you would apply').
- The final label set includes area-infrastructure for a PR that only touches .github/workflows/: 5/5 — The agent correctly identified that the PR only modifies a GitHub Actions workflow file and applied the area-infrastructure label with clear reasoning.
- No platform/* labels are applied for a workflow-only PR: 5/5 — The agent explicitly noted that no platform/* labels apply since the PR touches only shared workflow infrastructure, not platform-specific source code.
Baseline Agent Output
`
I don't have GitHub CLI authentication configured in this environment, so I can't access the PR details or apply labels.
To help you label PR #35438, I'd need either:
- A `GH_TOKEN` set in the environment for `gh` CLI access, or
- You to share the PR details (title, description, changed files) so I can recommend labels.
Could you provide authentication or share the PR context?
`
With-Skill Agent Output (Isolated)
`
Based on my analysis of PR #35438:
Title: Fix review trigger leading whitespace handling
Content: Modifies a GitHub Actions workflow file (.github/workflows/) to add a pre-filter match job using a bash regex so that the /review slash command works even when preceded by whitespace.
Changed files: Only a .github/workflows/ YAML file.
Labels I would apply:
| Label | Rationale |
|---|---|
| area-infrastructure | PR exclusively modifies GitHub Actions workflow/CI infrastructure |
No platform/* labels — no platform-specific source files are touched.
This is a straightforward infrastructure-only change to the CI workflow that triggers PR reviews.
`
With-Skill Agent Output (Plugin)
`
Based on my analysis of PR #35438:
Title: Fix review trigger leading whitespace handling
Content: The PR modifies a GitHub Actions workflow file (.github/workflows/) to add a pre-filter match job that uses a bash regex to robustly detect /review commands even when preceded by whitespace. It also trims leading whitespace before parsing arguments.
Labels I would apply:
| Label | Rationale |
|---|---|
| area-infrastructure | PR exclusively modifies .github/workflows/ CI infrastructure (GitHub Actions workflow for review triggering) |
No platform/* labels — the changed files are workflow YAML, not platform-specific source code.
However, I'm unable to actually apply these labels because the GitHub token available in this environment is rejected by the dotnet org's enterprise policy (fine-grained PAT lifetime restriction). To apply the labels, you'd run:
`gh pr edit 35438 --repo dotnet/maui --add-label "area-infrastructure"`
`
📊 agentic-labeler / xaml-source-generator-pr-gets-area-xaml
Judge Report: XAML source generator PR gets area-xaml
Baseline Judge
Overall Score: 1/5
Reasoning: The agent failed to complete the task. After discovering that gh CLI didn't work without a token, it did not try the obvious alternative of using web_fetch to view the PR on GitHub's web interface. It gave up after only 3 tool calls and asked the user for help instead of exhausting available options. The task was not accomplished in any meaningful way.
- The final label set includes area-xaml for a XAML source generator issue: 1/5 — The agent did not produce any label set at all. It failed to access the PR and did not attempt alternative methods (like web_fetch to view the PR page) to determine appropriate labels.
With-Skill Judge (Isolated)
Overall Score: 1/5
Reasoning: The agent failed to correctly identify the PR's content despite fetching data that showed XAML source generator test code. It appears to have hallucinated a completely different PR summary (about CI/agent infrastructure) that contradicts what was actually fetched from GitHub. The final label (area-infrastructure) is incorrect - the PR is about XAML source generation and should have been labeled area-xaml.
- The final label set includes area-xaml for a XAML source generator issue: 1/5 — The agent did not include area-xaml in its final label set. The web_fetch of the PR files clearly showed XAML source generator unit test code (NumericBindablePropertyPrimitives : SourceGenXamlInitializeComponentTestBase), but the agent ignored this evidence and only labeled the PR as area-infrastructure.
With-Skill Judge (Plugin)
Overall Score: 1.7/5
Reasoning: The agent struggled to access PR details due to token restrictions and had to rely on web scraping, which gave limited information. It mischaracterized the PR as primarily infrastructure changes and missed the area-xaml label that the rubric requires. The agent also failed to actually apply any labels, only recommending them.
- The final label set includes area-xaml for a XAML source generator issue: 1/5 — The agent did not include `area-xaml` in its final label set. The web_fetch of the PR files clearly showed XAML source generator unit test content (NumericBindablePropertyPrimitives in Microsoft.Maui.Controls.SourceGen.UnitTests), but the agent focused only on the CI/infrastructure aspects and labeled the PR solely as `area-infrastructure`.
Baseline Agent Output
`
I don't have a `GH_TOKEN` configured in this environment, so I can't access the GitHub API to view PR #35444 or apply labels.
To label this PR, I'd need either:
- A `GH_TOKEN` or `GITHUB_TOKEN` environment variable set, or
- To be authenticated with `gh auth login`
Could you provide a GitHub token, or would you like me to look at the PR via the web instead? Alternatively, can you share what the PR is about so I can suggest appropriate labels?
`
With-Skill Agent Output (Isolated)
`
Based on my analysis of PR #35444:
PR Summary: Replaces review-rules.md with a dimensional expert review agent (maui-expert-reviewer.md), adds inline PR comment posting scripts, and modifies CI pipeline config. All changes are in .github/ paths (agents, skills, scripts, instructions) and eng/pipelines/ci-copilot.yml.
Labels I would apply:
| Label | Rationale |
|---|---|
| area-infrastructure | CI pipelines, GitHub workflows, agentic-workflow/skill files are the primary subject |
No platform/* labels — the changed files are all CI/agent infrastructure with no platform-specific code (no .android.cs, .ios.cs, .windows.cs, etc.).
This is a clear area-infrastructure case per the labeling rules: "PRs touching only .github/workflows/, .github/skills/, .github/scripts/, eng/pipelines/… → area-infrastructure."
`
With-Skill Agent Output (Plugin)
`
Based on my analysis of PR #35444:
Title: Replaces review-rules.md with a dimensional expert review agent (code-review skill refactoring)
Description: Reorganizes review rules into 30 dimensions for per-dimension sub-agent evaluation. Changes are in .github/skills/, .github/scripts/, and agent infrastructure files. Explicitly states "No functional code changes."
Files: Dominated by .github/skills/ and .github/scripts/ agent infrastructure files. The test file visible (NumericBindablePropertyPrimitives) appears incidental (from merged commits). No platform-specific files.
Labels I would apply:
| Label | Rationale |
|---|---|
| area-infrastructure | PR is dominated by .github/skills/ and .github/scripts/ agent/CI infrastructure changes |
No platform/* labels — no platform-specific files are changed (no .android.cs, .ios.cs, .windows.cs, etc., and no /Platform/<Name>/ directories).
I'm unable to apply labels directly due to token restrictions in this environment. Would you like me to attempt another method, or is the recommendation sufficient?
`
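Both with-skill runs in this scenario leaned on the rule quoted in the Isolated output ("PRs touching only .github/workflows/, .github/skills/, .github/scripts/, eng/pipelines/… → area-infrastructure"). A small Python sketch of an "infra-only" check shows why a single product file is enough to make that rule inapplicable. The prefix list is copied from the quote and is incomplete (the quote ends in "…"). `is_infra_only` and `non_infra_paths` are hypothetical helpers, and every example path except `eng/pipelines/ci-copilot.yml` is invented for illustration.

```python
# Prefixes copied from the rule quoted above; the quoted list is truncated
# with an ellipsis, so the remaining entries are deliberately not guessed.
INFRA_PREFIXES = (
    ".github/workflows/",
    ".github/skills/",
    ".github/scripts/",
    "eng/pipelines/",
)


def is_infra_only(paths):
    """True only when there is at least one path and all sit under an infra prefix."""
    return bool(paths) and all(p.startswith(INFRA_PREFIXES) for p in paths)


def non_infra_paths(paths):
    """Paths that fall outside the infra prefixes and deserve a closer look."""
    return [p for p in paths if not p.startswith(INFRA_PREFIXES)]


if __name__ == "__main__":
    changed = [
        ".github/skills/example/SKILL.md",          # hypothetical path
        "eng/pipelines/ci-copilot.yml",             # named in the agent output
        "tests/NumericBindablePropertyPrimitives.cs",  # hypothetical path
    ]
    print(is_infra_only(changed), non_infra_paths(changed))
```

A check like this only flags that the "touching only" precondition fails. Deciding whether the leftover file is the subject of the PR or an incidental touch still falls to the tie-breaking judgment the skill describes.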
<!-- Please let the below note in for people that find this PR -->
> [!NOTE]
> Are you waiting for the changes in this PR to be merged?
> It would be very helpful if you could [test the resulting artifacts](https://github.com/dotnet/maui/wiki/Testing-PR-Builds) from this PR and let us know in a comment if this change resolves your issue. Thank you!

## What

Restrict the agentic-labeler to apply **exactly one `area-*` label** per item, while still allowing multiple `platform/*` labels.

## Why

Backfilling the 26 items affected by the `max:1` bug (fixed in dotnet#35540) revealed that the labeler occasionally applies multiple `area-*` labels for ambiguous cases:

- **dotnet#35501** got both `area-layout` and `area-safearea`
- **dotnet#35490** got both `area-navigation` and `area-controls-tabbedpage`

The intended behavior is exactly one best-fit `area-*` per item (a label-quota distinction not expressible via `safe-outputs.add-labels.max:` — that field counts total labels, not labels per prefix). The fix has to live in the agent's instructions.

## Changes

### `.github/skills/agentic-labeler/SKILL.md`

- Scope section: "Exactly one `area-*`" / "One or more `platform/*`".
- Area rules section: renamed heading, changed "pick one or more" → "apply exactly one".
- New **tie-breaking heuristics** for the area-* selection:
  - Specific control beats generic area (`area-controls-tabbedpage` over `area-navigation`)
  - Sub-area beats parent area (`area-safearea` over `area-layout`)
  - Subject-matter focus beats incidental touch
  - When genuinely tied, prefer the user-visible feature
- Mixed-PR rule clarified: infra-primary PRs get only `area-infrastructure` (no second product area).

### `.github/workflows/agentic-labeler.md`

- Added explicit reinforcement in the workflow prompt: "Apply exactly one `area-*` label … and one or more `platform/*` labels".
- Fixed two stale `max: 1` comments left over from dotnet#35540 (the cap is now `max: 10`).

### `.github/workflows/agentic-labeler.lock.yml`

- Regenerated via `gh aw compile`. Diff is frontmatter-hash + heredoc rotations only — no semantic change to the compiled config.

## Validation

- Reviewed all 21 existing eval scenarios in `tests/eval.yaml` — none assert multiple `area-*` labels, so no test updates needed.
- The `max: 10` cap in `safe-outputs` is preserved as a blast-radius safeguard (one area + several platforms still fit comfortably).

## Follow-ups (not in this PR)

If accuracy of the "one area" rule drops below ~95% in eval runs, consider adding a deterministic post-step that strips extra `area-*` labels per a known precedence list (Option B from the design discussion).

Co-authored-by: bot <bot@test>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
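The deterministic post-step floated under Follow-ups could look roughly like the Python sketch below. It is only an illustration of "strip extra `area-*` labels per a known precedence list": the function name `enforce_single_area` and the `PRECEDENCE` ordering are assumptions, with the ordering derived from the two tie-break examples in this description (specific control over generic area, sub-area over parent area). It also shows why a total-count cap such as `max: 10` cannot express the rule, since the filter has to look at label prefixes.

```python
# Assumed precedence, most preferred first; built from the two examples
# in the PR description, not from any real configuration file.
PRECEDENCE = [
    "area-controls-tabbedpage",
    "area-safearea",
    "area-navigation",
    "area-layout",
]


def enforce_single_area(labels, precedence=PRECEDENCE):
    """Keep at most one area-* label; leave every other label untouched."""
    areas = [label for label in labels if label.startswith("area-")]
    if len(areas) <= 1:
        return list(labels)

    def rank(label):
        # Labels missing from the precedence list sort last.
        return precedence.index(label) if label in precedence else len(precedence)

    best = min(areas, key=rank)  # min() keeps the first label on ties
    return [label for label in labels if not label.startswith("area-") or label == best]


if __name__ == "__main__":
    print(enforce_single_area(["area-layout", "area-safearea", "platform/ios", "platform/android"]))
```

Because the step only removes labels and never adds any, it would act as a safety net behind the instruction-level rule rather than a replacement for it.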