Adding status badges by scbedd · Pull Request #2 · Azure/azure-sdk-tools

scbedd · 2019-02-15T01:03:27Z

No description provided.

scbedd · 2019-02-15T01:08:27Z

Why are checks not appearing here?

…#2259) This is attempt #2. Failure to patch is blocking the other PR. If I can just simplify the commits and have it work, that's definitely the easiest way.

… Vally (#15124) (#15811)

* Scaffold Azure.Sdk.Tools.Vally tool-scenario eval suite (#15124)
  Adds a new Vally eval suite under tools/azsdk-cli/Azure.Sdk.Tools.Vally/ for MCP tool / scenario evaluations, replacing the deleted Azure.Sdk.Tools.Cli.Benchmarks project (#15697).
  - README documents project intent, layout, local run instructions, and how to add a new scenario.
  - .vally.yaml wires the azsdk-mcp environment (stdio dotnet run against Azure.Sdk.Tools.Cli) and defines 'typespec' and 'all' suites.
  - evals/check-public-repo.eval.yaml is the first ported scenario (from the deleted CheckPublicRepoScenario): verifies the agent invokes azsdk_typespec_check_project_in_public_repo for a public-repo check prompt. Lints clean via 'vally lint --eval-spec'.
  - fixtures/.gitkeep reserves the per-scenario fixtures layout.
  Remaining scenarios from the deleted benchmark are tracked as a checklist in the project README and in #15124.

* Port remaining 9 benchmark scenarios to Vally (#15124)
  Adds eval YAMLs for every scenario that was deleted from Azure.Sdk.Tools.Cli.Benchmarks in #15697:
  - check-public-repo-then-validate
  - validate-typespec
  - typespec-generation-step02
  - get-modified-typespec-projects (stub; needs git-repo fixture / setup hook)
  - add-arm-resource (stub; needs fixtures + npx tsp compile post-check)
  - create-release-plan
  - link-namespace-approval-issue
  - get-pr-link-current-branch
  - check-sdk-generation-status
  Each eval uses the built-in tool-calls grader for presence checks; the original benchmark's argument/order/forbidden/optional assertions are captured in prompt text + inline TODOs (require custom graders or upstream Vally support, documented in README). Also adds release-plan/github/pipeline suites to .vally.yaml. All 10 evals pass 'vally lint --eval-spec'.

* Add rename-client-property stub eval to Vally suite (#15124)
  Ports the deleted RenameClientPropertyScenario as a tool-calls-only stub. Full expected-diff grading + sparse-clone setup hook are tracked as follow-ups in the README.

* Fix tool name prefix in graders, timeout format, expand README

* Reorganize evals into scenarios/ and triggers/; port trigger evals from #15183
  - Move 11 multi-step scenario evals to evals/scenarios/
  - Port 9 per-tool trigger evals from jeo02/migrate-evaluations-to-vally (PR #15183) to evals/triggers/, stripped azure-sdk-mcp- prefix from graders to match bare MCP tool names
  - Port Validate-EvalTools.ps1 to scripts/, retargeted at evals/triggers/ with bare-name regex
  - Update .vally.yaml suites for new layout (scenarios, triggers, all)
  - Update README to document the split and per-trigger-file tool coverage
  - Add .gitignore for vally-results/ and results/

* update the config and use gpt-5.4 model
* add disallowed

* Vally: restructure evals into unit/integration/e2e test pyramid
  Replace per-area folders (scenarios/, triggers/) with tier-based folders. Feature area moves to a YAML tag, enabling tag-filtered suites. Add composite suites (pr-gate, nightly) and area-filtered suites in .vally.yaml. Update Validate-EvalTools.ps1 to scan evals/unit for triggers-*.eval.yaml. Refresh README and Run-LiveEvals.ps1 paths.

* Vally: remove Run-LiveEvals.ps1 (local-only test wrapper)
  Drop the local-only convenience wrapper and refer directly to evals/setup/ensure-specs-clone.ps1 in docs and YAML comments. Users prime the spec clone manually and invoke 'vally eval --suite e2e'.

* some docs and test e2e one
* update docs
* udpate design
* update with skill evals
* reorg based on the design
* remove the duplicates
* add new scenarios
* update the doc
* update doc
* update names

* Vally: align release-planner mock stimuli with live e2e pattern
  All 5 release-planner mock stimuli now use environment.git worktree pointing at the per-user azure-rest-api-specs cache (matching the live e2e fixture), plus a structured e2e-style prompt that supplies the Contoso fixture IDs the mock handlers expect (TypeSpec project, service/product tree IDs, work-item ID 29262). Also document the --skill-dir requirement and worker-cap caveat in README, and fix one stale path in .vally.yaml comment.

* update doc

* Vally: fix MCP boot race + drop misconfigured grader (#15948)
  - Launch pre-built DLLs via 'dotnet <dll>' in both .vally.yaml files instead of 'dotnet run', so N parallel workers no longer race on Roslyn's exclusive write lock for the output DLL.
  - Add 'Build MCP servers' step to eng/pipelines/skill-eval.yml so the CI runner has the DLLs ready before vally starts.
  - Drop the skill-invocation grader from generate-sdk-for-existing-release-plan (no preflight reasoning step required; tools-only).
  - Strip 'I'm in a checkout of azure-rest-api-specs.' preamble from prompts; the worktree already provides that context.
  - Remove stray '// tools skills response' artifact in live release-planner.eval.yaml.
  - README: document 'dotnet build' as a prereq; rewrite workers warning.
  Validated: scenarios-mock at --workers 6 -> 5/5 stimuli pass, 0 race hits, ~4 min.

* update readme for runing steps

* Vally: align mock release-planner grader with live + deterministic 'not found' lookup
  The create-release-plan-and-generate-sdk mock stimulus required the agent to call azsdk_update_sdk_details_in_release_plan, but neither the prompt nor the azsdk-common-prepare-release-plan skill's create flow asks for it. The agent correctly skipped the tool, and the grader flapped. The dedicated update-sdk-details-in-release-plan stimulus already covers that tool with an explicit prompt. Drop it from the create+generate grader so mock matches the live release-planner-e2e contract (create / get / generate / link).
  Also patch GetReleasePlanForSpecPrHandler to return a deterministic 'not found' response (ReleasePlanDetails = null). The mock previously returned a 'plan exists' result for any spec PR, pushing the agent down the update path instead of the create path that the stimulus exercises. Stimuli that target an existing plan pass the work-item ID directly and call azsdk_get_release_plan, so this is safe.

* update eval yaml

* Address PR #15811 review: fix stale paths, exit codes, build output, cache portability
  - README/eval comments: evals/unit -> evals/tools, evals/scenarios -> evals/workflow-scenarios (Copilot C1/C5)
  - Validate-EvalTools.ps1: default EvalPath -> evals/tools; return 1 -> exit 1 so CI fails loudly (Copilot C2/C3)
  - MCP build output: dotnet build -o artifacts/mcp/{cli,mock}; pipeline switched to Release; .vally.yaml no longer hardcodes Debug/net8.0 (Praveen #1/#2)
  - ensure-specs-clone.ps1 + workflow evals: repo-relative artifacts/specs-cache path instead of C:/Users/gaoh; Vally resolves it relative to the eval file so it works for all contributors + CI (Copilot C6/C7, Praveen #4)
  - add-arm-resource/rename-client-property: comment clarifying 'edit' is the Copilot SDK built-in file tool, not an MCP tool (Praveen #5)

* Refactor Vally tool evals: rename triggers-* to prompt-to-tool-*, consolidate standalone single-tool evals
  - Rename evals/tools/triggers-*.eval.yaml to prompt-to-tool-*.eval.yaml (Praveen review #6)
  - Consolidate 7 standalone single-tool scenario evals into the matching namespace files as full-context checks (check-public-repo, check-sdk-generation-status, create-release-plan, get-modified-typespec-projects, get-pr-link-current-branch, link-namespace-approval-issue, validate-typespec)
  - Keep add-arm-resource.eval.yaml standalone (produces a file edit, not a pure tool trigger)
  - Switch tool evals to gpt-5.4 and add explicit 'use the available Azure SDK MCP tools' steering plus concrete grounding to bare trigger prompts so they invoke the MCP tool reliably
  - Update README evals/tools section and Validate-EvalTools.ps1 to the new file names

* Remove agent-eval-strategy design spec from PR (now reviewed standalone in #15918)
* Drop flaky edit-tool assertion from add-arm-resource eval
* remove script

* Stabilize flaky tool-scenario prompts and add README command cookbook
  Ground 13 previously-flaky prompts with concrete IDs/paths so they route deterministically to the intended MCP tool; make the mock check-service-label handler convention-driven (status derived from the requested serviceLabel); document common vally invocation recipes in the README.

* Fix outdated command examples in Vally README
  Replace references to consolidated/non-existent eval files (create-release-plan, check-public-repo, link-namespace-approval-issue) with the real prompt-to-tool-* and workflow-scenario files; correct the default output path to ./vally-results/<timestamp>/; fix the cookbook results.jsonl parser to locate the newest timestamped run; add the missing release-planner-workflows mock scenario to the index.

* Fix invalid prompt-grader config in live release-planner eval
  The prompt (LLM-judge) grader schema uses 'prompt' for the rubric text, not 'rubric'. Rename the field and add 'scoring: binary' (the rubric is pass/fail) so the spec validates.
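
The README fix mentioned in the commit message above (a results.jsonl parser that locates the newest timestamped run under ./vally-results/<timestamp>/) could look roughly like the sketch below. This is not the code from the README. It assumes that run directory names sort lexicographically in time order and that results.jsonl holds one JSON object per line; the helper names and the record fields in the demo are hypothetical.

```python
import json
import tempfile
from pathlib import Path


def newest_run_dir(results_root):
    """Return the run directory whose name sorts last under results_root.

    Assumption: run directories are named with sortable timestamps, so the
    lexicographically greatest name is the newest run.
    """
    runs = [p for p in Path(results_root).iterdir() if p.is_dir()]
    if not runs:
        raise FileNotFoundError(f"no run directories under {results_root}")
    return max(runs, key=lambda p: p.name)


def load_results(results_root):
    """Parse results.jsonl from the newest run, skipping blank lines."""
    path = newest_run_dir(results_root) / "results.jsonl"
    with path.open(encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]


if __name__ == "__main__":
    # Demo against a throwaway directory; the 'stimulus' and 'passed'
    # fields are made up for illustration only.
    with tempfile.TemporaryDirectory() as tmp:
        run = Path(tmp) / "20260101-120000"
        run.mkdir()
        record = {"stimulus": "example", "passed": True}
        (run / "results.jsonl").write_text(json.dumps(record) + "\n", encoding="utf-8")
        print(load_results(tmp))
```

Picking the newest directory by name rather than by modification time keeps the result stable when an older run directory is touched after the fact, but it only holds if the timestamp format is actually sortable.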

status badges

eaf304e

removing

b3be1d4

scbedd merged commit e0a515b into Azure:master Feb 15, 2019

arpanlaha added a commit to arpanlaha/azure-sdk-tools that referenced this pull request Jun 14, 2019

Merge pull request Azure#2 from arpanlaha/ts

12b4a43

Initial processor testing

weshaggard mentioned this pull request May 5, 2020

Output script name and args for debugging purposes #569

Merged

richardpark-msft mentioned this pull request Dec 11, 2020

[Javascript smoke tests] Samples smoke tests need more customization #1277

Closed

mitchdenny mentioned this pull request Apr 26, 2021

Bot not applying the right label sometimes for PRs #1099

Closed

scbedd mentioned this pull request Nov 11, 2021

Adding Redirection of nohup-ed tool runs so we can dump the log later #2259

Merged

JoshLove-msft mentioned this pull request Dec 1, 2021

Downgrade to netcoreapp3.1 #2354

Merged

akning-ms mentioned this pull request Mar 23, 2022

Multiple failures in azure-rest-api-specs-pr pipelines #2937

Closed

praveenkuttappan mentioned this pull request Dec 8, 2022

Create API review for CADL from CADL pull request #4924

Closed

This was referenced Jan 19, 2023

Management API Readiness - feedback from experience with service partner and bugs/function not working #5171

Closed

CPEX launch criteria attestation automation requirements #5202

Closed

praveenkuttappan mentioned this pull request Feb 22, 2023

APIView - Orignal and code file are missing for existing revision #5545

Closed

mikeharder mentioned this pull request Apr 5, 2023

TypeSpec Validation PR pipeline check flags issues unrelated to the current PR #5590

Closed

rkmanda mentioned this pull request Jan 12, 2024

[PR Workflow] Remove ARM PR assignee and update guidance #7368

Closed

ladonnaq mentioned this pull request Jun 5, 2024

Improvements to the "create a release plan" experience to ensure process adherence, automate decision-making, and enhance data integrity #6262

Closed

9 tasks

scbedd mentioned this pull request Jun 5, 2024

Add Unified Pipeline APIView Release #8383

Merged

wanlwanl added a commit to albertxavier100/azure-sdk-tools that referenced this pull request Jun 6, 2025

Merge pull request Azure#2 from wanlwanl/wanl/resolve-issue-foud-in-a…

6166380

…ccuracy-test-1 Wanl/resolve issue foud in accuracy test 1

raych1 mentioned this pull request Jul 25, 2025

Review the TSV validation suppression strategy #11342

Open

Copilot AI mentioned this pull request Aug 6, 2025

[azsdk-cli] Add compile-time analyzer to enforce Tool method return types #11565

Merged

chunyu3 pushed a commit to chunyu3/azure-sdk-tools that referenced this pull request Sep 23, 2025

Merge pull request Azure#2 from chunyu3/chatbotmcp

482c2ed

remove sources and TopK parameter

josefree mentioned this pull request Sep 24, 2025

[Release Planner]: Need to update SDK PR after service update the spec #12200

Closed

This was referenced Apr 21, 2026

Extend azsdk_typespec_generate_authoring_plan with SDK breaking change detection #15069

Open

SDK Breaking change detect tool spec #15248

Merged

scbedd merged 2 commits into Azure:master from scbedd:status-badges
