Extend azsdk_typespec_generate_authoring_plan with SDK breaking change detection #15077

Closed. samvaity wants to merge 26 commits into main from savaity/sdk-breaking-patterns

samvaity (Member) commented Apr 10, 2026

Summary

Extends the TypeSpec authoring plan tool (azsdk_typespec_generate_authoring_plan) to detect SDK breaking changes during TypeSpec authoring — before merge.

Today: Write TypeSpec → merge → SDK generation fails or produces breaking changes → weeks of back-and-forth
After: Write TypeSpec → authoring plan warns about SDK impact → dev applies client.tsp mitigations in the same session

What This PR Does

  1. New: eng/common/knowledge/sdk-breaking-patterns.md (539 lines)

    • 20 breaking change patterns with detection rules
    • Per-language impact matrix (Java/.NET/Python/Go)
    • Language-scoped client.tsp mitigations with concrete code examples
    • Covers simple (model rename) to complex (enum split) cases
  2. Updated: TypeSpecAuthoringTool.cs (+1 line)

    • Added SDK IMPACT warning description to the MCP tool metadata
    • The tool's existing single HTTP call to the KB service is unchanged
  3. Updated: Knowledge sync config + QA bot prompts

    • knowledge-config.json: Registers sdk-breaking-patterns.md for KB indexing
    • qa.md / prompt.go: Updated system prompts to include SDK breaking change context

How It Works

The authoring plan tool already sends one HTTP POST to the KB service (IAzureSdkKnowledgeBaseService). Breaking change patterns are delivered via server-side RAG retrieval — the KB service indexes sdk-breaking-patterns.md and retrieves relevant patterns when the user's request matches. Zero additional client-side calls. No SDK repo needed — works entirely from TypeSpec changes + patterns.
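
To make the retrieval step concrete, here is a minimal sketch of how a pattern catalog could be matched against an authoring request. It is an illustration only: the KB service's real indexing and ranking are not described in this PR, so the keyword-overlap scoring, the `Pattern` type, and the sample entries below are all assumptions rather than the service's actual behavior.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Pattern:
    """One catalog entry. Hypothetical shape, not the real KB schema."""
    name: str
    keywords: frozenset


# Illustrative entries only; the real catalog lives in sdk-breaking-patterns.md.
CATALOG = [
    Pattern("Enum changed to extensible union", frozenset({"enum", "union", "extensible"})),
    Pattern("Model renamed", frozenset({"model", "rename", "renamed"})),
    Pattern("Client name changed", frozenset({"client", "name", "rename"})),
]


def tokenize(text: str) -> set:
    """Lowercase the text and split it into alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(request: str, catalog=CATALOG, min_overlap: int = 2) -> list:
    """Return names of patterns sharing at least `min_overlap` keywords with the request.

    Stand-in for server-side RAG retrieval: a real service would likely use
    embeddings, but the contract is the same (request in, relevant patterns out).
    """
    words = tokenize(request)
    scored = [(len(p.keywords & words), p.name) for p in catalog]
    # Highest overlap first; drop weak matches.
    return [name for score, name in sorted(scored, reverse=True) if score >= min_overlap]


if __name__ == "__main__":
    print(retrieve("Change the DocumentFormulaKind enum to an extensible union"))
    print(retrieve("Add a new tracked resource with two operations"))
```

In this sketch a request that only adds new definitions matches nothing, so no warning text would be attached, while the matching itself stays on the server side of the single request.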

This is Layer 1 (authoring-time detection) of a multi-layer approach:

| Layer | When | Tool | What it catches |
| --- | --- | --- | --- |
| 1. Pattern-based (this PR) | Authoring time | azsdk_typespec_generate_authoring_plan via KB/RAG | Known SDK-breaking patterns; zero overhead |
| 2. Build errors (existing) | SDK generation time | generate-sdk-locally skill orchestration | Compilation-breaking changes in packages with customizations |
| 3. Changelog-based (future) | Pipeline time | azsdk_sdk_breaking_change_detect (proposed) | ALL breaking changes including behavioral and mgmt packages |

Layer 1 is proactive and free. Layer 3 is the comprehensive safety net. They complement each other.

Key Design Decisions

  • KB/RAG over local file loading — Patterns reach the AI via server-side retrieval, not client-side file loading. This was a deliberate choice (Crystal removed the earlier TryLoadBreakingPatternsAsync approach).
  • Zero overhead — Same single HTTP call. No additional computation, git operations, or compilation steps.
  • Self-filtering — Patterns only match changes to existing TypeSpec (renames, removals, type splits). New-from-scratch specs don't trigger warnings.
  • Soft rollout — Warnings + mitigation plan first. client.tsp mitigations are standard practice.

Related

Catalogs 10 known TypeSpec change patterns that cause SDK breaking changes:
- Per-language impact matrix (Java/.NET/Python/Go)
- Language-scoped client.tsp mitigations with concrete code examples
- Detection sources (TypeSpec diff vs SDK changelog)
- Complex enum split case with multi-step decorator guidance
- Language-specific breaking change policies

Used by azsdk_typespec_generate_authoring_plan via RAG to warn developers
about SDK impact during TypeSpec authoring, before merge.

Ref: #15069, #13972, #14675

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@github-actions github-actions Bot added the azsdk-cli Issues related to Azure/azure-sdk-tools::tools/azsdk-cli label Apr 10, 2026
azure-sdk (Collaborator) commented

The following pipelines have been queued for testing:
java - template
java - template - tests
js - template
net - template
net - template - tests
python - template
python - template - tests
You can sign off on the approval gate to test the release stage of each pipeline.
See eng/common workflow

…e detection

- Load sdk-breaking-patterns.md from eng/common/knowledge/ and inject as
  additional context so the AI agent includes SDK IMPACT warnings when
  planned TypeSpec changes match known breaking patterns
- Add optional sdkChangelog parameter for deeper detection from SDK changelog
- Add IGitHelper dependency for repo root discovery (same pattern as
  TypeSpecCustomizationService)
- Update tool description to mention SDK breaking change detection
- Build succeeds with 0 errors

Ref: #15069, #13972, #14675

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@samvaity samvaity force-pushed the savaity/sdk-breaking-patterns branch from 6e4401c to cb61d4e Compare April 10, 2026 18:50
The most common failure in enum split mitigation is forgetting to rename
the original enum out of the way before creating a replacement with the
same @@clientName. This was the exact gap found in Storage testing
(Issue #14675) — tool got steps 2-4 right but missed step 1.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

… Java, Python, Go, JS

Added 10 new patterns (total 20) sourced from official per-language breaking
change guides:
- Enum value name changes (numeric names, Pattern 3)
- Client name changes (Pattern 13)
- Resource base type changes, .NET-specific (Pattern 14)
- WirePathAttribute missing, .NET MPG-specific (Pattern 15)
- Common types upgrade - accept (Pattern 16)
- Unreferenced models removed - accept (Pattern 17)
- Multi-level unflattened properties, Python-specific (Pattern 18)
- Property name conflicts with base methods, Python-specific (Pattern 19)
- LRO/Paging operation type changes, Go-specific (Pattern 20)
- Enhanced operation signature pattern with Python @@override example (Pattern 10)

Enhanced Language-Specific Breaking Change Policies section with per-language
references, customization mechanisms, and known accept-patterns.

Key additions:
- Go: many breaking changes CANNOT be resolved via client.tsp
- .NET: @@markAsPageable, @@hierarchyBuilding, WirePathAttribute via tspconfig
- Python: @@override for parameter reorder, accept patterns for common types
- JS: three-way merge workflow, TypeSpec-first recommendation
- Per-language changelog pattern examples for each pattern

Sources:
- .NET: azure-sdk-for-net/.github/skills/mitigate-breaking-changes/SKILL.md
- Java: azure-sdk-for-java/docs/contributor/typespec-quickstart.md
- Python: azure-sdk-for-python/doc/dev/mgmt/sdk-breaking-changes-guide.md
- Go: azure-sdk-for-go/documentation/development/breaking-changes/sdk-breaking-changes-guide.md
- JS: aka.ms/azsdk/js/customization

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

…plan

Breaking change detection via SDK changelog will be handled by the separate
azsdk_package_detect_breaking_change tool (Crystal's PR #15248) in the
CI/release pipeline stage. The authoring plan tool focuses on pattern-based
detection using sdk-breaking-patterns.md during TypeSpec authoring.

Removed:
- sdkChangelog parameter from GenerateTypeSpecAuthoringPlan MCP tool
- --sdk-changelog CLI option
- Changelog injection into AdditionalInfo
- Changelog reference in tool description

Kept:
- TryLoadBreakingPatternsAsync() for loading sdk-breaking-patterns.md
- IGitHelper dependency for repo root discovery
- SDK breaking change detection via pattern matching (no changelog needed)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

@chunyu3 chunyu3 force-pushed the savaity/sdk-breaking-patterns branch from bb1a9a2 to ffd22e0 Compare April 24, 2026 10:01

haolingdong-msft (Member) commented Apr 30, 2026

Hey @samvaity and @chunyu3, thanks for the PR! While I like detecting breaking changes earlier, after discussing with the Shanghai side in the scrum meeting, we have some concerns about putting breaking change detection and mitigation into each individual authoring task. /cc @ArthurMa1978 @josefree @lirenhe
Reasons:

  1. It can be incorrect. For example, a user enters a prompt to create a tracked resource, then enters a second prompt to change it to an extension resource. The second prompt can be flagged as a breaking change because an extension resource lacks two properties that a tracked resource has. What we need to do is compare the code diff against the latest stable version, not only the code change in the user's current prompt/task.
  2. It's a bit heavy to integrate this into every authoring prompt/task, and it can impact performance.
  3. We should be careful about providing a breaking change mitigation plan at the authoring stage, as we don't want to help users make breaking changes and then mitigate them.
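
The first concern can be illustrated with a small sketch. It models a resource as a set of property names and compares two ways of detecting removals: step by step (each prompt against the previous state) and against a baseline (final state against the latest stable version). The property names and the `removed_properties` helper are hypothetical, chosen only to mirror the tracked-to-extension example above.

```python
def removed_properties(before: set, after: set) -> set:
    """Properties present before but missing after; removals are the breaking signal here."""
    return before - after


# Latest stable version: the resource does not exist yet.
stable: set = set()

# Prompt 1 creates a tracked resource (hypothetical property names).
after_prompt_1 = {"id", "name", "type", "location", "tags"}

# Prompt 2 converts it to an extension resource, which has no location or tags.
after_prompt_2 = {"id", "name", "type"}

# Per-prompt detection compares each step with the previous one.
per_prompt = removed_properties(after_prompt_1, after_prompt_2)

# Baseline detection compares the final state with the latest stable version.
against_stable = removed_properties(stable, after_prompt_2)

if __name__ == "__main__":
    print("per-prompt removals:", sorted(per_prompt))
    print("removals vs. stable:", sorted(against_stable))
```

The per-prompt comparison reports two removals even though nothing that shipped has changed, while the comparison against the stable baseline reports none.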


## Pattern Catalog

### 1. Enum Changed to Extensible Union

Member

We’ve discussed this before: on the TypeSpec side, we should not allow this kind of change. Instead of modifying an existing API, we should use version decorators to deprecate the old version and introduce a new one.
So this pattern should be invalid and not allowed.

Member

In this case, yes, we can use versioning decorators to ensure that the extensible union only starts from a specific version, as follows.
Before:

enum DocumentFormulaKind {
  @doc("A formula embedded within the content of a paragraph.")
  inline: "inline",

  @doc("A formula in display mode that takes up an entire line.")
  display: "display",
}

after:

union DocumentFormulaKind {
  @doc("A formula embedded within the content of a paragraph.")
  inline: "inline",

  @doc("A formula in display mode that takes up an entire line.")
  display: "display",

  @added(Versions.v2026_01_30)
  string,
}

The pattern is still enum -> union.

Member Author

Addressed in 6245127 — added a note that this change should use version decorators (@added) to introduce the union in a new API version. Also updated .NET impact to ❌ Breaking with a note that @@alternateType may not fully resolve it for .NET and customization code (partial classes) may be needed.

Comment thread eng/common/knowledge/sdk-breaking-patterns.md Outdated

**Per-Language Impact:**
- **Java:** ❌ Breaking — enum ordinal values change, deserialization breaks
- **.NET:** ⚠️ May break — strong typing affected, but extensible enums are handled

Member

Breaking for .NET, and it can't be fixed by client.tsp.


chunyu3 (Member) commented Jun 2, 2026

I re-ran the benchmark with and without the azsdk_typespec_generate_authoring_plan extension, using the latest vally on a company desktop:

We ran the existing 21 authoring tasks using two versions of the authoring plan—one with the extension and one without. The time cost of invoking the MCP tool shows no significant difference between the two. Extending azsdk_typespec_generate_authoring_plan with SDK breaking change detection does not introduce any noticeable performance overhead.

| Testcase | Expand Time (s) with-extend | Expand Time (s) without-extend |
| --- | --- | --- |
| 001001-version-spread-property | 32.753 | 25 |
| 001002-version-default-value | 26.144 | 24 |
| 001003-version-required-to-optional | 26.093 | 24 |
| 001004-version-property-decorator | 39.404 | 27 |
| 001011-version-model-property-renamed | 29.238 | 23 |
| 001012-version-operation-return-type-changed | 34.963 | 30 |
| 001013-version-model-property-type-changed | 13.819 | 22 |
| 002001-ARM-change-resource-type | 36.379 | 38 |
| 002002-ARM-define-extension-resource | 37.636 | 38 |
| 002003-ARM-define-full-update-operation | 32.663 | 32 |
| 002004-ARM-define-extension-resource | 41.764 | 38 |
| 002005-ARM-define-the-resource | 32.181 | 43 |
| 002006-ARM-define-child-resource | 29.583 | 41 |
| 002007-ARM-define-custom-action | 40.406 | 34 |
| 002008-ARM-add-parameters | 35.683 | 39 |
| 002009-arm-add-patch-operation-to-resource | 38.328 | 38 |
| 002010-arm-action-sync-operation | 8.860 | 29 |
| 003001-arm-action-lro | 32.667 | 29 |
| 003002-arm-modify-response | 27.785 | 28 |
| 004001-decorate-mgmt-resource-name-parameter | 22.950 | 24 |
| 004002-decorate-length-constrains-on-array-item | 23.066 | 20 |
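
As a quick check on the numbers above, the following sketch computes the mean of each column and the per-test difference. The values are copied from the table; the variable names are illustrative, and since there is a single run per configuration, this says nothing about run-to-run variance.

```python
from statistics import mean

# (with-extend, without-extend) expand times in seconds, in table order.
TIMES = [
    (32.753, 25), (26.144, 24), (26.093, 24), (39.404, 27), (29.238, 23),
    (34.963, 30), (13.819, 22), (36.379, 38), (37.636, 38), (32.663, 32),
    (41.764, 38), (32.181, 43), (29.583, 41), (40.406, 34), (35.683, 39),
    (38.328, 38), (8.860, 29), (32.667, 29), (27.785, 28), (22.950, 24),
    (23.066, 20),
]

mean_with = mean(w for w, _ in TIMES)
mean_without = mean(wo for _, wo in TIMES)
# A positive difference means the extended tool was slower on that test case.
diffs = [w - wo for w, wo in TIMES]
slower = sum(1 for d in diffs if d > 0)

if __name__ == "__main__":
    print(f"mean with-extend:    {mean_with:.2f} s")
    print(f"mean without-extend: {mean_without:.2f} s")
    print(f"mean difference:     {mean(diffs):+.2f} s")
    print(f"extended slower in {slower} of {len(TIMES)} cases")
```

On these figures the two means come out at roughly 30.6 s and 30.8 s, which is consistent with the conclusion above, although individual cases swing by 10 s or more in both directions.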

haolingdong-msft (Member) commented Jun 4, 2026

Thanks @Sameeksha and @chunyu3 for the PR!
As discussed before, my concerns are around performance, user experience, and design.

  1. Thanks @chunyu3 for addressing the performance concern.
  2. For user experience, @lirenhe previously asked for a demo or PoC to see what the authoring experience will be like after adding this. (@lirenhe, please correct me if I'm wrong.) I'm also curious about this part. It would be great to have a short demo, either in a meeting or as a video.
  3. For design, I'd like to confirm the requirement on the front-end (e.g., authoring skill) side: does it need to pass all of the user's local changes to the KB backend, or just the current prompt?

@samvaity samvaity marked this pull request as ready for review June 8, 2026 23:41
@samvaity samvaity requested a review from JiaqiZhang-Dev as a code owner June 8, 2026 23:41
Copilot AI review requested due to automatic review settings June 8, 2026 23:41
@samvaity samvaity requested review from a team as code owners June 8, 2026 23:41

Copilot AI (Contributor) left a comment

Pull request overview

This PR extends the TypeSpec authoring-plan workflow to proactively surface SDK breaking-change risks by (1) adding a breaking-change pattern knowledge document to the knowledge base and (2) updating the QA bot / authoring prompts and tooling so the plan can incorporate SDK-impact guidance (including using a local TypeSpec diff as additional context).

Changes:

  • Added a new shared knowledge document (eng/common/knowledge/sdk-breaking-patterns.md) intended for KB/RAG retrieval to drive SDK-impact warnings and mitigations.
  • Updated knowledge-sync config and TypeSpec authoring prompts to include the breaking-change pattern content and require an explicit “SDK breaking changes checked” step.
  • Updated azsdk_typespec_generate_authoring_plan implementation to accept --target-branch and attach a local git diff of the TypeSpec project as additional context.

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 13 comments.

| File | Description |
| --- | --- |
| tools/sdk-ai-bots/azure-sdk-qa-bot-knowledge-sync/config/knowledge-config.json | Adds eng/common/knowledge as an indexed knowledge source for the KB. |
| tools/sdk-ai-bots/azure-sdk-qa-bot-backend/service/prompt/prompt.go | Extends prompt {{include ...}} handling to support blob: includes (KB content). |
| tools/sdk-ai-bots/azure-sdk-qa-bot-backend/service/prompt/azure_typespec_authoring/qa.md | Makes SDK breaking-change detection mandatory and includes the breaking-patterns blob document. |
| tools/azsdk-cli/Azure.Sdk.Tools.Cli/Tools/TypeSpec/TypeSpecAuthoringTool.cs | Adds --target-branch and sends a local TypeSpec diff to the KB request as additional info. |
| tools/azsdk-cli/Azure.Sdk.Tools.Cli/Helpers/GitHelper.cs | Adds GetDiffAsync helper used to retrieve the diff for the authoring tool. |
| eng/common/knowledge/sdk-breaking-patterns.md | New knowledge-base content: breaking change patterns + mitigations (used for RAG). |
| .github/skills/azure-typespec-author/SKILL.md | Updates the skill workflow to require surfacing SDK breaking-change warnings/mitigations from the plan output. |

Comment thread tools/sdk-ai-bots/azure-sdk-qa-bot-backend/service/prompt/prompt.go
- Exact kind of changes to make (operations/models/decorators/versioning)
- Expected impact (breaking vs non-breaking)
- Diff outline (high level, no code):
- SDK Breaking Changes and Mitigation (REQUIRED if SDK breaking changes are detected in **DK Breaking Change Detection Process**):

Member

This typo ("DK" instead of "SDK") should be fixed.

Comment thread tools/azsdk-cli/Azure.Sdk.Tools.Cli/Helpers/GitHelper.cs
Comment thread eng/common/knowledge/sdk-breaking-patterns.md

chunyu3 and others added 2 commits June 10, 2026 11:04
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

@chunyu3 chunyu3 requested review from chunyu3 and lirenhe June 11, 2026 05:34
return "failed to download the blob", err
}
} else {
return "No blob path found", err

Member

Since err is nil here, "No blob path found" will be included in the prompt. Should we change it to return "", fmt.Errorf("failed to download blob %q: %w", blobName, err)?

Member

Could you help update the changelog and version for the chat bot backend?

samvaity (Member, Author) commented

Closing this PR in favor of #15248 (comment).

@samvaity samvaity closed this Jun 12, 2026
Labels

- AzSDK Tools Agent: Issue related to the AzSDK Tools Agent.
- azsdk-cli: Issues related to Azure/azure-sdk-tools::tools/azsdk-cli

9 participants