feat(ci): consolidate Arc canary + prod release into one pipeline #1715
Merged
rashmichandrashekar approved these changes Jun 16, 2026
/azp run
Azure Pipelines successfully started running 1 pipeline(s).
suyadav1 added a commit that referenced this pull request Jun 19, 2026
)

* Revert "add canary in arc prod release pipeline (#1715)"

  This reverts commit 7ea95a2.

  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Recover ama-logs workspace key from extension protected secret + checksum restart

  When the extension manager stops re-delivering protectedParameters (e.g. during an extension auto-update), the rendered .Values workspace key becomes empty and the chart would overwrite the live ama-logs-secret KEY with an empty value, breaking the agent ~10-14 days later when cached credentials expire.

  Changes:
  - ama-logs-secret.yaml: when the incoming workspace key is empty/placeholder, fall back to the extension manager's on-cluster protected-parameters secret 'protected-ext-parameters-<release>' (data key 'OmsAgent.workspaceKey' for AKS or 'amalogs.secret.key' for Arc). This is the secret the config agent persists from protectedParameters; the agent does not clear it when it drops the CR reference, so it remains the source of truth. Only the KEY is recovered; WSID is non-protected and still delivered. lookup is a no-op on first install (incoming key populated).
  - ama-logs-daemonset.yaml / ama-logs-daemonset-windows.yaml / ama-logs-deployment.yaml: add checksum/secret annotation to the AKS (non-Arc) Linux and Windows pod templates so pods roll when the effective secret changes. The previous WSID-only annotation did not detect workspace KEY changes.

  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Gate workspace-key fallback on non-AAD clusters

  The protected-parameters key fallback only applies to workspace-key (non-AAD) clusters. On AAD/managed-identity clusters the workspace key is empty by design (auth uses a token, not a key), so recovering a key there is meaningless. Evaluate isUsingAADAuth (AKS) / useAADAuth (Arc) the same way the daemonset/deployment templates do, and skip the lookup + fallback entirely when AAD auth is in use.

  Validated on a live AKS cluster: with a key injected into protected-ext-parameters-*, isUsingAADAuth=true does NOT recover it (gated), isUsingAADAuth=false does.

  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Recover workspace key only when missing (non-AAD, broken cluster)

  Gate the protected-parameters lookup on BOTH non-AAD auth AND the incoming workspace key being empty/placeholder, so the recovery runs only when the mandatory key is missing (the cluster is already broken / about to be). Healthy clusters (key supplied) and AAD clusters skip the live secret read entirely, so there is no unnecessary get-secret API call on the common path. Uses an explicit if/else (not ternary) to pick the active path's incoming key: OmsAgent.workspaceKey for AKS, amalogs.secret.key for Arc.

  Validated on a live AKS cluster: non-AAD+empty recovers; non-AAD+real key uses the supplied key (lookup skipped); AAD+empty stays empty (gated). helm lint clean.

  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Add dummy env vars

* remove arc secret changes and refactor

* removed checksum/secret

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Madhav Jakkampudi <hejakkam@microsoft.com>
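The gated fallback described in the commit message above could look roughly like the Helm template below. This is a minimal sketch of the AKS path only, not the chart's actual ama-logs-secret.yaml: the .Values paths, the "<placeholder>" string, and the assumption that the protected-parameters secret lives in the release namespace are all hypothetical, and the later bullets ("remove arc secret changes and refactor", "removed checksum/secret") mean the merged chart may differ.

```yaml
{{- /* Sketch only. Value paths and the placeholder string are assumptions. */ -}}
{{- $incoming := .Values.OmsAgent.workspaceKey | default "" -}}
{{- $useAAD := .Values.OmsAgent.isUsingAADAuth | default false -}}
{{- $key := $incoming -}}
{{- /* Read the live secret only on non-AAD clusters whose incoming key is missing. */ -}}
{{- if and (not $useAAD) (or (eq $incoming "") (eq $incoming "<placeholder>")) -}}
  {{- $name := printf "protected-ext-parameters-%s" .Release.Name -}}
  {{- /* Namespace is assumed; lookup returns an empty map if the secret is absent. */ -}}
  {{- $live := lookup "v1" "Secret" .Release.Namespace $name -}}
  {{- if and $live $live.data -}}
    {{- /* The data key contains a dot, so it has to be read with index. */ -}}
    {{- $recovered := index $live.data "OmsAgent.workspaceKey" -}}
    {{- if $recovered -}}
      {{- $key = $recovered | b64dec -}}
    {{- end -}}
  {{- end -}}
{{- end -}}
apiVersion: v1
kind: Secret
metadata:
  name: ama-logs-secret
type: Opaque
data:
  KEY: {{ $key | b64enc | quote }}
```

Because lookup also returns an empty map under helm template and dry-run rendering, the sketch falls through to the incoming key in those modes rather than failing.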
zanejohnson-azure added a commit that referenced this pull request Jun 23, 2026
Restores the canary stages reverted in #1718 (Stage_Canary_MCR, Stage_Canary_Regions, Wait_After_Canary) so the Arc prod pipeline runs canary -> 25h bake -> manual-gated prod tiers in one pipeline. Includes the fix that the original #1715 was missing: Stage_1 is trigger: manual WITHOUT an explicit dependsOn (ADO rejects dependsOn on manually-triggered stages). Stage_1 still orders after Wait_After_Canary via default sequential dependency.
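The stage layout this commit describes can be sketched as the Azure Pipelines YAML below. It is an illustration under stated assumptions, not the contents of ci-arc-k8s-extension-prod-release.yaml: the job names, the echo steps, and the timeoutInMinutes value are hypothetical, and only the stage names, the manual triggers, and delayForMinutes: 1500 come from the PR text.

```yaml
# Sketch only: job bodies and the timeout value are assumptions.
stages:
- stage: Stage_Canary_MCR            # first stage, starts automatically
  jobs:
  - job: PushCanaryChart             # hypothetical job name
    steps:
    - script: echo "package chart and push to canary/stable"

- stage: Stage_Canary_Regions        # manual per the PR description
  trigger: manual
  jobs:
  - job: RegisterCanaryRegions       # hypothetical job name
    steps:
    - script: echo "register canary regions"

- stage: Wait_After_Canary           # 25h bake: 1500 minutes
  jobs:
  - job: Bake                        # hypothetical job name
    pool: server                     # Delay@1 runs in an agentless job
    timeoutInMinutes: 1560           # assumed; must exceed the delay
    steps:
    - task: Delay@1
      inputs:
        delayForMinutes: '1500'

- stage: Stage_1                     # prod1/stable push
  trigger: manual                    # no dependsOn: ordering comes from the
  jobs:                              # default dependency on the previous stage
  - job: PushProd1                   # hypothetical job name
    steps:
    - script: echo "push to prod1/stable"
```

With no dependsOn on any stage, each stage implicitly depends on the one listed before it, which is how Stage_1 still orders after Wait_After_Canary.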
What

Consolidates the separate Arc K8s extension canary and prod release pipelines into a single self-contained pipeline by prepending three canary stages to ci-arc-k8s-extension-prod-release.yaml.

Changes

- Stage_Canary_MCR (auto): RELEASE_STAGE_NAME=Canary; packages the local chart and pushes to canary/stable via the arc-k8s-extension-Managed-SDP Ev2 root.
- Stage_Canary_Regions (manual): RELEASE_STAGE_NAME=CanaryStable; registers canary regions via the arc-k8s-extension-release-v2-Managed-SDP Ev2 root.
- Wait_After_Canary: 25h (delayForMinutes: 1500) bake before prod.
- Stage_1 (prod1/stable push): now dependsOn: Wait_After_Canary and gated with trigger: manual for extra safety, so prod is never re-pushed without an explicit human start (in addition to the existing in-stage ApprovalTask).

No existing prod-tier logic was modified beyond the Stage_1 header.

Result

One pipeline now runs canary → 25h bake → manual-gated prod tiers, replacing the need to run the canary pipeline separately. Queue it the same way (same ContainerInsights-MultiArch-MergedBranches artifact, same VAR_* variables).

Validation

- Canary_MCR → Canary_Regions → Wait → Stage_1 → Stage_2 … → Stage_7).
- ci_prod shows only the single pipeline file.