docs(hcg-runbook): refresh rollout runbook v0.1→v0.2 after Phase D close (standards#100) #207

Merged: hyperpolymath merged 1 commit into main from phase-e/runbook-refresh-post-d-close on Jun 8, 2026

Summary

Refreshes docs/integration/hcg-tier2-rollout-runbook.md from v0.1 (draft, 2026-05-20, pre Phase-D) to v0.2 reflecting the current state of the single-lane HCG tier-2 channel rooted at standards#91. Documentation-only PR — no code, no infrastructure, no behaviour change. The runbook had visible drift across three sub-sections of §1 (Prerequisites) since Phase D (standards#99) closed on 2026-06-01 and gateway E1 (deploy spec) landed on 2026-06-03; this PR ticks the boxes that have evidence and calls out exactly what's still open.

Refs hyperpolymath/standards#91
Refs hyperpolymath/standards#100

NOT Closes #100. Phase E close is owner-driven per the runbook §6.5 single-lane discipline and depends on §3.3 (100% soak), §6.4 (Trustfile flip), and the cerro-torre .ctp signing — none of which this PR touches. Matches the Phase D pattern (#14 / #22 / #26 / #30 all Refs'd #99 before boj-server#168 closed it).

Channel position

standards#91 (parent, open)
├── #96 Phase A — closed
├── #97 Phase B — closed
├── #98 Phase C — closed
├── #99 Phase D — closed (boj-server#168, 2026-06-01)
└── #100 Phase E — IN PROGRESS
     ├── E5 runbook draft — boj-server#128 (landed 2026-05-20)
     ├── E5 runbook v0.2 refresh — THIS PR
     ├── E1 loopback prereqs — boj-server#130/#131/#132/#165/#173 (landed)
     ├── E1 deploy spec — http-capability-gateway#38 (landed 2026-06-03)
     ├── E1 .ctp signing — owner follow-up
     ├── E2 staging cut-over — owner follow-up
     ├── E3 telemetry verification — owner follow-up
     ├── E4 production rollout — owner follow-up
     └── E5 Trustfile flip — owner follow-up (joint-close)

What changed

docs/integration/hcg-tier2-rollout-runbook.md

Header banner.

  • Version 0.1 (draft, Phase E first cut) → 0.2 (post Phase-D close, Phase E in-progress).
  • Date 2026-05-20 → 2026-06-08 (rev. from 2026-05-20).
  • "Phase-D dependency" admonition replaced with "Phase-D status (2026-06-08)" naming the actual close path (boj-server#168 on the gateway-side D-1..D-3 + D-4 bootstrap), the five-scenario harness state, and the two remaining owner-driven follow-ups (workflow dispatch + _status flip). The old admonition still claimed "Phase D has merged the scaffold only", which was almost three weeks out of date (the draft is dated 2026-05-20).

§1.1 Phase D deliverables landed.

  • [ ] D-2, [ ] D-3, [ ] D-4 → [x] lines per phase with PR refs and dates:
    • D-1 harness scaffold — http-capability-gateway#12 (2026-05-20).
    • D-2 loopback backend fixture — http-capability-gateway#14 (2026-05-26).
    • D-3 trust-header-rewrite + mTLS handshake scenarios + schema-drift hardening — http-capability-gateway#22 (2026-05-27), http-capability-gateway#30 (2026-06-02).
    • D-4 bootstrap (workflow_dispatch rebaseline on ubuntu-latest) — http-capability-gateway#26 (2026-05-30).
    • Cross-repo D-1 load-profile declaration (this repo's denominator) — boj-server#168 (2026-06-01); joint-closed standards#99.
  • New [ ] line for the remaining open item only: owner-driven dispatch of Perf Rebaseline + maintainer-merge of the generated perf: rebaseline (standards#99) PR + _status: scaffold-placeholder → active flip. Until this lands the gate runs in non-blocking scaffold mode.
  • Trailing blockquote updated: the scaffold-mode gate catches per-scenario tolerance breaches but not absolute load-profile budget breach; the rebaseline + flip arms the absolute check. Replaces the previous claim that "without D-3+D-4 there is no number to match against".
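
The difference between scaffold mode and the armed gate can be illustrated with a minimal sketch. This is not the harness's actual code: the `ScenarioResult` fields, the latency metric, and the `absolute_budget_ms` parameter are hypothetical names chosen for illustration. Only the two `_status` values and the "tolerance always, absolute budget only when active" behaviour come from the description above.

```python
from dataclasses import dataclass


@dataclass
class ScenarioResult:
    name: str
    baseline_ms: float  # recorded baseline latency (hypothetical metric)
    observed_ms: float  # latency measured in this run
    tolerance: float    # allowed relative regression, e.g. 0.10 for 10%


def evaluate_gate(results, status, absolute_budget_ms=None):
    """Return (blocking, breaches) for one perf-gate run."""
    breaches = []
    for r in results:
        # Relative per-scenario check: runs in both modes.
        if r.observed_ms > r.baseline_ms * (1 + r.tolerance):
            breaches.append(f"tolerance:{r.name}")
        # Absolute budget check: armed only once the status is "active".
        if status == "active" and absolute_budget_ms is not None:
            if r.observed_ms > absolute_budget_ms:
                breaches.append(f"budget:{r.name}")
    # Scaffold mode reports breaches but never blocks.
    blocking = status == "active" and bool(breaches)
    return blocking, breaches
```

Under these assumptions, a run that stays within tolerance but exceeds the absolute budget passes silently in `scaffold-placeholder` mode and blocks only after the flip to `active`.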

§1.4 BoJ-side prerequisites.

  • [ ] Loopback bind → [x] enumerating the three layers that landed: Elixir Cowboy bind tightening (boj-server#130), k8s Service ClusterIP (boj-server#131), Zig-adapter APP_HOST=127.0.0.1 across stapeln.toml, entrypoint.sh, compose.prod.yaml (boj-server#132). Deployment-time confirmation that the staging port really is closed at the network layer stays an operator pre-check before §2.1.
  • [ ] TrustPolicy clause → [x] with the verified line reference (elixir/lib/boj_rest/trust_policy.ex:73) and PR (boj-server#106).
  • New [x] entries for the two additional Phase-E-supporting BoJ-side landings flagged in gateway#38's channel position:
    • NetworkPolicy hardening (boj-server#173).
    • HCG-policy SSE-route coverage (boj-server#165).
  • Trustfile.a2ml tier_2_gateway.status: PENDING line stays intentionally unchecked — it's the §6.4 last-action target.
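
The operator pre-check mentioned for the loopback bind could look something like the sketch below. It is an assumption about how one might probe the port from outside the pod or host, not a step taken from the runbook. The host and port are deployment-specific and are passed in by the caller.

```python
import socket


def port_is_closed(host, port, timeout=2.0):
    """Return True if a TCP connect to host:port fails (refused or timed out)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return False  # connection succeeded, so the port is reachable
    except OSError:
        # Covers ConnectionRefusedError, timeouts, and unreachable hosts.
        return True
```

A `True` result only shows that this vantage point cannot connect. A timeout caused by an unrelated network fault looks the same as a correctly closed port, so the check is best run from a host that is known to reach the staging network.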

§1.5 Gateway-side prerequisites.

  • [ ] container/gateway-deploy.k9.ncl exists → [x] with PR ref (http-capability-gateway#38, 2026-06-03), naming the five-level k9-svc pedigree (Snout / Scent / Leash / Gut / Muscle), per-environment BACKEND_URL, trust-source flip pattern ("header" staging → "mtls" production after §2.4 rehearsal), max_unavailable = 0, and failure_mode = "fail-closed" matching the [SEAMS] gateway-boj-gnosis declaration.
  • [ ] Containerfile + .ctp signing entry extended with a note that pedigree.security.signature + pedigree.validation.checksum stay PLACEHOLDER in the k9.ncl until cerro-torre signing runs (separate operator action, key-handling discipline).
  • Surface-coverage entry adds the PR ref (boj-server#165) for the cartridge-sse-post rule.
  • Gateway smoke-test entry: kept unchecked, but expanded with the concrete allow/deny sequence the test plan in boj-server#165 deferred to this step (one allow + one deny per route, plus the POST /cartridge/:name/sse X-Trust-Level cases).
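
The shape of the smoke-test sequence (one allow and one deny per route, plus the `X-Trust-Level` cases for `POST /cartridge/:name/sse`) can be sketched as a small case generator. The route list, the trust-level values, and the expected outcome per level are placeholders supplied by the caller; none of them are taken from the runbook or from boj-server#165.

```python
def build_smoke_cases(routes, sse_trust_levels):
    """Expand routes into one allow + one deny case each, plus SSE trust-level cases."""
    cases = []
    for method, path in routes:
        cases.append({"method": method, "path": path, "expect": "allow"})
        cases.append({"method": method, "path": path, "expect": "deny"})
    for level, expect in sse_trust_levels:
        cases.append({
            "method": "POST",
            "path": "/cartridge/:name/sse",
            "headers": {"X-Trust-Level": level},  # header name from the PR text
            "expect": expect,
        })
    return cases
```

Generating the matrix from data makes it straightforward to confirm that no route is missing its deny case before the staging cut-over.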

CHANGELOG.md

New ### Documentation entry under [Unreleased] summarising the refresh and pointing at the PR refs. Sits with the existing Phase E ### Added entry for the NetworkPolicy (#173) and the prior loopback-bind entries.

What this PR deliberately does NOT do

  • Does not flip Trustfile.a2ml tier_2_gateway.status. That's the §6.4 last action; flipping it before the soak windows are complete would mis-represent the deployment state.
  • Does not close standards#100. Same channel discipline as http-capability-gateway#38 / boj-server#168: single-lane joint-close, owner-only.
  • Does not change the runbook's structure or substance beyond reconciling the §1 checklist with merged work. Sections 2–6 (staging cut-over, production rollout, observability, rollback, post-rollout) untouched.
  • Does not invent dates or PR numbers. Every cross-reference comes from gh pr view on a merged-and-confirmed PR (verified via the GitHub MCP at preparation time).
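
The cross-reference check described in the last bullet could be scripted roughly as follows. This is a sketch of one possible approach, not the procedure actually used for this PR. It relies on `gh pr view` with its `--repo` and `--json` flags and the `state` and `mergedAt` fields. The `verify_merged` helper and its `(repo, number, claimed_date)` tuple format are hypothetical.

```python
import json
import subprocess


def gh_pr_view(repo, number):
    """Call `gh pr view` and return the parsed JSON fields."""
    out = subprocess.run(
        ["gh", "pr", "view", str(number), "--repo", repo,
         "--json", "number,state,mergedAt"],
        check=True, capture_output=True, text=True,
    ).stdout
    return json.loads(out)


def verify_merged(refs, fetch=gh_pr_view):
    """Return the (repo, number) refs that are not merged on the claimed date."""
    mismatches = []
    for repo, number, claimed_date in refs:
        pr = fetch(repo, number)
        merged_at = pr.get("mergedAt") or ""
        # mergedAt is an ISO-8601 timestamp; compare its date prefix only.
        if pr.get("state") != "MERGED" or not merged_at.startswith(claimed_date):
            mismatches.append((repo, number))
    return mismatches
```

The `fetch` parameter lets the check run against recorded data without network access. Because `mergedAt` is reported in UTC, a PR merged close to midnight local time may appear with a different date than the one written in the runbook.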

Test plan

🤖 Generated with Claude Code



…ards#91 / #100)

Refreshes docs/integration/hcg-tier2-rollout-runbook.md from v0.1
(draft, 2026-05-20, pre Phase-D) to v0.2 reflecting the current
state of the single-lane channel rooted at standards#91:

- §1.1 Phase D deliverables: tick D-1..D-3 + D-4 bootstrap with
  http-capability-gateway PR refs (#12 / #14 / #22 / #26 / #30) and
  the boj-server D-1 load-profile (#168) that joint-closed standards#99
  on 2026-06-01. The one remaining open item is the owner-driven
  perf-rebaseline workflow dispatch + `_status: scaffold-placeholder
  -> active` flip; called out explicitly rather than left as a stale
  unchecked checkbox.

- §1.4 BoJ-side prereqs: tick the three loopback-bind layers
  (#130 / #131 / #132), the Phase C TrustPolicy clause (#106), the
  NetworkPolicy (#173), and the SSE-route policy coverage (#165).
  The Trustfile `tier_2_gateway.status: PENDING` line stays
  intentionally unchecked - it's the §6.4 last-action target.

- §1.5 Gateway-side prereqs: tick the new
  `container/gateway-deploy.k9.ncl` from http-capability-gateway#38
  (2026-06-03), record what stays PLACEHOLDER until cerro-torre
  signing runs, and expand the smoke-test entry with the concrete
  allow/deny sequence boj-server#165 deferred.

- Header banner: replace the stale "Phase D has merged the scaffold
  only" Phase-D-dependency note with a current-state summary,
  bump version 0.1 -> 0.2, date 2026-05-20 -> 2026-06-08.

- CHANGELOG.md: Documentation entry under [Unreleased] summarising
  the refresh.

No code, infrastructure, or runtime behaviour changes. The runbook
is the operator-facing source of truth for what's gating the next
Phase E owner action; the drift it had was making "what's still
open" harder to read at a glance.

Refs hyperpolymath/standards#91
Refs hyperpolymath/standards#100

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

github-actions Bot commented Jun 8, 2026


🔍 Hypatia Security Scan

Findings: 272 issues detected

| Severity | Count |
|----------|-------|
| 🔴 Critical | 15 |
| 🟠 High | 134 |
| 🟡 Medium | 123 |

⚠️ Action Required: Critical security issues found!

View findings
[
  {
    "reason": "Stale AI session file -- delete",
    "type": "stale",
    "file": "GEMINI.md",
    "action": "delete",
    "rule_module": "root_hygiene",
    "severity": "medium"
  },
  {
    "reason": "Action  if: always()\n        uses: actions/upload-artifact@ea165f8 needs attention",
    "type": "unpinned_action",
    "file": "e2e.yml",
    "action": "pin_sha",
    "rule_module": "workflow_audit",
    "severity": "medium"
  },
  {
    "reason": "Action perpolymath/standards/.github/workflows/governance-reusable.yml@main\n needs attention",
    "type": "unpinned_action",
    "file": "governance.yml",
    "action": "pin_sha",
    "rule_module": "workflow_audit",
    "severity": "medium"
  },
  {
    "reason": "Issue in abi-drift.yml",
    "type": "missing_timeout_minutes",
    "file": "abi-drift.yml",
    "action": "flag",
    "rule_module": "workflow_audit",
    "severity": "medium"
  },
  {
    "reason": "Issue in codeql.yml",
    "type": "missing_timeout_minutes",
    "file": "codeql.yml",
    "action": "flag",
    "rule_module": "workflow_audit",
    "severity": "medium"
  },
  {
    "reason": "Issue in container-publish.yml",
    "type": "missing_timeout_minutes",
    "file": "container-publish.yml",
    "action": "flag",
    "rule_module": "workflow_audit",
    "severity": "medium"
  },
  {
    "reason": "Issue in dogfood-gate.yml",
    "type": "missing_timeout_minutes",
    "file": "dogfood-gate.yml",
    "action": "flag",
    "rule_module": "workflow_audit",
    "severity": "medium"
  },
  {
    "reason": "Issue in dogfood-gate.yml",
    "type": "missing_timeout_minutes",
    "file": "dogfood-gate.yml",
    "action": "flag",
    "rule_module": "workflow_audit",
    "severity": "medium"
  },
  {
    "reason": "Issue in dogfood-gate.yml",
    "type": "missing_timeout_minutes",
    "file": "dogfood-gate.yml",
    "action": "flag",
    "rule_module": "workflow_audit",
    "severity": "medium"
  },
  {
    "reason": "Issue in dogfood-gate.yml",
    "type": "missing_timeout_minutes",
    "file": "dogfood-gate.yml",
    "action": "flag",
    "rule_module": "workflow_audit",
    "severity": "medium"
  }
]

Powered by Hypatia Neurosymbolic CI/CD Intelligence
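
For triage, a findings array in the shape shown above can be tallied with a few lines of Python. This is a minimal sketch that assumes only the keys visible in the excerpt (`severity`, `type`, `file`). The function name is hypothetical and is not part of Hypatia.

```python
import json
from collections import Counter


def summarise_findings(raw_json):
    """Count findings by severity and by (type, file)."""
    findings = json.loads(raw_json)
    by_severity = Counter(f["severity"] for f in findings)
    by_type_file = Counter((f["type"], f["file"]) for f in findings)
    return by_severity, by_type_file
```

Applied to the ten findings listed in the excerpt, this would report ten `medium` entries, four of them `missing_timeout_minutes` in `dogfood-gate.yml`. The critical and high findings counted in the summary table are not part of the visible excerpt.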

@hyperpolymath hyperpolymath marked this pull request as ready for review June 8, 2026 21:05
@hyperpolymath hyperpolymath merged commit 84b00b3 into main Jun 8, 2026
23 checks passed
@hyperpolymath hyperpolymath deleted the phase-e/runbook-refresh-post-d-close branch June 8, 2026 21:13
