feat(k8s): NetworkPolicy restricting BoJ ingress to HCG pod (#135) #173

Merged: hyperpolymath merged 1 commit into main from feat/k8s-networkpolicy-135 on Jun 1, 2026

@hyperpolymath (Owner)

Summary

Why

Phase E acceptance is already satisfied by the three loopback layers above. ClusterIP makes BoJ unreachable from outside the cluster, but does NOT prevent a compromised neighbour pod that knows the ClusterIP from talking to BoJ. NetworkPolicy closes that gap. It also acts as a safety net if a future overlay re-introduces type: NodePort or type: LoadBalancer — the pod-network restriction still holds.

Per ADR-0004 §1 invariant 4 ("not externally routable"), three independent layers must now be violated before BoJ's back-side surface is reachable from anywhere other than HCG.
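To make the gap concrete, here is a minimal sketch of how an ingress NetworkPolicy decides whether a peer pod may connect. It is a simplified model (exact label matching, single namespace, no `namespaceSelector` or `ipBlock`), not the manifest from this PR. The label `app: boj-server` for the BoJ pod is an assumption made for illustration; only `app: http-capability-gateway` and ports 7700–7703 come from the PR.

```python
from dataclasses import dataclass, field


@dataclass
class IngressRule:
    peer_labels: dict   # labels a peer pod must carry to match
    ports: frozenset    # TCP ports opened to matching peers


@dataclass
class IngressPolicy:
    pod_selector: dict                          # pods the policy isolates
    rules: list = field(default_factory=list)   # allow-list of IngressRule


def labels_match(selector: dict, labels: dict) -> bool:
    """True when every key/value in selector is present in labels."""
    return all(labels.get(k) == v for k, v in selector.items())


def ingress_allowed(policy, target_labels, peer_labels, port) -> bool:
    """Decide one connection under a single ingress policy (simplified)."""
    if not labels_match(policy.pod_selector, target_labels):
        return True  # pod not selected: this policy does not isolate it
    return any(
        labels_match(r.peer_labels, peer_labels) and port in r.ports
        for r in policy.rules
    )


BOJ = {"app": "boj-server"}  # hypothetical label, not taken from the PR
HCG = {"app": "http-capability-gateway"}
policy = IngressPolicy(BOJ, [IngressRule(HCG, frozenset(range(7700, 7704)))])

print(ingress_allowed(policy, BOJ, HCG, 7700))                   # HCG pod
print(ingress_allowed(policy, BOJ, {"app": "neighbour"}, 7700))  # other pod
```

In this model the HCG-labelled peer is admitted and the neighbour pod is refused even though it knows the address, which is the behaviour the ClusterIP layer alone cannot provide.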

Test plan

  • YAML parses (yaml.safe_load_all confirmed; 1 NetworkPolicy doc, name boj-server-ingress).
  • kubectl apply --dry-run=client -f k8s/networkpolicy.yaml lints clean (CI runner; local lacked kubectl).
  • Staging smoke test (post-merge, before relying on this layer):
    • With the policy applied, curl from another pod (not labelled app: http-capability-gateway) to BoJ's ClusterIP times out.
    • From an HCG-labelled pod, curl succeeds.
  • Verify that the CNI plugin enforces NetworkPolicy (Calico/Cilium/Weave-NetPol). On flannel-no-VXLAN the policy is a silent no-op (documented in the file header).
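The first check in the plan (one NetworkPolicy document, name `boj-server-ingress`) could be scripted roughly as follows. This is a sketch that works on already-parsed documents, such as the output of `yaml.safe_load_all`, so it needs no third-party import. The example manifest is an assumption reconstructed from the PR description (the `app: boj-server` selector and the TCP protocol in particular), not a copy of `k8s/networkpolicy.yaml`.

```python
EXPECTED_NAME = "boj-server-ingress"
ALLOWED_PEER = {"app": "http-capability-gateway"}


def check_networkpolicy_docs(docs):
    """Return a list of problems; an empty list means the checks passed."""
    docs = [d for d in docs if d is not None]  # skip empty YAML documents
    if len(docs) != 1:
        return [f"expected 1 document, found {len(docs)}"]
    doc, problems = docs[0], []
    if doc.get("kind") != "NetworkPolicy":
        problems.append(f"unexpected kind: {doc.get('kind')!r}")
    if doc.get("metadata", {}).get("name") != EXPECTED_NAME:
        problems.append("unexpected metadata.name")
    rules = doc.get("spec", {}).get("ingress", [])
    if not rules:
        problems.append("no ingress rules: selected pods accept no traffic")
    for i, rule in enumerate(rules):
        peers = rule.get("from", [])
        if not peers:
            problems.append(f"ingress[{i}] has no 'from': all sources allowed")
        for peer in peers:
            labels = peer.get("podSelector", {}).get("matchLabels")
            if labels != ALLOWED_PEER:
                problems.append(f"ingress[{i}] admits unexpected peer {peer!r}")
    return problems


# Assumed shape of the manifest after parsing (illustrative only).
example = {
    "apiVersion": "networking.k8s.io/v1",
    "kind": "NetworkPolicy",
    "metadata": {"name": "boj-server-ingress"},
    "spec": {
        "podSelector": {"matchLabels": {"app": "boj-server"}},  # assumed label
        "ingress": [{
            "from": [{"podSelector": {"matchLabels": ALLOWED_PEER}}],
            "ports": [{"protocol": "TCP", "port": p} for p in range(7700, 7704)],
        }],
    },
}

print(check_networkpolicy_docs([example]))
```

A structural check like this only confirms what the manifest says. Whether the cluster enforces it still depends on the CNI plugin, so the staging smoke test remains necessary.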

Closes #135
Refs hyperpolymath/standards#100, hyperpolymath/standards#91.

🤖 Generated with Claude Code

Optional defence-in-depth layer on top of the existing three (Cowboy
loopback bind, Zig adapter APP_HOST, Service ClusterIP). Restricts pod
ingress to peers labelled `app: http-capability-gateway`, so a
compromised neighbour pod or operator misconfiguration that
re-introduces an external Service type still cannot reach BoJ from
elsewhere in the cluster.

Header documents:
  - CNI requirement (Calico/Cilium/Weave-NetPol; flannel-no-VXLAN no-op)
  - Override pattern (kustomize/helm overlay for non-HCG-fronted)
  - Kubelet health-probe caveat (verify in staging)

Ports 7700–7703 declared forward-compatibly so future gRPC/GraphQL/SSE
adapters need no NetworkPolicy edit.

CHANGELOG entry under `### Added` per repo style.

Closes #135
Refs hyperpolymath/standards#100, hyperpolymath/standards#91.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
hyperpolymath enabled auto-merge (squash) June 1, 2026 11:14

github-actions bot commented Jun 1, 2026

🔍 Hypatia Security Scan

Findings: 217 issues detected

| Severity | Count |
|---|---|
| 🔴 Critical | 16 |
| 🟠 High | 127 |
| 🟡 Medium | 74 |

⚠️ Action Required: Critical security issues found!

View findings
[
  {
    "reason": "Stale AI session file -- delete",
    "type": "stale",
    "file": "GEMINI.md",
    "action": "delete",
    "rule_module": "root_hygiene",
    "severity": "medium"
  },
  {
    "reason": "Action    if: always()\n        uses: actions/upload-artifact@ea165 needs attention",
    "type": "unpinned_action",
    "file": "e2e.yml",
    "action": "pin_sha",
    "rule_module": "workflow_audit",
    "severity": "medium"
  },
  {
    "reason": "Action perpolymath/standards/.github/workflows/governance-reusable.yml@main\n needs attention",
    "type": "unpinned_action",
    "file": "governance.yml",
    "action": "pin_sha",
    "rule_module": "workflow_audit",
    "severity": "medium"
  },
  {
    "reason": "Issue in abi-drift.yml",
    "type": "missing_timeout_minutes",
    "file": "abi-drift.yml",
    "action": "flag",
    "rule_module": "workflow_audit",
    "severity": "medium"
  },
  {
    "reason": "Issue in codeql.yml",
    "type": "missing_timeout_minutes",
    "file": "codeql.yml",
    "action": "flag",
    "rule_module": "workflow_audit",
    "severity": "medium"
  },
  {
    "reason": "Issue in container-publish.yml",
    "type": "missing_timeout_minutes",
    "file": "container-publish.yml",
    "action": "flag",
    "rule_module": "workflow_audit",
    "severity": "medium"
  },
  {
    "reason": "Issue in dogfood-gate.yml",
    "type": "missing_timeout_minutes",
    "file": "dogfood-gate.yml",
    "action": "flag",
    "rule_module": "workflow_audit",
    "severity": "medium"
  },
  {
    "reason": "Issue in dogfood-gate.yml",
    "type": "missing_timeout_minutes",
    "file": "dogfood-gate.yml",
    "action": "flag",
    "rule_module": "workflow_audit",
    "severity": "medium"
  },
  {
    "reason": "Issue in dogfood-gate.yml",
    "type": "missing_timeout_minutes",
    "file": "dogfood-gate.yml",
    "action": "flag",
    "rule_module": "workflow_audit",
    "severity": "medium"
  },
  {
    "reason": "Issue in dogfood-gate.yml",
    "type": "missing_timeout_minutes",
    "file": "dogfood-gate.yml",
    "action": "flag",
    "rule_module": "workflow_audit",
    "severity": "medium"
  }
]
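The excerpt above shows only ten of the 217 findings, all of medium severity, and several entries repeat the same file and type. A small sketch for tallying such a list and surfacing repeats is given below. It assumes the findings are available as a JSON array with the `file`, `type` and `severity` keys seen above; the `summarise` helper is illustrative and not part of Hypatia.

```python
import json
from collections import Counter


def summarise(findings_json: str):
    """Tally findings by severity and report repeated (file, type) pairs."""
    findings = json.loads(findings_json)
    by_severity = Counter(f["severity"] for f in findings)
    pairs = Counter((f["file"], f["type"]) for f in findings)
    repeated = {pair: n for pair, n in pairs.items() if n > 1}
    return by_severity, repeated


# Three entries in the same shape as the scan output above.
sample = json.dumps([
    {"file": "codeql.yml", "type": "missing_timeout_minutes", "severity": "medium"},
    {"file": "dogfood-gate.yml", "type": "missing_timeout_minutes", "severity": "medium"},
    {"file": "dogfood-gate.yml", "type": "missing_timeout_minutes", "severity": "medium"},
])

by_severity, repeated = summarise(sample)
print(dict(by_severity))
print(repeated)
```

Repeated pairs may be legitimate (for example, one finding per job in the same workflow) or duplicates from the scanner; the tally only makes them visible.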

Powered by Hypatia Neurosymbolic CI/CD Intelligence

hyperpolymath merged commit 8e2345f into main Jun 1, 2026
22 of 23 checks passed
hyperpolymath deleted the feat/k8s-networkpolicy-135 branch June 1, 2026 11:31
hyperpolymath added a commit that referenced this pull request Jun 8, 2026
…ards#91 / #100)

Refreshes docs/integration/hcg-tier2-rollout-runbook.md from v0.1
(draft, 2026-05-20, pre Phase-D) to v0.2 reflecting the current
state of the single-lane channel rooted at standards#91:

- §1.1 Phase D deliverables: tick D-1..D-3 + D-4 bootstrap with
  http-capability-gateway PR refs (#12 / #14 / #22 / #26 / #30) and
  the boj-server D-1 load-profile (#168) that joint-closed standards#99
  on 2026-06-01. The one remaining open item is the owner-driven
  perf-rebaseline workflow dispatch + `_status: scaffold-placeholder
  -> active` flip; called out explicitly rather than left as a stale
  unchecked checkbox.

- §1.4 BoJ-side prereqs: tick the three loopback-bind layers
  (#130 / #131 / #132), the Phase C TrustPolicy clause (#106), the
  NetworkPolicy (#173), and the SSE-route policy coverage (#165).
  The Trustfile `tier_2_gateway.status: PENDING` line stays
  intentionally unchecked - it's the §6.4 last-action target.

- §1.5 Gateway-side prereqs: tick the new
  `container/gateway-deploy.k9.ncl` from http-capability-gateway#38
  (2026-06-03), record what stays PLACEHOLDER until cerro-torre
  signing runs, and expand the smoke-test entry with the concrete
  allow/deny sequence boj-server#165 deferred.

- Header banner: replace the stale "Phase D has merged the scaffold
  only" Phase-D-dependency note with a current-state summary,
  bump version 0.1 -> 0.2, date 2026-05-20 -> 2026-06-08.

- CHANGELOG.md: Documentation entry under [Unreleased] summarising
  the refresh.

No code, infrastructure, or runtime behaviour changes. The runbook
is the operator-facing source of truth for what's gating the next
Phase E owner action; the drift it had was making "what's still
open" harder to read at a glance.

Refs hyperpolymath/standards#91
Refs hyperpolymath/standards#100

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
hyperpolymath added a commit that referenced this pull request Jun 9, 2026
)

## Summary

Lands `config/gateway-policy-boj.yaml`, the **live** Verb Governance
Spec the HCG tier-2 gateway loads via `POLICY_PATH` in staging (§2.1)
and production (§3.1) per the rollout runbook. The Phase A worked
example (`config/gateway-policy-boj-example.yaml`) is retained as the
documentation artefact; the live file is now the operational one.
Closes the example→live promotion item on the Phase E §1.5 checklist.

Single-lane HCG tier-2 channel (`standards#91`). Phase A (#96), B (#97),
C (#98), D (#99) are joint-closed; Phase E (`standards#100`) is the
active phase, with multiple artefacts gating closure (§6.4 Trustfile
flip is the last). This PR lands one tractable artefact; staging soak
(§2), production traffic split (§3) and the §6.4 flip remain
owner-driven.

## What this PR lands

- **`config/gateway-policy-boj.yaml`**: live policy file.
  Content-identical to `gateway-policy-boj-example.yaml` at promotion
  time. Header rewritten to reflect its live-file role (operational
  artefact, not pedagogical), with `DEFAULT-DENY INVARIANT` reframed
  from "Phase A check" to "permanent invariant — must hold for every
  future gateway release". DSL v1 conformance preserved; all 28 routes
  (`global_verbs: [GET, POST]`; per-route `verbs`, `exposure`, `name`,
  `narrative`; `stealth_profile` on internal routes; top-level
  `stealth: { enabled: true, status_code: 404 }`) carried forward
  unchanged.
- **Runbook §1.5**: flips the trailing "still to be promoted from this
  example before §3.1" note (on the existing `[x]` example-in-place
  line) to a discrete `[x]` item recording the live file's existence
  and the divergence policy ("future BoJ-surface evolution lands in the
  live file; the example remains as the worked-example artefact").
- **Runbook §2.1 step 2**: switches staging `POLICY_PATH` from the
  example to the live file so staging exercises the same artefact that
  production will. Production §3.1 (which inherits §2.1's environment
  with the traffic-shift mechanism overlaid) needs no change.
- **Runbook header**: version 0.2 → 0.3; status line updated to
  acknowledge the live-policy promotion.

## What this PR deliberately does NOT do

- **Close `standards#100`.** Per runbook §6.5 the joint-close happens
  after the §6.4 Trustfile flip
  (`tier_2_gateway.status: PENDING → DEPLOYED`), which itself follows
  the §3.3 100% production-soak window. Using `Refs` not `Closes` to
  match the established Phase E pattern (PRs #38, and Phase D PRs #14,
  #22, #26, #30; all `Refs`'d their phase issue and the owner
  joint-closed the issue once the final artefact landed). This
  deliberately diverges from the dispatch brief's literal "Closes
  hyperpolymath/standards#<phase-issue-number>" line in favour of the
  canonical runbook §6.5 close-out discipline that the brief itself
  points to as the source of truth ("using the canonical sources"). The
  owner remains the sole closer of `standards#100`.
- **Touch the HCG deploy spec.** `container/gateway-deploy.k9.ncl` in
  `hyperpolymath/http-capability-gateway` (PR #38) reads `POLICY_PATH`
  at deploy time from the env, so the live-file cut-over is a runbook +
  config artefact change on the BoJ side, not a deploy-spec change on
  the gateway side. No companion PR on the gateway repo.
- **Diverge the live file from the example.** At promotion the two
  files are content-identical. Future divergence is intentional and the
  live file is authoritative; the example may be intentionally simpler.
- **Trigger any deploy.** No traffic shift, no staging cut-over, no §6.4
  flip happens at merge time. This is a static artefact landing.
- **Update the deploy spec's `POLICY_PATH` default.** The deploy spec
  carries env-var declarations; the live-file path is operator-supplied
  at deploy time.

## Verification

- [x] DSL v1 conformance: `dsl_version: "1"`; `governance.global_verbs`
      is `[GET, POST]`; every route has a non-empty `verbs`;
      `exposure ∈ {public, authenticated, internal}`; `stealth.enabled`
      boolean, `stealth.status_code: 404` in 100..599.
- [x] All 28 example routes preserved unchanged in the live file (route
      count, `name`s, paths, verbs, exposures, narratives).
- [x] SPDX header `MPL-2.0` matches repo convention (config/, docs/).
- [x] Runbook §1.5 and §2.1 cross-references to
      `gateway-policy-boj.yaml` and `gateway-policy-boj-example.yaml`
      resolve.
- [ ] Manual: `mix gateway.validate config/gateway-policy-boj.yaml`
      (gateway-side; can be run by the operator before §2.1 stand-up —
      see runbook §1.5 last open item, smoke-test).
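The DSL v1 conformance rules listed above can be expressed as a short checker. The sketch below is not a substitute for `mix gateway.validate`; it only encodes the rules named in this checklist. It assumes the policy has already been parsed into a dict, and it assumes routes live under `governance.routes` with a `path` field. Both of those names are hypothetical, since the PR text does not show the file layout.

```python
ALLOWED_EXPOSURE = {"public", "authenticated", "internal"}


def validate_policy(policy: dict) -> list:
    """Return conformance errors for the checklist rules; empty means OK."""
    errors = []
    if policy.get("dsl_version") != "1":
        errors.append('dsl_version must be the string "1"')
    governance = policy.get("governance", {})
    if governance.get("global_verbs") != ["GET", "POST"]:
        errors.append("governance.global_verbs must be [GET, POST]")
    # 'routes' as the key name is an assumption for this sketch.
    for i, route in enumerate(governance.get("routes", [])):
        label = route.get("name", f"route[{i}]")
        if not route.get("verbs"):
            errors.append(f"{label}: verbs must be non-empty")
        if route.get("exposure") not in ALLOWED_EXPOSURE:
            errors.append(f"{label}: invalid exposure {route.get('exposure')!r}")
    stealth = policy.get("stealth", {})
    if not isinstance(stealth.get("enabled"), bool):
        errors.append("stealth.enabled must be a boolean")
    code = stealth.get("status_code")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(code, bool) or not isinstance(code, int) or not 100 <= code <= 599:
        errors.append("stealth.status_code must be an integer in 100..599")
    return errors


# Illustrative policy with a single made-up route.
example_policy = {
    "dsl_version": "1",
    "governance": {
        "global_verbs": ["GET", "POST"],
        "routes": [
            {"name": "health", "path": "/health", "verbs": ["GET"],
             "exposure": "public", "narrative": "liveness check"},
        ],
    },
    "stealth": {"enabled": True, "status_code": 404},
}

print(validate_policy(example_policy))
```

Returning a list of messages rather than raising on the first failure lets an operator see every violation in one pass before the §2.1 stand-up.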

## Channel position

```
standards#91 (parent, open)
├── #96 Phase A — closed (boj-server: contract + policy-authoring + example; gateway: -)
├── #97 Phase B — closed (gateway#10: mTLS primary path)
├── #98 Phase C — closed (gateway#11: strip; boj-server#106: TrustPolicy clause)
├── #99 Phase D — closed (boj-server#168 on 2026-06-01; gateway#12/#14/#22/#26/#30)
└── #100 Phase E — IN PROGRESS
     ├── E5 runbook draft — boj-server#128 (landed; rehearsal pending)
     ├── E1 loopback prereqs — boj-server#130/#131/#132/#165/#173 (landed)
     ├── E1 deploy spec — http-capability-gateway#38 (landed)
     ├── E1 live policy promotion — THIS PR (in review)
     ├── E1 .ctp signing — owner follow-up
     ├── E2 staging cut-over — owner follow-up
     ├── E3 telemetry verification — owner follow-up
     ├── E4 production rollout — owner follow-up
     └── §6.4 Trustfile flip + §6.5 joint-close — owner-only
```

Refs hyperpolymath/standards#91
Refs hyperpolymath/standards#100

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---
_Generated by [Claude Code](https://claude.ai/code/session_012FiVM8R8FWBgBsUGpnXTZM)_

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>

Development

Successfully merging this pull request may close these issues.

k8s NetworkPolicy: restrict BoJ pod ingress to HCG pod only (HCG tier-2 defence-in-depth)
