Skip to content
Open
14 changes: 14 additions & 0 deletions CLAUDE.md
Original file line number Diff line number Diff line change
Expand Up @@ -315,6 +315,20 @@ The catastrophic case (session-locked after N consecutive hijack verdicts) hard-

Configurable in tracker: `<0.2` on-task, `0.2–0.3` scope-creep (inject reminder), `0.3–0.5` drifting (escalate to judge), `>0.5` hijacked (block).

## Cost & cache-engagement notes

`GET /api/bedrock-metrics` (admin-only Bearer API key) returns in-process per-caller stats: calls, cacheHits, cachedTokenShare, avgInputTokens, estimatedCostUsd. The judge log line in `pretool-interceptor.ts` also carries `in=N/cr=N/cw=N out=N` per call for ad-hoc CloudWatch greps.

**Known issue (2026-05-21, deferred — apply when scaling): prompt cache silently disabled on `eu.anthropic.claude-sonnet-4-6`.** The AWS docs say Sonnet 4.6's minimum cacheable prefix is 1,024 tokens. Empirically on the EU cross-region inference profile the cutoff is closer to **~2,048 tokens**. Our B7.1 system prompt is ~1,766 tokens — under the real threshold — so Bedrock silently skips the cache point and the entire system prompt is billed as uncached input on every call. Confirmed via direct boto3 test: 1,994-token prefix → 0 cache writes; 2,108-token prefix → 2,096 written then read on the next call.

When we scale beyond a single user it'll be worth fixing. The cheapest fix is to add ~300 tokens of static "operating notes / reference examples" at the END of the B7.1 system prompt (`intent-judge.ts` HARDENED_V2_SYSTEM_PROMPT). Padding must be byte-identical across calls to keep the cache key stable. Cost math on the deferred fix:

- One-time write per 5-minute window: 300 padding tokens × $4.125/M = $0.0012
- Savings per cache read: ~2,200 cached tokens × ($3.30 − $0.33)/M = $0.0065
- Break-even at <1 cache hit per write window; with the current judge rate the cache discount drops Sonnet input cost by roughly 40–50% on steady-state traffic.

If we ever cut over to a different model ID (e.g. `anthropic.claude-sonnet-4-6` without the `eu.` prefix, or Claude Sonnet 4.7), re-run the threshold probe via the boto3 snippet in commit history before relying on the documented minimum.

## User permissions — Claude Code allow/deny/ask integration

Two independent features that both touch Claude Code's `permissions.{allow,deny,ask}` configuration. Both ship in the hook + server; both are env-gated.
Expand Down
144 changes: 144 additions & 0 deletions hooks/tests/test_bedrock_metrics.ts
Original file line number Diff line number Diff line change
@@ -0,0 +1,144 @@
/**
* bedrock-metrics — accumulator unit tests.
*
* Validates per-caller stat math and the derived cost / cache-hit rate
* fields without hitting Bedrock. The point of the module is to give
* us operator visibility on real cache-hit rates — these tests just
* make sure the arithmetic stays correct as we add fields.
*
* Run: npx tsx hooks/tests/test_bedrock_metrics.ts
*/

import {
recordBedrockCall,
getBedrockMetrics,
resetBedrockMetrics,
} from "../../src/bedrock-metrics.js";

const c = { green: "\x1b[32m", red: "\x1b[31m", off: "\x1b[0m", dim: "\x1b[2m" };
let PASS = 0;
let FAIL = 0;
const pass = (m: string) => { console.log(` ${c.green}✓${c.off} ${m}`); PASS++; };
const fail = (m: string) => { console.log(` ${c.red}✗${c.off} ${m}`); FAIL++; };
const section = (h: string) => console.log(`\n${c.dim}---${c.off} ${h} ${c.dim}---${c.off}`);

function approx(a: number, b: number, eps = 1e-6): boolean {
return Math.abs(a - b) < eps;
}

function main() {
section("Empty state");
resetBedrockMetrics();
const empty = getBedrockMetrics();
empty.totals.calls === 0 ? pass("totals.calls = 0") : fail(`got ${empty.totals.calls}`);
Object.keys(empty.perCaller).length === 0 ? pass("perCaller empty") : fail("expected empty perCaller");

section("Single call — no cache");
resetBedrockMetrics();
recordBedrockCall("judge", { inputTokens: 4000, outputTokens: 50, durationMs: 1800 });
const s1 = getBedrockMetrics();
s1.perCaller.judge.calls === 1 ? pass("judge calls = 1") : fail(`got ${s1.perCaller.judge.calls}`);
s1.perCaller.judge.cacheHits === 0 ? pass("cacheHits = 0 on no-cache call") : fail("unexpected hits");
s1.perCaller.judge.totalInputTokens === 4000 ? pass("input tokens summed") : fail("input mismatch");
approx(s1.perCaller.judge.cacheHitRate, 0) ? pass("hitRate = 0") : fail(`got ${s1.perCaller.judge.cacheHitRate}`);
approx(s1.perCaller.judge.cachedTokenShare, 0) ? pass("cachedTokenShare = 0") : fail("unexpected share");

// Cost = 4000 input × $3.30/M + 50 output × $16.50/M
// = 0.0132 + 0.000825 = 0.014025
const expected1 = (4000 / 1e6) * 3.30 + (50 / 1e6) * 16.50;
approx(s1.perCaller.judge.estimatedCostUsd, expected1, 1e-6)
? pass(`cost ${s1.perCaller.judge.estimatedCostUsd.toFixed(6)} matches uncached calc`)
: fail(`got ${s1.perCaller.judge.estimatedCostUsd} expected ${expected1}`);

section("Cache hit — discount applied");
resetBedrockMetrics();
recordBedrockCall("judge", {
inputTokens: 4000,
outputTokens: 50,
cacheReadInputTokens: 3500, // 87.5% cached
durationMs: 1800,
});
const s2 = getBedrockMetrics();
s2.perCaller.judge.cacheHits === 1 ? pass("cache hit counted") : fail(`got ${s2.perCaller.judge.cacheHits}`);
approx(s2.perCaller.judge.cacheHitRate, 1) ? pass("hitRate = 1.0") : fail("hitRate wrong");
approx(s2.perCaller.judge.cachedTokenShare, 3500 / 4000)
? pass(`cachedTokenShare = ${s2.perCaller.judge.cachedTokenShare.toFixed(3)}`)
: fail(`got ${s2.perCaller.judge.cachedTokenShare}`);

// Cost = (4000 - 3500) × $3.30/M + 3500 × $0.33/M + 50 × $16.50/M
// = 500 × 0.0000033 + 3500 × 0.00000033 + 50 × 0.0000165
// = 0.00165 + 0.001155 + 0.000825 = 0.00363
const expected2 =
((4000 - 3500) / 1e6) * 3.30 +
(3500 / 1e6) * 0.33 +
(50 / 1e6) * 16.50;
approx(s2.perCaller.judge.estimatedCostUsd, expected2, 1e-6)
? pass(`cost ${s2.perCaller.judge.estimatedCostUsd.toFixed(6)} matches cached calc`)
: fail(`got ${s2.perCaller.judge.estimatedCostUsd} expected ${expected2}`);

// Cache should make this dramatically cheaper than the uncached case.
s2.perCaller.judge.estimatedCostUsd < expected1 * 0.5
? pass("cached cost < 50% of uncached")
: fail(`cached cost ${s2.perCaller.judge.estimatedCostUsd} not much lower than uncached ${expected1}`);

section("Cache write — first call in window");
resetBedrockMetrics();
recordBedrockCall("judge", {
inputTokens: 4000,
outputTokens: 50,
cacheWriteInputTokens: 1760,
durationMs: 1800,
});
const s3 = getBedrockMetrics();
s3.perCaller.judge.cacheWrites === 1 ? pass("cache write counted") : fail("expected write");
// 1760 × $4.125/M for the write, rest at normal input
const expected3 =
((4000 - 1760) / 1e6) * 3.30 +
(1760 / 1e6) * 4.125 +
(50 / 1e6) * 16.50;
approx(s3.perCaller.judge.estimatedCostUsd, expected3, 1e-6)
? pass(`cache-write cost ${s3.perCaller.judge.estimatedCostUsd.toFixed(6)} matches`)
: fail(`got ${s3.perCaller.judge.estimatedCostUsd} expected ${expected3}`);

section("Multiple callers — totals roll up");
resetBedrockMetrics();
recordBedrockCall("judge", { inputTokens: 4000, outputTokens: 50, cacheReadInputTokens: 3500 });
recordBedrockCall("judge", { inputTokens: 4200, outputTokens: 55, cacheReadInputTokens: 3500 });
recordBedrockCall("classifier", { inputTokens: 1900, outputTokens: 80, cacheWriteInputTokens: 1200 });
const s4 = getBedrockMetrics();
s4.totals.calls === 3 ? pass("totals.calls = 3") : fail(`got ${s4.totals.calls}`);
s4.perCaller.judge.calls === 2 ? pass("judge calls = 2") : fail("judge count");
s4.perCaller.classifier.calls === 1 ? pass("classifier calls = 1") : fail("classifier count");
s4.totals.totalInputTokens === (4000 + 4200 + 1900)
? pass("totals.totalInputTokens summed")
: fail(`got ${s4.totals.totalInputTokens}`);
approx(s4.perCaller.judge.cacheHitRate, 1)
? pass("judge hitRate = 1")
: fail("judge hitRate");
approx(s4.perCaller.classifier.cacheHitRate, 0)
? pass("classifier hitRate = 0 (write, not read)")
: fail("classifier hitRate");

section("Unknown caller folds into 'unknown'");
resetBedrockMetrics();
recordBedrockCall("", { inputTokens: 100 });
recordBedrockCall("unknown", { inputTokens: 200 });
const s5 = getBedrockMetrics();
s5.perCaller.unknown?.calls === 2
? pass("empty caller folded into 'unknown'")
: fail(`unknown calls = ${s5.perCaller.unknown?.calls}`);

section("Snapshot includes uptime + timestamp");
const s6 = getBedrockMetrics();
typeof s6.snapshotAt === "string" && s6.snapshotAt.length > 0
? pass(`snapshotAt = ${s6.snapshotAt}`)
: fail("snapshotAt missing");
typeof s6.processUptimeSec === "number" && s6.processUptimeSec >= 0
? pass(`processUptimeSec = ${s6.processUptimeSec}`)
: fail("uptime missing");

console.log(`\n ${PASS} passed, ${FAIL} failed`);
process.exit(FAIL === 0 ? 0 : 1);
}

main();
112 changes: 112 additions & 0 deletions hooks/tests/test_phase8b_pattern_trust.ts
Original file line number Diff line number Diff line change
Expand Up @@ -262,6 +262,118 @@ async function main() {
result.patternTrust === undefined ? pass("[]-vector approvals ignored") : fail("[]-vector counted somehow");
}

// -------------------------------------------------------------------
// Phase 9 — only explicit approvals count for the hard short-circuit.
section("Phase 9: tacit approvals don't drive HARD short-circuit");

{
const approvals = [
makeApproval({ summary: "tacit 1", inputEmbedding: ALIGNED, source: "tacit" }),
makeApproval({ summary: "tacit 2", inputEmbedding: ALMOST, source: "tacit" }),
];
const result = await interceptor.evaluate(
"s-test",
"Bash",
{ command: "ALIGNED something" },
undefined,
"/proj/foo",
"interactive",
undefined,
false,
undefined,
null,
approvals,
true,
);
result.stage !== "pattern-trust-allow"
? pass(`2 tacit matches: no hard short-circuit (stage=${result.stage})`)
: fail("tacit-only matches incorrectly short-circuited");
// Soft signal still feeds the judge prompt (softContext); the
// result.patternTrust annotation is only set on the hard path
// (see InterceptionResult.patternTrust JSDoc), so we don't assert
// on it here.
}

section("Phase 9: 1 explicit + 1 tacit: still no HARD short-circuit");

{
const approvals = [
makeApproval({ summary: "explicit", inputEmbedding: ALIGNED, source: "explicit" }),
makeApproval({ summary: "tacit", inputEmbedding: ALMOST, source: "tacit" }),
];
const result = await interceptor.evaluate(
"s-test",
"Bash",
{ command: "ALIGNED rm" },
undefined,
"/proj/foo",
"interactive",
undefined,
false,
undefined,
null,
approvals,
true,
);
result.stage !== "pattern-trust-allow"
? pass(`mixed 1-explicit-1-tacit: no hard short-circuit (stage=${result.stage})`)
: fail("mixed counted as 2 explicit");
}

section("Phase 9: 2 explicit (mixed with tacit) short-circuits");

{
const approvals = [
makeApproval({ summary: "explicit 1", inputEmbedding: ALIGNED, source: "explicit" }),
makeApproval({ summary: "explicit 2", inputEmbedding: ALMOST, source: "explicit" }),
makeApproval({ summary: "tacit noise", inputEmbedding: ALIGNED, source: "tacit" }),
];
const result = await interceptor.evaluate(
"s-test",
"Bash",
{ command: "ALIGNED rm -rf" },
undefined,
"/proj/foo",
"interactive",
undefined,
false,
undefined,
null,
approvals,
true,
);
result.stage === "pattern-trust-allow"
? pass(`2 explicit + 1 tacit short-circuited (stage=${result.stage})`)
: fail(`expected pattern-trust-allow, got ${result.stage}`);
}

section("Phase 9: legacy approvals (missing source) treated as explicit");

{
// Simulate legacy rows that pre-date the source field by deleting it.
const a1 = makeApproval({ summary: "legacy 1", inputEmbedding: ALIGNED }) as any;
const a2 = makeApproval({ summary: "legacy 2", inputEmbedding: ALMOST }) as any;
delete a1.source;
delete a2.source;
const result = await interceptor.evaluate(
"s-test",
"Bash",
{ command: "ALIGNED rm -rf" },
undefined,
"/proj/foo",
"interactive",
undefined,
false,
undefined,
null,
[a1, a2],
true,
);
result.stage === "pattern-trust-allow"
? pass(`legacy rows count as explicit (stage=${result.stage})`)
: fail(`expected pattern-trust-allow, got ${result.stage}`);
}

} finally {
stub.close();
}
Expand Down
Loading