Review: session work 2026-05-22/23 — Phase 9 + metrics + integration UX#1
Open
cloudygeek wants to merge 8 commits into
Open
Review: session work 2026-05-22/23 — Phase 9 + metrics + integration UX#1cloudygeek wants to merge 8 commits into
cloudygeek wants to merge 8 commits into
Conversation
…tive prompts When Dredd says allow but Claude Code surfaces its own permission prompt (because the tool isn't in the user's local allowlist), the user clicking Yes is real consent we can learn from. Phase 9 captures it. Flow: /evaluate (allow) stashes a tacit-pending candidate; /notification arrives with a permission-style message; /track promotes the candidate with source="tacit" gated on a matching notification in the 60s window. No notification → no promotion (tool was auto-allowed; no consent given). Safety floor: tacit approvals feed the soft pattern-trust signal (judge prompt context) but are filtered out of the Stage 0.5 hard short-circuit count. Hard-override of Dredd policy denies still requires ≥2 explicit prior approvals — a Claude Code native click can never authorise rm -rf. Tests: 21 new assertions in test_phase9_tacit_approval.ts + 4 new cases in test_phase8b_pattern_trust.ts covering tacit filtering, mixed sources, and legacy missing-source rows. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sonnet input tokens dominate ~96% of Bedrock spend. Without per-call
cache-hit info we can't tell whether the cachePoint marker on the
system prompt is actually saving money or whether something is
invalidating the cache key every call.
bedrock-metrics.ts: in-process accumulator keyed by caller name
("judge", "classifier", "promptarmor", "preflight"). Records every
bedrockChat invocation's input/output/cache-read/cache-write tokens
and exposes a snapshot with derived hitRate, cachedTokenShare, and
estimated USD cost (Sonnet 4.6 EU rates).
bedrock-client.ts now takes an optional caller tag and fires the
accumulator. The three prod call sites (judge, classifier, preflight)
pass their tag.
pretool-interceptor.ts judge log line gains token + cache fields:
"judge=consistent(1886ms in=3500/cr=1700/cw=0 out=45)" — greppable
in CloudWatch for ad-hoc cost outlier hunts.
server-hook.ts: GET /api/bedrock-metrics returns the snapshot. CORS
matches the existing dashboard origin policy; no auth (no secrets,
just aggregate counters).
24 unit tests in test_bedrock_metrics.ts cover arithmetic + derived
fields including cache-write pricing at 125% of normal input.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The endpoint is non-sensitive in isolation but it reveals call rates, token volumes, and cost — operational observability that belongs behind the same admin gate as the logs endpoint. Without auth, anyone who guesses the path could scrape it. Auth flow: authenticateHookRequest (Bearer API key) → isAdminEmail (ADMIN_EMAILS in clerk-auth). Non-admin keys get 403; missing/invalid land on 401. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Live /api/bedrock-metrics from prod showed 0% cache-hit rate on the judge after the first 3 calls. The cachePoint marker keys the cache on the exact system-prompt prefix; any per-call variation invalidates it. The only dynamic component was the Phase 8b priorApprovalsBlock (empty when no matches, populated with summaries / similarities / dates when matches fire — so the cache key flips on every transition). Fix: system prompt is now (UNTRUSTED_DIRECTIVE + baseSystemPrompt), bit-identical across calls. The priorApprovalsBlock moves to the head of the user content with a `server_trusted="true"` attribute so the judge still recognises it as authoritative context rather than adversarial agent / tool-output text covered by UNTRUSTED_DIRECTIVE. Expected effect once deployed: judge avgInputTokens stays the same but cachedTokenShare climbs into the 0.4-0.6 range (system block is ~1700 tokens of the typical ~3500-4000 input), giving us roughly 40-50% input-cost savings on steady-state judge traffic. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…nostic Live /api/bedrock-metrics after the 0.1.417 rollout shows 0 cache writes and 0 cache reads even on the first call after a container restart — the first call SHOULD always write the cache. Static system prompt is ~6740 chars / ~1700 tokens (B7.1), comfortably above Sonnet 4.6's 1024-token cache minimum, so the prefix length isn't the issue. Diagnostic: log model ID, system prompt char count, inputTokens, outputTokens, cacheRead/cacheWrite (or "n/a" if missing), and the list of keys present on the usage object for the first 10 calls per process. Lets us see whether Bedrock is returning the cache fields at all on the cross-region inference profile (eu.anthropic.claude-sonnet-4-6). Capped at 10 calls per process so CloudWatch doesn't get spammed once we know what's going on. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Direct boto3 probe of eu.anthropic.claude-sonnet-4-6 shows the cache point engages around 2048 tokens, not the 1024 documented in AWS's prompt-caching table: 1,619 tokens — 0 cache writes 1,994 tokens — 0 cache writes 2,108 tokens — 2,096 written, 2,096 read on the next call Our B7.1 system prompt is ~1,766 tokens — under the real threshold — so the cachePoint marker is a silent no-op today. Single-user prod traffic isn't material; deferred until we scale. Documents the finding in CLAUDE.md "Cost & cache-engagement notes" with the break-even math (300-token static padding pays for itself in <1 cache hit per 5-min window) and leaves a pointer in bedrock-client.ts so the next person touching prompt caching sees the constraint inline. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The dashboard's Integration tab and the bundle's README both jumped from "install hook" → "wire up settings" → "verify" with no mention of the Bearer API key the hook server requires (DREDD_AUTH_MODE=required is the default in prod). A new user following the UI got 401s on every /intent and /evaluate, the hook fail-soft'd, and Claude Code proceeded without Dredd — silently broken because the suggested verify step hit /api/health which answers without auth. Changes: - API Keys tab: plaintext banner now ships a ready-to-run shell snippet that drops the key into ~/.claude/dredd/api-key (chmod 600), the location the hook script reads from by default. textContent prevents HTML injection. - Integration tab: new step 1 "Generate & install your API key" with a link into the API Keys tab. Existing steps renumbered to 2/3/4. The verify step now hits /api/auth-check (see below) instead of /api/health so a missing key actually fails the check. - Bundle README.md: mirrors the new step 1 + verify against /api/auth-check. - New endpoint GET /api/auth-check on the hook role. Runs authenticateHookRequest so missing/invalid keys return 401 and valid keys return 200 + ownerSub/ownerEmail/keyType. CORS so the dashboard can poll it too. Read-only. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
New entry path on the Integration tab: download a single prompt file,
pipe it into a Claude Code session, and Claude walks the user through
the install one step at a time — describing each action and asking
permission before running it.
The prompt is scoped narrowly on purpose. Claude is instructed to:
- Print a numbered summary and STOP for confirmation before touching
any file.
- Only write to ~/.claude/dredd/ and ~/.claude/settings.json (or
<project>/.claude/settings.json).
- Never touch .bashrc, .zshrc, .profile, env files, shell aliases,
or any path outside the allow-list.
- Never exfiltrate the API key.
- Stop and report any unexpected failure rather than "fix" it by
editing other files.
Three surfaces:
- integration-bundle.ts: renderInstallPrompt() now exported; bundle
zip ships claude-install-prompt.txt alongside the existing 3 files.
- server-dashboard.ts: new GET /api/install-prompt returns just the
prompt as text/plain (Clerk-gated, same auth as /api/integration-bundle).
- dashboard.html: Integration tab gains an "Or: let Claude install
Dredd for you" panel at the top with download button + the one-line
`claude < ~/.claude/dredd/claude-install-prompt.txt` snippet. The
existing manual-install flow is preserved unchanged below it.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Owner
Author
Code reviewFound 1 issue:
JudgeAIDredd/src/intent-judge.ts Lines 25 to 30 in 76e4f37 JudgeAIDredd/src/intent-judge.ts Lines 555 to 582 in 76e4f37 🤖 Generated with Claude Code - If this code review was useful, please react with 👍. Otherwise, react with 👎. |
cloudygeek
added a commit
that referenced
this pull request
May 31, 2026
Weekly deny review (week of 2026-05-24, prod jaid-sessions): 117 denies, dominated by two false-positive classes. 1. AWS identity introspection (#1 judge-deny FP, 20/117 across 10 sessions): - allowlist `aws sts get-caller-identity` + `aws configure list-profiles` (optional leading AWS_* env prefix). Reads no secrets — the AWS `whoami`. - stop treating `2>&1`/`>&2`/fd-dups and `/dev/null` sinks as "file write via redirection" in sanitizeForMatching (they aren't), so the allowlist bites on the real `... 2>&1 | head` forms. Real file writes (`> out.json`) and non-null device writes (`> /dev/sda`) still review/deny. 2. Destructive rm on build artifacts (24/30 policy denies): sandbox-relative carve-out. Forward the live PreToolUse `cwd` from the hook -> /evaluate -> interceptor -> policy, so relative rm targets resolve against where the command runs: - rm -f (non-recursive) of bounded / in-project / /tmp targets -> allow - rm -rf (recursive) in-project -> review (judge/user adjudicates .venv, __pycache__) instead of a hard block; /tmp -> allow; out-of-sandbox, $var/glob/~/.., or the project root itself -> deny - chained `cd && rm` tracks the cwd so targets resolve against the cd'd dir With no cwd/projectRoot threaded, behaviour is unchanged (legacy carve-outs). Tests: test_policy_allowlist.ts (+20 -> 59), test_policy_rm_sandbox.ts (25). All policy suites + tsc --noEmit green. Not deployed — hook + server images need a rebuild + redeploy to take effect. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Bundles 8 commits from the 2026-05-22/23 session for a tight code review.
Branched off
b36d8222(the last commit onintent-history-activebefore thissession started). When the review is done, the work can be fast-forwarded back
into
intent-history-active.Commits in order
e7335b16feat(approvals): Phase 9 — tacit approval capture from Claude Code-native prompts976ceba5feat(metrics): per-caller Bedrock cost + cache-hit visibility3d086056fix(metrics): gate /api/bedrock-metrics behind admin API keybfc79c86perf(judge): move dynamic priorApprovalsBlock out of system prompt50405f81chore(metrics): log first 10 Bedrock calls per process for cache diagnostica0ce07f4docs: note Sonnet 4.6 prompt-cache threshold empirically ~2048 tokens342fae17fix(dashboard,bundle): wire API-key step into the integration flow76e4f37cfeat(integration): "let Claude install Dredd" prompt + downloadableTest plan
npx tsc --noEmitclean (run after each commit)npx tsx hooks/tests/test_phase8a_approval_embedding.ts— 8 passnpx tsx hooks/tests/test_phase8b_pattern_trust.ts— 14 pass (incl. 4 new Phase 9 cases)npx tsx hooks/tests/test_phase9_tacit_approval.ts— 21 passnpx tsx hooks/tests/test_bedrock_metrics.ts— 24 pass/api/auth-checkreturns 401/200 correctly,/api/install-promptis Clerk-gated,/api/bedrock-metricsrequires adminKnown caveats (separately filed)
eu.anthropic.claude-sonnet-4-6(docs say 1024). Documented in CLAUDE.md "Cost & cache-engagement notes"; padding fix deferred until scaling./api/feed,/api/mode,/api/session-mode,/api/session-modes,/api/data-status,/session/:id,/register) — out of scope for this PR; will file separately.🤖 Generated with Claude Code