feat(grading): hidden-criteria firewall + held-out blend as a substrate primitive by drewstone · Pull Request #283 · tangle-network/agent-eval

drewstone · 2026-06-24T18:08:26Z

What

Lifts the held-out / hidden-criteria grading FIREWALL out of the coding-benchmark example (in agent-runtime) into a domain-agnostic substrate primitive, so any domain — research, legal, tax, content — can grade an agent on hidden criteria it never saw, not just coding.

New module: src/hidden-criteria-grading.ts. Additive subpath on the root index — no breaking change, zero consumer updates needed.

The two reusable, domain-free pieces

The coding-LOCAL execution mechanism (node --test, TAP parsing) stays in the example. Only the general pieces are lifted, composed from existing types (JudgeScore) — nothing reinvented, no node/test/TS/exec/regex baked into the substrate.

1. Field routing by destination (the firewall as a type). A scenario tags each field by where it is allowed to flow:

| `FieldDestination` | Reaches the agent? | Use |
| --- | --- | --- |
| `agent-visible` | yes | the prompt / task |
| `develop-against` | yes (intentional, TDD) | a visible example/test |
| `grading-only` | never | the held-out suite / answer key |
| `judge-only` | never | rubric anchors / design intent |

- `routeFields(routing, values)` builds the routed field set from a domain's (field → destination) and (field → value) maps, failing loudly on a missing value.
- `assertNoHiddenLeak(fields, agentContext)` is the firewall: it throws `ValidationError` if any `grading-only` / `judge-only` value appears in the exact text that reaches the agent.
- `agentVisibleFields(...)` returns the safe-to-render fields, so a caller assembles the context from the routing instead of hand-picking.
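
As a rough illustration of the routing idea, here is a minimal, self-contained sketch. It is not the module's source: the `RoutedField` shape, the local `ValidationError` stand-in, and the verbatim-substring leak check are all assumptions made for the example (the real module is described as taking parameterized matchers).

```ts
// Hypothetical re-implementation for illustration only; shapes and matching rule are assumed.
type FieldDestination = 'agent-visible' | 'develop-against' | 'grading-only' | 'judge-only'

interface RoutedField {
  name: string
  destination: FieldDestination
  value: string
}

// Local stand-in for the library's ValidationError.
class ValidationError extends Error {}

const HIDDEN: ReadonlySet<FieldDestination> = new Set<FieldDestination>(['grading-only', 'judge-only'])

// Pair each routed field with its value; fail loudly when a value is missing.
function routeFields(
  routing: Record<string, FieldDestination>,
  values: Record<string, string | undefined>,
): RoutedField[] {
  return Object.entries(routing).map(([name, destination]) => {
    const value = values[name]
    if (value === undefined) throw new ValidationError(`missing value for field "${name}"`)
    return { name, destination, value }
  })
}

// Only fields that are allowed to reach the agent.
function agentVisibleFields(fields: RoutedField[]): RoutedField[] {
  return fields.filter(f => !HIDDEN.has(f.destination))
}

// Assumed leak rule: a hidden value appearing verbatim in the agent context.
function assertNoHiddenLeak(fields: RoutedField[], agentContext: string): void {
  for (const f of fields) {
    if (HIDDEN.has(f.destination) && f.value.length > 0 && agentContext.includes(f.value)) {
      throw new ValidationError(`hidden field "${f.name}" (${f.destination}) leaked into the agent context`)
    }
  }
}

// Usage: build the context from the routing, then check it.
const routed = routeFields(
  { question: 'agent-visible', required: 'grading-only' },
  { question: 'Summarize the holding.', required: 'Must cite Smith v. Jones' },
)
const context = agentVisibleFields(routed).map(f => f.value).join('\n\n')
assertNoHiddenLeak(routed, context) // does not throw: only the question is in the context
```

Assembling the context from `agentVisibleFields` rather than by hand means the check above should only ever fire when something bypasses the routing.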

2. Hidden-criteria grading. The domain supplies its own grader; the substrate supplies firewall enforcement + the composite:

- `HiddenCriteriaGrader<TArtifact, THidden> = (artifact, hiddenCriteria, signal?) => { passRate, total }`: the one seam a non-coding domain implements. The coding node-test executor is ONE implementation a consumer plugs in.
- `gradeOnHidden({ artifact, hiddenCriteria, grader, firewall })` re-asserts the firewall at grading time on the real agent context, then runs the grader.
- `hiddenGrade(passed, total)`: the single-sourced honest-zero pass-rate rule (`total === 0` → `passRate` 0, never a spurious pass).
- `blendHeldout(heldoutPassRate, judgeScore, weights?)`: the composite (default 0.7 hidden correctness / 0.3 judge quality; weights renormalized; inputs clamped to [0,1]).
- `withHeldoutBlend(score, heldoutPassRate, weights?)` wraps a judge's score so the reported composite becomes the held-out-weighted blend (passes a failed verdict through untouched).
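
The pass-rate rule and the blend arithmetic can be sketched in a few lines. This is an illustrative sketch under assumptions, not the module's code: the `BlendWeights` field names are invented for the example, and throwing on zero-sum weights is just one possible reading of the "zero-sum guard" (the actual behavior is not stated here). `withHeldoutBlend` is left out because the shape of `JudgeScore` is not shown.

```ts
// Hypothetical sketch; function names mirror the PR text, bodies and types are assumed.
interface HiddenGrade { passRate: number; total: number }
interface BlendWeights { heldout: number; judge: number } // assumed field names

const DEFAULT_WEIGHTS: BlendWeights = { heldout: 0.7, judge: 0.3 }

const clamp01 = (x: number): number => Math.min(1, Math.max(0, x))

// Honest zero: with no hidden checks there is nothing to pass, so the rate is 0.
function hiddenGrade(passed: number, total: number): HiddenGrade {
  return { passRate: total === 0 ? 0 : clamp01(passed / total), total }
}

// Weighted mean of the two clamped inputs, with weights renormalized to sum to 1.
function blendHeldout(
  heldoutPassRate: number,
  judgeScore: number,
  weights: BlendWeights = DEFAULT_WEIGHTS,
): number {
  const sum = weights.heldout + weights.judge
  // Assumed guard: this sketch refuses weights that do not sum to a positive number.
  if (!(sum > 0)) throw new RangeError('blend weights must sum to a positive number')
  return (weights.heldout * clamp01(heldoutPassRate) + weights.judge * clamp01(judgeScore)) / sum
}

// Worked example: 0.7 * 0.5 + 0.3 * 0.9, approximately 0.62
const defaultBlend = blendHeldout(0.5, 0.9)
// Equal weights { 2, 2 } renormalize to 0.5 / 0.5: (0.5 + 0.9) / 2, approximately 0.7
const evenBlend = blendHeldout(0.5, 0.9, { heldout: 2, judge: 2 })
```

With the default weights, a polished answer that fails every hidden check (pass rate 0, judge score 1) still scores only about 0.3, which is the point of weighting hidden correctness above judge quality.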

How a NON-coding domain plugs in

```ts
import { routeFields, gradeOnHidden, blendHeldout, hiddenGrade } from '@tangle-network/agent-eval'

// Assumes question, sample, required, rubric, artifact, hiddenCriteria,
// agentContext and judgeComposite are already in scope for the scenario.

// 1. Declare where each field flows
const fields = routeFields(
  { question: 'agent-visible', sample: 'develop-against', required: 'grading-only', rubric: 'judge-only' },
  { question, sample, required, rubric },
)

// 2. Bring YOUR OWN grader (no node/test here): the fraction of required citations found in the brief.
//    Parameter shapes are inferred from how the fields are used below.
const legalGrader = (artifact: { brief: string }, hidden: { mustCite: string[] }) =>
  hiddenGrade(hidden.mustCite.filter(c => artifact.brief.includes(c)).length, hidden.mustCite.length)

// 3. Grade behind the firewall, blend with the judge
const heldout = await gradeOnHidden({ artifact, hiddenCriteria, grader: legalGrader, firewall: { fields, agentContext } })
const score = blendHeldout(heldout.passRate, judgeComposite)
```
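
To make step 3 more concrete, the grading-time firewall re-check could look roughly like the sketch below. Everything here is a local stand-in written for illustration: the `Firewall` shape is simplified to a list of hidden strings, the leak check is a plain substring match, and treating `signal?` as an `AbortSignal` is an assumption.

```ts
// Hypothetical sketch of grading behind the firewall; not the module's source.
interface HiddenGrade { passRate: number; total: number }

type HiddenCriteriaGrader<TArtifact, THidden> = (
  artifact: TArtifact,
  hiddenCriteria: THidden,
  signal?: AbortSignal, // assumption: the optional third argument is an AbortSignal
) => HiddenGrade | Promise<HiddenGrade>

// Simplified stand-in for the routed fields plus the text the agent actually saw.
interface Firewall {
  hiddenValues: string[]
  agentContext: string
}

async function gradeOnHidden<TArtifact, THidden>(opts: {
  artifact: TArtifact
  hiddenCriteria: THidden
  grader: HiddenCriteriaGrader<TArtifact, THidden>
  firewall: Firewall
  signal?: AbortSignal
}): Promise<HiddenGrade> {
  // Re-assert the firewall first: a leaked run must never produce a grade.
  for (const value of opts.firewall.hiddenValues) {
    if (value.length > 0 && opts.firewall.agentContext.includes(value)) {
      throw new Error('hidden value leaked into the agent context; refusing to grade')
    }
  }
  // Only then hand the artifact and the hidden criteria to the domain's grader.
  return opts.grader(opts.artifact, opts.hiddenCriteria, opts.signal)
}

// A research-domain grader: the share of required findings mentioned in a report.
const researchGrader: HiddenCriteriaGrader<{ report: string }, { findings: string[] }> = (artifact, hidden) => {
  const passed = hidden.findings.filter(f => artifact.report.includes(f)).length
  const total = hidden.findings.length
  return { passRate: total === 0 ? 0 : passed / total, total }
}
```

Checking the context again at grading time, rather than trusting an earlier check, guards against a context that was rebuilt or edited between scenario setup and the agent run.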

Tests

20 focused tests on a non-coding (legal-brief) domain — proving the firewall has no domain coupling. They cover the two required proofs explicitly:

(a) assertNoHiddenLeak / gradeOnHidden reject a grading-only (and judge-only) field reaching the agent context.
(b) blendHeldout composes correctly (default + renormalized weights, clamping, zero-sum guard, withHeldoutBlend composite replacement + failed-verdict pass-through).

Verification

pnpm typecheck + pnpm build + pnpm test (251 files / 2581 tests) + pnpm lint + pnpm run verify:package — all green. Version trio bumped together: npm package.json, clients/python/pyproject.toml, __init__.py → 0.100.0.

Grain mirrors the recently-landed treatment-gate.ts: pure predicates + pure composition, fail-loud, parameterized matchers/graders, no domain literal in the module. Placed next to test-graded-scenario.ts / partition-held-out.ts (a scorecard/grading concept that makes sense without a running loop).

…te primitive

Lift the held-out / hidden-criteria grading firewall out of the coding benchmark example into a domain-agnostic primitive so any domain (research, legal, tax, content) can grade an agent on criteria it never saw.

Two reusable, domain-free pieces, composed from existing types (JudgeScore), no node/test/TS/exec baked in:

- Field routing by destination: a scenario tags each field agent-visible / develop-against / grading-only / judge-only; routeFields + assertNoHiddenLeak enforce that a grading-only/judge-only value never reaches the agent context (fail-loud ValidationError).
- Hidden-criteria grading: the domain supplies its own (artifact, hiddenCriteria) => { passRate, total } grader; the substrate provides firewall enforcement (gradeOnHidden) + the held-out-weighted composite (blendHeldout / withHeldoutBlend, default 0.7/0.3).

The coding node-test executor stays in the example as ONE grader implementation.

20 focused tests on a non-coding (legal) domain prove the firewall rejects a leaked grading-only field and that blendHeldout composes correctly.

tangletools

✅ Auto-approved PR — `7e582fce`

Blanket team auto-approval is enabled for this reviewer service.
The full PR reviewer audit still runs separately and will publish findings if it detects issues.

tangletools · auto-approval · reason: blanket_auto_approve · 2026-06-24T18:08:33Z

tangletools approved these changes Jun 24, 2026

drewstone merged commit aa066bd into main Jun 24, 2026
1 check passed

drewstone merged 1 commit into main from lift/hidden-criteria-firewall
