docs: post-POWER-16 go-live plan + verifiably-smarter-and-cheaper thesis by drewstone · Pull Request #278 · tangle-network/agent-runtime

drewstone · 2026-06-13T20:58:20Z

What

The rederivation Drew asked for after POWER-16 retracted the depth>breadth keystone (+16.4pp n=16 → +4.1pp CI[−1.6,+10.2] tie at n=72). Two docs:
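A minimal sketch of why the n=72 result reads as a tie: the interval [−1.6, +10.2] straddles zero, so the sign of the effect is not established. The `Effect` type, `classifyEffect` function, and three-way verdict are hypothetical names for illustration and are not taken from the repo; only the numbers come from the description above.

```typescript
// Hypothetical helper: the names and the win/tie/loss verdict are illustrative only.
type Verdict = "win" | "tie" | "loss";

interface Effect {
  deltaPp: number;  // point estimate, in percentage points
  ciLowPp: number;  // lower bound of the confidence interval
  ciHighPp: number; // upper bound of the confidence interval
  n: number;        // sample size
}

// An effect counts as a win or a loss only when the whole interval
// sits on one side of zero; otherwise it is reported as a tie.
function classifyEffect(e: Effect): Verdict {
  if (e.ciLowPp > 0) return "win";
  if (e.ciHighPp < 0) return "loss";
  return "tie";
}

// Numbers quoted in the PR description for the n=72 rerun.
const power16: Effect = { deltaPp: 4.1, ciLowPp: -1.6, ciHighPp: 10.2, n: 72 };
console.log(classifyEffect(power16)); // prints "tie"
```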

docs/go-live-plan.md: the full plan (5-lens workflow → synthesis → adversarial hardening, every code claim verified), covering the honest thesis, SDK roadmap deltas, the off/eco/standard/thorough/max tier table, the 2-week slice (Mode-0 + Observe on gtm), the do-not-claim firewall, the #1 build (shipped in #277, feat(intelligence): Observe + Mode-0 Intelligence SDK wrapper + effort/billing boundary), and the #1 experiment (E4: does the cost flywheel compound).
docs/intelligence-sdk.md — the contract gains the honest-claim block, Mode 0 (intelligence-off), and the Verified→Gated rename.

The thesis

Verifiably smarter and cheaper. POWER-16 killed within-run-cleverness-beats-blind-at-equal-compute, NOT the ability to make agents smarter. Smarter comes from gated search (spend compute, certify the winner) + cheap serving of the certified artifact (the cost flywheel, −12 to −31%). 'Smarter' is allowed with its gate, CI, and n attached — the gate is the moat (everyone else sells lucky streaks), not a ban on the word. Proven now: a certified program transfers +31/+36pp at lower cost (a step). Prove next: that it compounds (E4).
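The rule that 'smarter' must travel with its gate, CI, and n can be pictured as a small check on a claim record. This is a sketch under assumptions: `QualityClaim`, `missingEvidence`, and the field names are hypothetical, and the actual do-not-claim firewall in docs/intelligence-sdk.md may be expressed quite differently.

```typescript
// Hypothetical claim record: field names are illustrative, not the SDK's contract.
interface QualityClaim {
  text: string;          // the claim as it would be published
  gate?: string;         // name of the gate that certified the result
  ci?: [number, number]; // confidence interval, in percentage points
  n?: number;            // sample size behind the estimate
}

// Returns the evidence fields a claim still lacks; an empty list means
// the claim carries its gate, CI, and n and may be stated.
function missingEvidence(claim: QualityClaim): string[] {
  const missing: string[] = [];
  if (!claim.gate) missing.push("gate");
  if (!claim.ci) missing.push("ci");
  if (claim.n === undefined || claim.n <= 0) missing.push("n");
  return missing;
}

function isPublishable(claim: QualityClaim): boolean {
  return missingEvidence(claim).length === 0;
}

// A bare claim is rejected; the same claim with evidence attached passes.
// "example-gate" is a placeholder name; the interval and n are the POWER-16 figures.
const bare: QualityClaim = { text: "smarter" };
const backed: QualityClaim = { text: "smarter", gate: "example-gate", ci: [-1.6, 10.2], n: 72 };
console.log(missingEvidence(bare), isPublishable(backed));
```

Note that this check only confirms the evidence is attached; whether the interval actually supports the claim is a separate question.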

…n thesis, Mode 0 (intelligence-off), do-not-claim firewall, Verified→Gated

The depth>breadth keystone retracted to a tie at n=72; the SDK sells cost + verification + transfer, not quality improvement. Adds the honest-claim firewall (normative do-not-claim block), the intelligence-OFF billing floor (sandbox-stream, inference+compute only, billing-line-on-the-spawn-line), and renames Verified PRs → Gated PRs. Full plan: docs/go-live-plan.md (5-lens rederivation, code-verified).

…not a ban on 'smarter')

Corrects the over-rotation: POWER-16 killed within-run-cleverness-beats-blind-at-equal-compute, NOT the ability to make agents smarter. Smarter comes from gated search (spend compute, certify the winner) + cheap serving of the certified artifact. 'Smarter' is allowed with its gate+CI+n attached; that's the differentiation. Aligns the billing boundary to usage-classification + spawn-gating (not budget-pool surgery).

Replaces the minimal 4-sentence baseline in trata-gate.mts and trata-gepa.mts with the surface found by GEPA across 9 runs (+8.6pp holdout on deepseek-v4-flash, confirmed twice independently). Adds five sentences: labeled-section structure, named-peer benchmarking with EV/EBITDA/P/E/margin metrics, verbatim guidance citation, IRR computation with arithmetic, and explicit SYSTEM_PROMPT env override for future experimentation.
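One way an explicit SYSTEM_PROMPT env override can work is sketched below. The function name `resolveSystemPrompt`, the placeholder default text, and the choice to treat a blank value as unset are all assumptions for illustration; the actual logic in trata-gate.mts and trata-gepa.mts is not shown in this PR and may differ.

```typescript
// Placeholder only: the real baseline prompt lives in trata-gate.mts / trata-gepa.mts.
const DEFAULT_SYSTEM_PROMPT = "<default system prompt>";

// Takes the environment as a plain record so the function is easy to test.
// Assumption: a missing or whitespace-only SYSTEM_PROMPT falls back to the default.
function resolveSystemPrompt(env: Record<string, string | undefined>): string {
  const override = env["SYSTEM_PROMPT"]?.trim();
  return override ? override : DEFAULT_SYSTEM_PROMPT;
}

console.log(resolveSystemPrompt({}));
console.log(resolveSystemPrompt({ SYSTEM_PROMPT: "You are a terse analyst." }));
```

In a Node script the caller would pass `process.env`. Keeping the environment as a parameter lets an experiment swap prompts without editing the file that holds the default.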

tangletools

✅ Auto-approved PR — `6488bac8`

Blanket team auto-approval is enabled for this reviewer service.
The full PR reviewer audit still runs separately and will publish findings if it detects issues.

tangletools · auto-approval · reason: blanket_auto_approve · 2026-06-13T20:58:27Z

drewstone added 3 commits June 13, 2026 12:04

tangletools approved these changes Jun 13, 2026

drewstone merged commit 9168002 into main Jun 13, 2026
1 check passed

drewstone merged 3 commits into main from docs/go-live-rederivation
