feat(bench): wire autosize end-to-end + H22 comparator-parity #127
Merged
End-to-end autosize harness wiring (pretable + ag-grid + mui; tanstack unsupported), with H22 comparator-parity hypothesis evaluator reusing the min-repeat gate from PR #125, and a full B2 matrix re-run with autosize included. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Six-task plan for wiring autosize through the bench harness end-to-end, adding evaluateH22 with the min-repeat gate, and re-running the B2 matrix with autosize included. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Adds "autosize" to the bench-runner supportedScripts allowlist (gated to S2 and to pretable | ag-grid | mui — tanstack remains unsupported per the B2 spec), to the apps/bench query-state parser, and to the BenchScriptName Extract narrow in bench-types. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
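The gating described in this commit could be sketched as follows. This is a minimal illustration, not the actual bench-runner code: autosizeSupport, AUTOSIZE_ADAPTERS, and the "S1" scenario id are hypothetical names introduced here; only the "autosize" script, S2, and the four adapter ids come from the commit message.

```typescript
// Hypothetical sketch of the autosize allowlist gate.
// Only S2 and the adapter ids below are taken from the commit message.
type AdapterId = "pretable" | "ag-grid" | "mui" | "tanstack";
type ScenarioId = "S1" | "S2"; // "S1" is an assumed placeholder for any non-S2 scenario
type Support = "supported" | "unsupported";

const AUTOSIZE_SCENARIO: ScenarioId = "S2";
const AUTOSIZE_ADAPTERS: ReadonlySet<AdapterId> = new Set<AdapterId>([
  "pretable",
  "ag-grid",
  "mui",
]);

// Returns "unsupported" before anything would mount when the
// scenario/adapter pair falls outside the allowlist.
function autosizeSupport(scenario: ScenarioId, adapter: AdapterId): Support {
  const allowed =
    scenario === AUTOSIZE_SCENARIO && AUTOSIZE_ADAPTERS.has(adapter);
  return allowed ? "supported" : "unsupported";
}
```

Under these assumptions, autosizeSupport("S2", "tanstack") and autosizeSupport("S1", "pretable") both yield "unsupported", while autosizeSupport("S2", "mui") yields "supported".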
Adds a single-event autosize latency helper that awaits the adapter's autosize callback and one rAF, reporting interaction_latency_ms as "call-to-paint" timing. Mirrors the shape of measureBenchKeySequenceRun. Also unblocks the now-accepted "autosize" script in the query-state parser by retargeting the existing fallback-to-defaults test to an unrelated bogus value. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
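A rough sketch of the "call-to-paint" timing described above, assuming the helper awaits the autosize callback and then a single animation frame. The function name measureAutosizeCallToPaint and its injectable waitForFrame and now parameters are assumptions made for illustration (they keep the sketch runnable outside a browser); the real measureBenchAutosizeRun takes (root, adapterId, autosize) and may differ in detail.

```typescript
// Hypothetical sketch; not the actual measureBenchAutosizeRun implementation.
type AutosizeFn = () => void | Promise<void>;

interface AutosizeTiming {
  interaction_latency_ms: number;
}

async function measureAutosizeCallToPaint(
  autosize: AutosizeFn,
  waitForFrame: () => Promise<void>, // resolves on the next frame
  now: () => number, // monotonic clock in milliseconds
): Promise<AutosizeTiming> {
  const start = now();
  await autosize(); // adapter callback; may be sync or async
  await waitForFrame(); // one frame so layout work has a chance to paint
  return { interaction_latency_ms: now() - start };
}

// In a browser, the injected pieces would plausibly be:
//   waitForFrame: () => new Promise<void>((r) => requestAnimationFrame(() => r()))
//   now:          () => performance.now()
```

Injecting the frame wait and the clock is a design choice of this sketch only: it lets the ordering (callback first, then one frame) be checked with fakes.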
Pretable, AG Grid, and MUI adapters now publish their autosize entry
point through a new onAutosizeReady callback. bench-app.tsx captures it
in autosizeApiRef and dispatches measureBenchAutosizeRun on the autosize
script, mirroring the updateApiRef + measureBenchUpdatesRun chain.
Replaces AG Grid's pre-emptive onGridReady autosize branch (which only
ran at mount) with a callback so autosize fires on bench-script
dispatch. MUI now exposes apiRef via useGridApiRef so the harness can
call apiRef.current.autosizeColumns({ includeOutliers: true }) — async
on v7+. TanStack accepts the prop for harness uniformity but the
bench-runner returns "unsupported" before the adapter ever mounts.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
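The callback hand-off described in this commit can be illustrated with a framework-free sketch. Everything here is a stand-in: FakeGridApi, mountFakeAdapter, and dispatchScript are hypothetical, and a plain object plays the role of the React ref. Only the names onAutosizeReady and autosizeApiRef come from the commit message.

```typescript
// Hypothetical, framework-free sketch of the onAutosizeReady hand-off.
type AutosizeFn = () => void | Promise<void>;

interface AdapterProps {
  onAutosizeReady?: (autosize: AutosizeFn) => void;
}

// Stand-in for a grid's native API (assumed shape).
interface FakeGridApi {
  autosizeColumns(): Promise<void>;
}

// Adapter side: publish a closure over the native API at mount,
// but do not invoke autosize here.
function mountFakeAdapter(api: FakeGridApi, props: AdapterProps): void {
  props.onAutosizeReady?.(() => api.autosizeColumns());
}

// Harness side: a ref-like holder captured at mount.
const autosizeApiRef: { current: AutosizeFn | null } = { current: null };

type DispatchResult = "ran" | "not-ready" | "ignored";

// Autosize fires only when the bench script is dispatched.
async function dispatchScript(script: string): Promise<DispatchResult> {
  if (script !== "autosize") return "ignored";
  const autosize = autosizeApiRef.current;
  if (autosize === null) return "not-ready";
  await autosize();
  return "ran";
}
```

The point of the pattern, as sketched, is that mounting the adapter only registers the entry point; the native autosize call happens on dispatch, which is what allows the harness to time it.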
Adds H22 ("pretable autosize is within a single 60Hz frame and within
10% of the best ag-grid/mui comparator on S2"). Reuses the H1
comparator-parity pattern: 16 ms single-frame floor, 10% parity band,
≥10 repeats per side before resolving a tight-zone (0.9–1.2) ratio.
Hoists COMPARATOR_PARITY_MIN_REPEATS to module scope so H1 and H22
share a single source of truth.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
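One way the H22 rule could be expressed is sketched below. The thresholds (16 ms, 10% band, tight zone 0.9 to 1.2, 10 repeats) are taken from the commit message; the function name evaluateAutosizeParity, the Sample shape, the status names, and the order of the checks are assumptions, and the real evaluateH22 in scripts/bench-matrix.mjs may be structured differently.

```typescript
// Hypothetical sketch of the H22 comparator-parity rule.
const COMPARATOR_PARITY_MIN_REPEATS = 10;
const FRAME_BUDGET_MS = 16; // single 60Hz frame, per the commit message
const PARITY_RATIO_MAX = 1.1; // "within 10%" of the best comparator
const TIGHT_ZONE = { min: 0.9, max: 1.2 };

interface Sample {
  latencyMs: number; // summary latency for the run (assumed positive)
  repeats: number;
}

type Status = "satisfied" | "failing" | "insufficient" | "directional";

function evaluateAutosizeParity(
  pretable: Sample | undefined,
  comparators: Sample[], // ag-grid / mui results on S2
): Status {
  if (pretable === undefined) return "insufficient"; // no pretable data
  if (pretable.latencyMs > FRAME_BUDGET_MS) return "failing"; // frame budget
  if (comparators.length === 0) return "directional"; // nothing to compare to

  // Best comparator = lowest latency.
  const best = comparators.reduce((a, b) => (b.latencyMs < a.latencyMs ? b : a));
  const ratio = pretable.latencyMs / best.latencyMs;

  // Inside the tight zone, require enough repeats on both sides.
  const inTightZone = ratio >= TIGHT_ZONE.min && ratio <= TIGHT_ZONE.max;
  const minRepeats = Math.min(pretable.repeats, best.repeats);
  if (inTightZone && minRepeats < COMPARATOR_PARITY_MIN_REPEATS) {
    return "insufficient";
  }
  return ratio <= PARITY_RATIO_MAX ? "satisfied" : "failing";
}
```

With the figures reported later in this PR (pretable 5.3 ms vs MUI 11 ms at 3 repeats each), the ratio is about 0.482, which lies outside the tight zone, so the repeat gate is skipped and the sketch resolves to "satisfied".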
S2/hypothesis/Chromium, all 13 scripts including autosize, repeats=3, ~5 min wall-clock. H22 satisfied: pretable autosize 5.3 ms vs MUI 11 ms (ratio 0.482, outside the tight zone — gate does not apply). H1 also flipped from failing → satisfied vs the 2026-05-08 milestone (parity at n=3 with mui this run; matches the n=20 correction documented in the previous repo-memory entry). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Summary
Wires the autosize script end-to-end through the bench harness and adds the H22 comparator-parity hypothesis evaluator. Closes B2 follow-up #3: the autosize gap captured in the 2026-05-08 milestone is now resolved.
- query-state, bench-types, and packages/bench-runner accept autosize (gated to S2 and to pretable | ag-grid | mui; tanstack returns unsupported).
- measureBenchAutosizeRun(root, adapterId, autosize) in bench-runtime.ts: single-event "call-to-paint" timing (await the callback, then one rAF); reports interaction_latency_ms. Mirrors the shape of measureBenchKeySequenceRun.
- Pretable (grid.autosizeColumns()), AG Grid (gridApi.autoSizeColumns(colIds, false)), and MUI (apiRef.current.autosizeColumns({ includeOutliers: true }), async on v7+) each accept onAutosizeReady and call back with a closure over their native autosize API. bench-app.tsx captures it in autosizeApiRef and dispatches on the autosize script. AG Grid's old pre-emptive mount-time autosize branch is replaced by the callback. MUI now exposes apiRef via useGridApiRef().
- evaluateH22(runs) in scripts/bench-matrix.mjs. Pretable autosize must complete within a 60Hz frame (≤ 16 ms) and within 10% of the best ag-grid/mui comparator on S2. Reuses H1's tight-zone min-repeat gate via a now-shared module-level COMPARATOR_PARITY_MIN_REPEATS = 10 constant.
Matrix re-run
S2/hypothesis/Chromium, all 13 scripts including autosize, repeats=3, ~5 min wall-clock. Output: status/milestones/2026-05-09-b2-with-autosize.hypotheses.json (the original 2026-05-08-b2-comparative-bench.hypotheses.json is unchanged).
H22 status
satisfied: pretable 5.3 ms vs MUI 11 ms (ratio 0.482, comfortably below the tight zone, so the min-repeat gate does not apply at n=3). A 20-repeat re-run is not required because the verdict resolved outside the tight zone.
Other status changes vs the 2026-05-08 milestone: H1 flipped from failing to satisfied (parity at n=3 with mui this run; matches the n=20 correction documented in the previous repo-memory entry). No other hypotheses changed status.
What's NOT in this PR
- satisfied outside the tight zone at n=3.
- /bench page changes: the page renders only H1 today.
Test plan
- pnpm -w typecheck
- pnpm -w test (all suites pass; bench-matrix tests cover the 6 H22 scenarios: satisfied / failing-floor / failing-parity / insufficient-tight-zone / insufficient-no-pretable / directional-no-comparator)
- pnpm -w lint
- pnpm format
- pnpm bench:matrix end-to-end re-run (3 repeats, 13 scripts, ~5 min)
🤖 Generated with Claude Code