docs(research): pretable vs MUI scroll perf diagnostic #124
Merged
Phase C of B2 follow-up #1. Verdict: the gap is noise. The high-repeat S2/hypothesis/scroll rerun shows no meaningful MUI advantage and recommends tightening the H1-sensitive repeat protocol instead of scoping a perf-fix PR.

Spec: docs/superpowers/specs/2026-05-09-b2-followup-perf-diagnostic-design.md

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
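The "noise" verdict rests on comparing the mean scroll-frame gap to a 2σ noise floor. A minimal sketch of that style of comparison, assuming a floor built from the standard error of the difference of two sample means; the memo's exact floor construction isn't shown here and may differ, so treat this as illustrative only:

```javascript
// Illustrative sketch only: one common way to decide "noise vs real"
// for a gap between two benchmark means. The memo's exact noise-floor
// formula isn't reproduced here and may differ.
function noiseVerdict(a, b) {
  const meanDiff = a.mean - b.mean;
  // Standard error of the difference of two independent sample means
  const sigmaDiff = Math.sqrt(a.sigma ** 2 / a.n + b.sigma ** 2 / b.n);
  const floor = 2 * sigmaDiff; // 2-sigma noise floor
  return {
    meanDiff,
    floor,
    verdict: Math.abs(meanDiff) <= floor ? "noise" : "real",
  };
}

// With the rerun's numbers (pretable 9.07 ± 0.20, MUI 9.14 ± 0.19, n = 20),
// the ~-0.07 ms mean gap sits inside the floor, so the verdict is "noise".
```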
blove added a commit that referenced this pull request on May 9, 2026:
#125)

PR #124 (perf-diag rerun at n=20) showed the B2 H1 "failing" verdict was a low-sample artifact: pretable 9.07 ms ± 0.20 vs MUI 9.14 ms ± 0.19, a mean diff of −0.065 ms, inside the 2σ noise floor of 0.40 ms. The original n=3 ratio of 1.115 was sample noise, not a real regression.

Five targeted corrections:

- Add status/milestones/2026-05-09-b2-h1-high-repeat-correction.json overlaying the original B2 evidence with the n=20 result and correctedH1.status = "satisfied". The original B2 milestone is left intact.
- Rewrite the apps/website/app/bench/page.tsx prose to a parity framing at high repeats. verdictFor now respects a parityAdapters set so the table doesn't crown a "fastest" off n=3 noise; the H1 status reflects the corrected verdict.
- Add a min-repeat gate to scripts/bench-matrix.mjs evaluateH1: when the pretable / best-full-grid frame-p95 ratio is in the tight zone (0.9 ≤ r ≤ 1.2) AND either adapter has < 10 repeats, return insufficient with guidance to re-run at --repeats=10+. Outside the tight zone the existing behavior is unchanged.
- Add a new test covering the insufficient case; the existing failing test is rewritten to use a clearly out-of-zone ratio (1.6) so the failing path stays exercised.
- Append a 2026-05-09 entry to docs/research/repo-memory.md overturning the H1 flip narrative, with the new evaluator gate documented.

AG Grid (16.7 ms p95, 1 blank gap, 2 px row-height drift) and TanStack (16.7 ms p95, 1 blank gap) status from the B2 n=3 runset is not corrected here: both are >50% above pretable, well outside the noise zone. They remain ~1.7× pretable's scroll_frame_p95_ms with quality gaps that pretable does not have.

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
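The min-repeat gate described above can be sketched as follows. The function and field names (`gateH1`, `frameP95Ms`, `repeats`) are hypothetical stand-ins, since the real `evaluateH1` in `scripts/bench-matrix.mjs` isn't shown here:

```javascript
// Hypothetical sketch of the min-repeat gate; names are stand-ins
// for whatever scripts/bench-matrix.mjs actually uses.
function gateH1(pretable, bestFullGrid) {
  const ratio = pretable.frameP95Ms / bestFullGrid.frameP95Ms;
  const tightZone = ratio >= 0.9 && ratio <= 1.2;
  const lowRepeats = pretable.repeats < 10 || bestFullGrid.repeats < 10;
  if (tightZone && lowRepeats) {
    // Too close to call at this sample size: demand more repeats
    // instead of emitting a pass/fail verdict off noise.
    return {
      status: "insufficient",
      guidance: "ratio is inside 0.9-1.2; re-run with --repeats=10 or higher",
    };
  }
  // Outside the tight zone (or with enough repeats) the original
  // behavior applies; a simple ratio threshold stands in for it here.
  return { status: ratio <= 1.2 ? "satisfied" : "failing" };
}
```

With the n=3 numbers from B2 (ratio ≈ 0.99, 3 repeats each) this returns `insufficient`, while the rewritten failing test's out-of-zone ratio of 1.6 still exercises the `failing` path.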
blove added a commit that referenced this pull request on May 11, 2026:
…rdict) (#133)

* docs(specs): pretable scroll-with-render perf diagnostic design

  Three-phase research PR mirroring PR #124's pattern: high-repeat (n=20) re-run, conditional Playwright trace capture, research memo. Diagnoses whether the PR #130 cheap-render anomaly (16.4 ms vs 10.3 ms for format and heavy-render) is real or a low-sample artifact.

  Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* docs(plans): pretable scroll-with-render perf diagnostic plan

  Seven-task plan mirroring PR #124's three-phase pattern: n=20 matrix re-run, conditional Playwright trace capture, research memo.

  Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* docs(research): pretable scroll-with-render perf diagnostic memo

  Verdict: noise. The PR #130 cheap-render anomaly (16.4 ms vs ~10.3 ms for format and heavy-render at n=3) was a sampling artifact. At higher repeats, scroll-with-render is at parity with (in fact marginally faster than) the other two:

  | Script                   |   n | mean (ms) | σ (ms) |
  | ------------------------ | --: | --------: | -----: |
  | scroll-with-format       |   8 |      9.36 |   0.80 |
  | scroll-with-render       |   7 |      8.97 |   0.35 |
  | scroll-with-heavy-render |   6 |      9.15 |   0.13 |

  Both 2σ pairs (cheap-vs-format, cheap-vs-heavy) are well within the noise floor. Same shape as PR #124's finding at larger magnitude.

  The matrix run completed only ~36% of planned repeats (Playwright flake; not investigated), but the observed σ values make the verdict unambiguous: PR #130's 6 ms gap is ~21σ away from the observed distribution. No perf-fix PR needed.

  Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* chore: prettier-format perf-diag artifacts

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
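The ~21σ figure above is reproducible from the memo's own numbers, assuming the anomaly is measured against the high-repeat scroll-with-render distribution:

```javascript
// PR #130's n=3 cheap-render reading vs the high-repeat distribution
// for scroll-with-render (mean 8.97 ms, sigma 0.35 ms, n = 7).
const anomalyMs = 16.4;
const meanMs = 8.97;
const sigmaMs = 0.35;
const distanceSigma = (anomalyMs - meanMs) / sigmaMs;
// (16.4 - 8.97) / 0.35 ≈ 21.2 -> the n=3 reading sits ~21 sigma out
```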
blove added a commit that referenced this pull request on May 13, 2026:
…get on pretable filter-text) (#134)

* docs(specs, plans): interaction borderline perf diagnostic

  Tightens PR #131's two borderline numbers (pretable filter-text 17.7 ms, tanstack vs pretable filter-metadata 15.7/16.0 ms) via an n=20 re-run. Mirrors PR #124 / PR #133's pattern.

  Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* docs(research): interaction borderline perf diagnostic memo

  Verdict: pretable filter-text is real-over-budget (16.79 ± 0.31 ms at n=20); tanstack vs pretable filter-metadata is noise-tied (1.6 ms mean diff vs a 23 ms 2σ noise floor; tanstack's σ at n=8 is 11.6 ms).

  Incidental finding: pretable filter-metadata is also over budget at the mean (17.51 ± 2.44 ms). PR #131's 16.0 ms n=3 reading was a low-end sample. The homepage prose claims filter-metadata is "clear of" the single-frame budget; that's no longer accurate.

  Three recommendations queued (editorial cleanup + perf-fix investigation); see the memo for details.

  Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* chore: prettier-format borderline diag artifacts

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
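One plausible reading of the "real-over-budget" call above is that even the 2σ low end of the filter-text distribution clears the 16 ms budget. A sketch under that assumption; the memo's actual decision rule isn't shown here and may differ:

```javascript
// Hedged sketch: call a result "real-over-budget" if even the
// 2-sigma low end of the measured distribution exceeds the frame
// budget. The memo's actual decision rule may differ.
function isOverBudget(stat, budgetMs) {
  const lowEnd = stat.mean - 2 * stat.sigma;
  return lowEnd > budgetMs;
}

// pretable filter-text at n=20: 16.79 +/- 0.31 ms vs a 16 ms budget.
// Low end = 16.79 - 0.62 = 16.17 ms -> over budget even at 2 sigma.
```

Under the same rule, filter-metadata's wider spread (17.51 ± 2.44 ms) would not clear the bar at 2σ, which is consistent with the memo flagging it only as over budget "at the mean".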
blove added a commit that referenced this pull request on May 13, 2026:
* docs(specs): pretable wrapped-text filter perf diagnostic design

  Three-phase pattern (trace + analyze + memo) with a conditional fix in the same PR. Mirrors PR #124 / PR #133 with a Phase D escape hatch for single-cause low-risk fixes. Targets pretable's interaction scripts landing 1-2 ms over the 16 ms single-frame budget on Chromium S2/hypothesis (PR #134 finding).

  Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* docs(plans): pretable wrapped-text filter perf diagnostic plan

  Five-task plan: trace capture, manual analysis, memo, conditional fix, gates + PR. Auto-merge if memo-only; hold if a fix ships.

  Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* chore(bench): Playwright trace for pretable filter-text perf diagnostic

* docs(research): pretable wrapped-text filter perf diagnostic memo

* style(docs): prettier format perf diagnostic memo + plan + spec

* chore: remove force-added trace binary; project convention is no committed traces

  The trace was captured locally per the spec but force-added past .gitignore; the project's standing pattern is that `status/traces/*.zip` is gitignored. Removing the binary; the memo is updated to note the trace is local-only and that Playwright's default action-trace format doesn't capture per-function timeline data anyway (the real blocker for this investigation, separately documented in the memo's "bench-harness gap" follow-up).

  Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* docs(research): memo references local-only trace; no committed binary

  The trace path reference in the memo was pointing at a binary that was force-added past .gitignore and then removed in the prior commit. Update the reference to note the trace is local-only and that the Playwright action-trace format wouldn't have given flame-graph data anyway.

  Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Summary
B2 follow-up #1: diagnosed the 1 ms scroll-frame-p95 gap between pretable and MUI X DataGrid Community on S2/hypothesis.
- Ran `pnpm bench:matrix` at 20 repeats for pretable + mui only.
- Artifact: `status/milestones/2026-05-09-perf-diag-high-repeat.scroll.json`.
- Verdict: noise.
- Memo: `docs/research/2026-05-09-pretable-vs-mui-scroll-perf.md`.

Verdict
noise
Pretable averaged 9.07 ms; MUI averaged 9.14 ms. The mean gap was -0.065 ms (pretable - mui) against a 2σ noise floor of 0.401 ms, so the original B2 n=3 MUI advantage did not survive high-repeat measurement.

What's NOT in this PR
Test plan
- `pnpm --filter @pretable/app-bench build`
- `pnpm bench:matrix --project=chromium --adapters=pretable,mui --scenarios=S2 --scripts=scroll --scale=hypothesis --repeats=20`
- `pnpm -w typecheck`
- `pnpm -w test`
- `pnpm -w lint`
- `pnpm format`