backport: feat(ai): optimize text accumulation runtime to O(N) by aayush-kapoor · Pull Request #15906 · vercel/ai

aayush-kapoor · 2026-06-08T18:49:25Z

Background

manual backport for #15897

Checklist

All commits are signed (PRs with unsigned commits cannot be merged)
Tests have been added / updated (for bug fixes / features)
Documentation has been added / updated (for bug fixes / features)
A patch changeset for relevant packages has been added (for bug fixes / features - run pnpm changeset in the project root)
I have reviewed this pull request (self-review)

meitalbensinai · 2026-06-10T09:29:08Z

Validation results + report of a sibling site this PR does not cover

Validated this PR (specifically the v6 backport, #15906) against an in-the-wild reproduction of the bug it was filed for. Short version: the PR is correctly written and a real improvement, but the bug still fires on tool-input-heavy workloads because of a third O(N²) site in the same file that this PR does not address. Posting here so you can decide whether to expand scope before merging or land it with a tracking issue.

Stack used

opencode (anomalyco fork) at v1.15.12 source rebuild + this PR's patched ai@6.0.168
For "both fixes" group: also patched the sibling opencode/processor.ts site we filed in anomalyco/opencode#30072 (same chunked-text shape as this PR)
Model: minimax/minimax-m2.7 via OpenRouter (verbose-reasoning + heavy edit tool inputs — the workload class that triggers the bug)
Instance: protonmail/webclients SWE-bench-Pro instance 7e54526774… (heavyweight TS monorepo; historically a reliable bug-firer)
N=5 per group, sequential, same environment

Results

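| Config | n | Resolved | Median total | Median mean s/step | Median max-step | Max of max-step (p100) |
|---|---|---|---|---|---|---|
| Both fixes (this PR + opencode#30072) | 5 | 5/5 | 692s | 6.3s | 88s | 97s |
| This PR only | 5 | 4/5 | 1075s | 8.6s | 83s | 103s |
| Unpatched | 5 | 5/5 | 801s | 7.4s | 91s | 120s |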

Max-step distribution per group (sorted desc):

Both fixes: 97s, 89s, 88s, 86s, 48s
This PR only: 103s, 99s, 83s, 61s, 59s
Unpatched: 120s, 94s, 91s, 88s, 63s

In every run, across all configs, the slowest step takes 48–120s. In a healthy run, no individual LLM step should take more than 15s on this model, so this signature indicates the bug is still firing, just attenuated.

The third site

packages/ai/src/ui/process-ui-message-stream.ts:577

case 'tool-input-delta': {
  const partialToolCall = state.partialToolCalls[chunk.toolCallId];
  …
  partialToolCall.text += chunk.inputTextDelta;          // ← same pattern as text-delta / reasoning-delta

  const { value: partialArgs } = await parsePartialJson(
    partialToolCall.text,                                // ← forces flatten on every chunk
  );

Same text += shape this PR fixes for the text and reasoning branches in the same switch. MiniMax M2.7 is especially exposed because its edit tool calls carry multi-line diffs streamed in many small chunks; at step ~50+ in a long agent loop, the partialToolCall.text for an in-flight edit grows large enough that the per-chunk concat + parse hits quadratic time. That matches what we observe — runs clean for ~50 steps, then late-step spikes once tool-input streaming bytes have accumulated.
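To make the quadratic cost concrete, here is a minimal cost model rather than the SDK's actual code. It assumes each delta triggers a parse that scans the whole cumulative string, as the snippet above does with parsePartialJson, and it counts characters scanned instead of measuring time. The function name `countScannedChars` and the fixed chunk size are illustrative assumptions.

```typescript
// Hypothetical cost model: counts how many characters a parse-on-every-delta
// loop would scan in total. It does not call the real parsePartialJson.
function countScannedChars(chunkCount: number, chunkSize: number): number {
  let accumulatedLength = 0;
  let scanned = 0;
  for (let i = 0; i < chunkCount; i++) {
    accumulatedLength += chunkSize; // models `text += delta`
    scanned += accumulatedLength; // a full parse rescans everything so far
  }
  return scanned;
}

// Doubling the number of chunks roughly quadruples the total work.
const small = countScannedChars(1000, 8);
const large = countScannedChars(2000, 8);
console.log(small, large, large / small);
```

Under this model the total is chunkSize × n(n + 1)/2, so the work grows with the square of the chunk count even though each individual delta is tiny.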

Why a naïve `prepareTextAccumulator` here is harder

The parsePartialJson(partialToolCall.text) call on every delta needs the cumulative string, so a lazy-join getter alone doesn't break the quadratic — every delta still flattens. Options:

Incremental partial-JSON parser that consumes deltas without rebuilding the full string each time (most correct, real engineering).
Buffer-and-flush: only run parsePartialJson every N deltas (or after a debounce window). Drops some UI smoothness, large perf win.
Chunk + soft-rejoin cap: store as chunks, only flatten when needed for parsePartialJson, but rejoin if _chunks.length exceeds a threshold to bound worst-case.
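As a rough illustration of how the buffer-and-flush and chunk-with-rejoin ideas could be combined, the sketch below stores deltas as chunks and only joins and parses on every Nth delta or on an explicit flush. Everything here is hypothetical: `createBufferedToolInput`, `flushEvery`, and the injected `parse` callback are stand-ins and not part of the `ai` package, and the demo uses `text.length` in place of a real partial-JSON parser.

```typescript
// Hypothetical sketch: accumulate deltas as chunks, parse only when due.
function createBufferedToolInput<T>(
  parse: (text: string) => T, // stand-in for a partial-JSON parser
  flushEvery: number, // assumed tuning knob: parse once per this many deltas
) {
  let chunks: string[] = [];
  let pending = 0;
  let lastParsed: T | undefined;
  let parseCount = 0;

  // Join the chunks, parse once, and collapse the chunk list to one entry.
  function flush(): T | undefined {
    if (pending > 0) {
      const text = chunks.join('');
      chunks = [text]; // soft rejoin keeps the chunk list short
      lastParsed = parse(text);
      parseCount++;
      pending = 0;
    }
    return lastParsed;
  }

  // Append a delta; returns the most recent parse result (possibly stale).
  function push(delta: string): T | undefined {
    chunks.push(delta);
    pending++;
    if (pending >= flushEvery) flush();
    return lastParsed;
  }

  return { push, flush, getParseCount: () => parseCount };
}

// Demo: five deltas, parsing every fourth delta plus one final flush.
const toolInput = createBufferedToolInput((text) => text.length, 4);
for (const delta of ['ab', 'cd', 'ef', 'gh', 'ij']) toolInput.push(delta);
const finalValue = toolInput.flush();
console.log(finalValue, toolInput.getParseCount()); // 10 2
```

Note that parsing every N deltas divides the number of full parses by N, which lowers the constant factor but leaves the total work quadratic in the input length. Only an incremental parser would remove the repeated rescans entirely.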

Happy to file the follow-up PR if you'd like — wanted to flag it before #15906/#15897 merge so the scope decision is informed. Either way, thanks for the clean lifecycle design on the existing fix; the WeakMap + explicit finalize is nicer than what we shipped on our side.

lgrammel · 2026-06-10T17:08:52Z

Added a benchmark that reproduces the quadratic effect with a high number of chunks / small chunk size:

50k:   1088.913 ms
100k:  4106.260 ms
150k:  8736.345 ms
200k: 15703.122 ms
250k: 24122.616 ms

However, it does not show that the changes here fix it.
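As a quick sanity check on the timings above, the growth exponent can be estimated from consecutive pairs: if time scales as n^k, then k ≈ log(t2/t1) / log(n2/n1). The sketch below applies that to the posted figures, assuming the labels 50k to 250k denote chunk counts. The helper name `growthExponent` is illustrative.

```typescript
// Posted benchmark figures as [assumed chunk count, milliseconds].
const timings: Array<[number, number]> = [
  [50000, 1088.913],
  [100000, 4106.26],
  [150000, 8736.345],
  [200000, 15703.122],
  [250000, 24122.616],
];

// Estimate k in time ~ n^k from two measurements.
function growthExponent(a: [number, number], b: [number, number]): number {
  return Math.log(b[1] / a[1]) / Math.log(b[0] / a[0]);
}

// One estimate per consecutive pair of measurements.
const exponents = timings
  .slice(1)
  .map((point, i) => growthExponent(timings[i], point));
console.log(exponents.map((k) => k.toFixed(2)));
```

Each pairwise estimate lands close to 2, which is consistent with the quadratic behavior the benchmark was written to reproduce.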

backport: feat(ai): optimize text accumulation runtime to O(N)

c6634bd

github-actions Bot assigned aayush-kapoor Jun 8, 2026

vercel Bot deployed to Preview June 8, 2026 18:50

aayush-kapoor marked this pull request as draft June 9, 2026 18:13

meitalbensinai mentioned this pull request Jun 10, 2026

feat(ai): optimize text accumulation runtime to O(N) #15897

Draft

5 tasks

aayush-kapoor wants to merge 1 commit into release-v6.0 from aayush/backport-optimization
