feat(ai): optimize text accumulation runtime to O(N)#15897

Draft: aayush-kapoor wants to merge 8 commits into main from aayush/runtime-optimization-chunk

aayush-kapoor (Collaborator) commented Jun 8, 2026

Background

reported in #15670

the old code does this on every delta:

textPart.text += chunk.delta;

which step by step looks like:

start: text = ''

delta 'Hel':
text = '' + 'Hel'
=> 'Hel'

delta 'lo ':
text = 'Hel' + 'lo '
=> copies 'Hel', then appends 'lo '
=> 'Hello '

delta 'wor':
text = 'Hello ' + 'wor'
=> copies 'Hello ', then appends 'wor'
=> 'Hello wor'

delta 'ld':
text = 'Hello wor' + 'ld'
=> copies 'Hello wor', then appends 'ld'
=> 'Hello world'

for tiny strings this is fine. for long streams, the copied prefix keeps growing.

It gets worse when something reads .text between writes because that forces the JS engine to materialize/flatten the string before the next append.
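
To put rough numbers on the walkthrough above, here is a minimal cost-model sketch. It assumes the worst case described here, where every append copies the whole existing prefix; real engines may defer that copy until the string is read. The function name `copiedPrefixChars` is made up for illustration.

```typescript
// Cost model only: assumes every append copies the entire existing prefix,
// as in the step-by-step walkthrough. Real engines may defer the copy.
function copiedPrefixChars(deltas: string[]): number {
  let length = 0; // length of the accumulated text so far
  let copied = 0; // total prefix characters copied across all appends
  for (const delta of deltas) {
    copied += length; // the existing prefix is copied before the delta is appended
    length += delta.length;
  }
  return copied;
}

// 0 + 3 + 6 + 9 = 18 copied characters for an 11-character result
const walkthroughCost = copiedPrefixChars(['Hel', 'lo ', 'wor', 'ld']);

// 3 * (0 + 1 + ... + 999) = 1,498,500 copied characters for 3,000 characters of output
const longStreamCost = copiedPrefixChars(new Array<string>(1000).fill('abc'));
```

Under this model the copied total grows with the square of the number of deltas, while the output itself only grows linearly.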

Summary

the new code keeps chunks internally as

chunks.push(chunk.delta);

which step by step looks like:

start:
chunks = []
cachedText = ''

delta 'Hel':
chunks = ['Hel']

delta 'lo ':
chunks = ['Hel', 'lo ']

delta 'wor':
chunks = ['Hel', 'lo ', 'wor']

delta 'ld':
chunks = ['Hel', 'lo ', 'wor', 'ld']

so when someone reads:
textPart.text

the getter does:

chunks.join('')
=> 'Hello world'

Then it caches that:

chunks = ['Hello world']
cachedText = 'Hello world'

If .text is read again before a new delta arrives, it returns the cached string.
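
A minimal sketch of the push-then-lazy-join idea described above. This is not the PR's implementation: the class name `ChunkedText` and its members are hypothetical, and it folds the cached string into the chunk array instead of keeping a separate `cachedText` field.

```typescript
// Hypothetical sketch; names are illustrative and not taken from the PR.
class ChunkedText {
  private chunks: string[] = [];

  // Appending only pushes a reference, so no prefix is copied here.
  append(delta: string): void {
    this.chunks.push(delta);
  }

  // Join at most once per batch of appends, then keep the joined string
  // as the single remaining chunk so repeated reads reuse it.
  get text(): string {
    if (this.chunks.length > 1) {
      this.chunks = [this.chunks.join('')];
    }
    return this.chunks[0] ?? '';
  }

  // Overwriting the text replaces all accumulated chunks.
  set text(value: string) {
    this.chunks = [value];
  }

  // Exposed only so the collapse-after-read behavior can be observed.
  get chunkCount(): number {
    return this.chunks.length;
  }
}

const helloPart = new ChunkedText();
for (const delta of ['Hel', 'lo ', 'wor', 'ld']) {
  helloPart.append(delta);
}
const joinedHello = helloPart.text; // 'Hello world'; chunkCount is now 1
```

Reads between deltas still pay for a join over the text accumulated so far, so this helps most when reads are much rarer than deltas.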

Manual Verification

N/A, but I have asked some providers to verify the fix by applying the patch manually.

Checklist

  • All commits are signed (PRs with unsigned commits cannot be merged)
  • Tests have been added / updated (for bug fixes / features)
  • Documentation has been added / updated (for bug fixes / features)
  • A patch changeset for relevant packages has been added (for bug fixes / features - run pnpm changeset in the project root)
  • I have reviewed this pull request (self-review)

Future Work

investigate to see if similar patterns exist in the codebase

Related Issues

fixes #15670

Comment thread packages/ai/src/util/text-accumulator.ts Outdated
lgrammel (Collaborator) commented Jun 9, 2026

We would need some benchmarks that show this is an actual improvement (and that can be used to check for regressions).
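
One possible shape for such a benchmark, as a rough sketch only. The function names and the delta count are made up, `Date.now()` is a coarse timer, and the `charCodeAt` call is an assumption about how to simulate a reader between writes (it may or may not force the engine to flatten the string). Note that the two strategies are not read equally often here: the concat version is touched after every delta, while the chunked version is read once at the end.

```typescript
// Minimal timing sketch; names and sizes are illustrative, not from the PR.
function accumulateWithConcat(deltas: string[]): string {
  let text = '';
  for (const delta of deltas) {
    text += delta;
    // Simulate a reader between writes (assumed to nudge the engine to flatten).
    text.charCodeAt(text.length - 1);
  }
  return text;
}

function accumulateWithChunks(deltas: string[]): string {
  const chunks: string[] = [];
  for (const delta of deltas) {
    chunks.push(delta);
  }
  return chunks.join(''); // single join at the end
}

// Runs a function once and reports its result with elapsed wall-clock time.
function timeMs<T>(run: () => T): { result: T; ms: number } {
  const start = Date.now();
  const result = run();
  return { result, ms: Date.now() - start };
}

const benchDeltas = new Array<string>(5000).fill('token ');
const concatRun = timeMs(() => accumulateWithConcat(benchDeltas));
const chunkRun = timeMs(() => accumulateWithChunks(benchDeltas));
console.log(`concat: ${concatRun.ms}ms, chunks: ${chunkRun.ms}ms`);
```

A regression check would need repeated runs, larger inputs, and a proper benchmark harness; this only shows the comparison being asked for.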

@meitalbensinai


Validation results + report of a sibling site this PR does not cover

Validated this PR (specifically the v6 backport, #15906) against an in-the-wild reproduction of the bug it was filed for. Short version: the PR is correctly written and a real improvement, but the bug still fires on tool-input-heavy workloads because of a third O(N²) site in the same file that this PR does not address. Posting here so you can decide whether to expand scope before merging or land it with a tracking issue.

Stack used

  • opencode (anomalyco fork) at v1.15.12 source rebuild + this PR's patched ai@6.0.168
  • For "both fixes" group: also patched the sibling opencode/processor.ts site we filed in anomalyco/opencode#30072 (same chunked-text shape as this PR)
  • Model: minimax/minimax-m2.7 via OpenRouter (verbose-reasoning + heavy edit tool inputs — the workload class that triggers the bug)
  • Instance: protonmail/webclients SWE-bench-Pro instance 7e54526774… (heavyweight TS monorepo; historically a reliable bug-firer)
  • N=5 per group, sequential, same environment

Results

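| Config | n | resolved | med total | med mean s/step | med max-step | p100 max-step |
| --- | --- | --- | --- | --- | --- | --- |
| Both fixes (this PR + opencode#30072) | 5 | 5/5 | 692s | 6.3s | 88s | 97s |
| This PR only | 5 | 4/5 | 1075s | 8.6s | 83s | 103s |
| Unpatched | 5 | 5/5 | 801s | 7.4s | 91s | 120s |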

Max-step distribution per group (sorted desc):

  • Both fixes: 97s, 89s, 88s, 86s, 48s
  • This PR only: 103s, 99s, 83s, 61s, 59s
  • Unpatched: 120s, 94s, 91s, 88s, 63s

Every single run, in every config, has at least one step taking 48–120s. In a healthy run, no individual LLM step should take >15s on this model. The signature is the bug still firing — just attenuated.

The third site

packages/ai/src/ui/process-ui-message-stream.ts:577

case 'tool-input-delta': {
  const partialToolCall = state.partialToolCalls[chunk.toolCallId];
  
  partialToolCall.text += chunk.inputTextDelta;          // ← same pattern as text-delta / reasoning-delta

  const { value: partialArgs } = await parsePartialJson(
    partialToolCall.text,                                // ← forces flatten on every chunk
  );

Same text += shape this PR fixes for the text and reasoning branches in the same switch. MiniMax M2.7 is especially exposed because its edit tool calls carry multi-line diffs streamed in many small chunks; at step ~50+ in a long agent loop, the partialToolCall.text for an in-flight edit grows large enough that the per-chunk concat + parse hits quadratic time. That matches what we observe — runs clean for ~50 steps, then late-step spikes once tool-input streaming bytes have accumulated.

Why a naïve prepareTextAccumulator here is harder

The parsePartialJson(partialToolCall.text) call on every delta needs the cumulative string, so a lazy-join getter alone doesn't break the quadratic — every delta still flattens. Options:

  1. Incremental partial-JSON parser that consumes deltas without rebuilding the full string each time (most correct, real engineering).
  2. Buffer-and-flush: only run parsePartialJson every N deltas (or after a debounce window). Drops some UI smoothness, large perf win.
  3. Chunk + soft-rejoin cap: store as chunks, only flatten when needed for parsePartialJson, but rejoin if _chunks.length exceeds a threshold to bound worst-case.
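
A rough sketch of what option 2 (buffer-and-flush) could look like. Everything here is hypothetical: the class name `ThrottledToolInput`, the `parseEvery` parameter, and the synchronous `parse` callback, which is a stand-in for whatever partial-JSON parser the caller uses and does not mirror the library's actual signature.

```typescript
// Hypothetical sketch of option 2; not the library's API.
class ThrottledToolInput<T> {
  private chunks: string[] = [];
  private pending = 0; // deltas received since the last parse
  private latest: T | undefined = undefined;
  private readonly parse: (text: string) => T;
  private readonly parseEvery: number;

  constructor(parse: (text: string) => T, parseEvery: number) {
    this.parse = parse;
    this.parseEvery = parseEvery;
  }

  // Stores the delta; returns true when this delta triggered a parse.
  push(delta: string): boolean {
    this.chunks.push(delta);
    this.pending += 1;
    if (this.pending < this.parseEvery) {
      return false;
    }
    this.flush();
    return true;
  }

  // Joins and parses if anything is pending. Call this when the tool
  // input ends so the trailing deltas are not left unparsed.
  flush(): T | undefined {
    if (this.pending > 0) {
      const text = this.chunks.join('');
      this.chunks = [text];
      this.latest = this.parse(text);
      this.pending = 0;
    }
    return this.latest;
  }
}

let parseCalls = 0;
const toolInput = new ThrottledToolInput<number>((text: string) => {
  parseCalls += 1;
  return text.length; // stand-in "parse result"
}, 3);
for (const delta of ['a', 'b', 'c', 'd', 'e', 'f', 'g']) {
  toolInput.push(delta);
}
const finalParsed = toolInput.flush(); // 3 parses in total for 7 deltas
```

This cuts the number of full-string joins and parses by roughly a factor of `parseEvery`; each parse still walks the whole accumulated input, so it reduces the cost without making it linear.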

Happy to file the follow-up PR if you'd like — wanted to flag it before #15906/#15897 merge so the scope decision is informed. Either way, thanks for the clean lifecycle design on the existing fix; the WeakMap + explicit finalize is nicer than what we shipped on our side.

Comment on lines +66 to +80
Object.defineProperty(part, '__textAccumulator', {
  configurable: true,
  value: accumulator,
});

Object.defineProperty(part, 'text', {
  configurable: true,
  enumerable: true,
  get() {
    return accumulator.getText();
  },
  set(value: string) {
    accumulator.setText(value);
  },
});


this is weird: why are those functions not methods on the class?

  switch (chunk.type) {
    case 'text-start': {
-     const textPart: TextUIPart = {
+     const textPart = prepareTextAccumulator<TextUIPart>({


the map of part to accumulator should be maintained in this function, not in text-accumulator (which accumulates a single text)
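
One way to read this suggestion, as a hedged sketch: the stream-processing function owns a `WeakMap` from part to accumulator, and the accumulator itself only knows about a single text. The `TextPart` type, `createStreamState`, and its `start`/`delta`/`end` methods are hypothetical and do not reflect the PR's code.

```typescript
// Hypothetical sketch; types and function names are illustrative only.
type TextPart = { type: 'text'; text: string };

// Accumulates a single text and knows nothing about parts.
class TextAccumulator {
  private chunks: string[] = [];

  append(delta: string): void {
    this.chunks.push(delta);
  }

  getText(): string {
    if (this.chunks.length > 1) {
      this.chunks = [this.chunks.join('')];
    }
    return this.chunks[0] ?? '';
  }
}

function createStreamState() {
  // The stream-processing code owns the part -> accumulator mapping.
  const accumulators = new WeakMap<TextPart, TextAccumulator>();

  return {
    start(): TextPart {
      const part: TextPart = { type: 'text', text: '' };
      accumulators.set(part, new TextAccumulator());
      return part;
    },
    delta(part: TextPart, delta: string): void {
      accumulators.get(part)?.append(delta);
    },
    // Explicit finalize: write the joined text back onto the plain part.
    end(part: TextPart): void {
      const accumulator = accumulators.get(part);
      if (accumulator) {
        part.text = accumulator.getText();
        accumulators.delete(part);
      }
    },
  };
}

const streamState = createStreamState();
const streamedPart = streamState.start();
for (const delta of ['Hel', 'lo ', 'wor', 'ld']) {
  streamState.delta(streamedPart, delta);
}
streamState.end(streamedPart); // streamedPart.text is now 'Hello world'
```

The trade-off in this variant is that `part.text` stays empty until `end` is called, so anything that reads the part mid-stream would need to go through the accumulator instead.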

Comment on lines +42 to +49
type TextAccumulatorPart = {
  text: string;
  __textAccumulator?: TextAccumulator;
};

function getTextAccumulator<PART extends { text: string }>(part: PART) {
  return (part as TextAccumulatorPart).__textAccumulator;
}


i don't like this

Development

Successfully merging this pull request may close these issues.

perf: O(N²) quadratic memmove in processUIMessageStream + DefaultStreamTextResult on long thinking-mode streams

3 participants