perf: O(N²) quadratic memmove in processUIMessageStream + DefaultStreamTextResult on long thinking-mode streams #15670

@meitalbensinai

Summary

`processUIMessageStream` and `DefaultStreamTextResult` accumulate streaming text and reasoning deltas with `part.text += chunk.delta`. On long thinking-mode streams (Anthropic extended-thinking, MiniMax M2, GPT o-series, Qwen3-thinking, etc.) this is O(N²) in cumulative text length: V8/JSC try to keep the rope lazy, but any read of `.text` between writes (UI render loop, NDJSON serializer, anything that flattens) forces re-allocation + memcpy of the prior content on every subsequent `+=`.
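
To make the quadratic claim concrete, here is a rough cost model. It assumes the worst case described above: every append re-copies all previously accumulated text, because a read flattened the string in between. The function names are hypothetical and only serve the illustration.

```typescript
// Hypothetical cost model, not part of the `ai` package.
// Worst case: append number k (1-based) copies the k * deltaLength characters
// that make up the new cumulative string.
function worstCaseCharsCopied(deltaCount: number, deltaLength: number): number {
  // deltaLength * (1 + 2 + ... + deltaCount) = deltaLength * n * (n + 1) / 2
  return (deltaLength * deltaCount * (deltaCount + 1)) / 2;
}

// Linear case: every character is copied exactly once.
function linearCharsCopied(deltaCount: number, deltaLength: number): number {
  return deltaCount * deltaLength;
}

// 10,000 deltas of 200 characters each:
console.log(worstCaseCharsCopied(10_000, 200)); // 10001000000
console.log(linearCharsCopied(10_000, 200)); // 2000000
```

Under this model the worst case moves about 5,000 times more characters than the linear case for that stream size, and the gap grows in proportion to the number of deltas.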

Same bug shape as #14619 (already fixed for `@ai-sdk/mcp`), now manifesting in the streaming-text accumulators of the UI/SDK core.

Reproducer

import { processUIMessageStream, createStreamingUIMessageState } from 'ai';

const N = 10000;
const CHUNK = 'x'.repeat(200);

const stream = new ReadableStream({
  start(c) {
    c.enqueue({ type: 'text-start', id: '1' });
    for (let i = 0; i < N; i++) c.enqueue({ type: 'text-delta', id: '1', delta: CHUNK });
    c.enqueue({ type: 'text-end', id: '1' });
    c.close();
  }
});

const state = createStreamingUIMessageState({ lastMessage: undefined, messageId: 'm1' });

const t0 = performance.now();
const out = processUIMessageStream({
  stream,
  runUpdateMessageJob: (job) => job({ state, write: () => {} }),
  onError: (e) => { throw e; },
});
const reader = out.getReader();
while ((await reader.read()).done === false) {}
console.log(`elapsed: ${(performance.now() - t0).toFixed(0)} ms`);
// On ai@6.0.168: ~25 s wall (quadratic at N=10000)
// Expected:      ~1 s wall (linear)
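
One way to confirm that the reproducer scales quadratically rather than just slowly is to run it at two values of `N` and estimate the growth exponent from the two wall times. The helper below is a sketch; `scalingExponent` is a hypothetical name, and the inputs are whatever timings you measure yourself.

```typescript
// Hypothetical helper: estimates p in "time ~ N^p" from two measurements.
// n1, n2 are the delta counts; t1, t2 are the measured wall times (same unit).
function scalingExponent(n1: number, t1: number, n2: number, t2: number): number {
  if (n1 <= 0 || n2 <= 0 || t1 <= 0 || t2 <= 0 || n1 === n2) {
    throw new Error('need two distinct positive sizes and positive timings');
  }
  // time ~ c * N^p  =>  log(t2 / t1) = p * log(n2 / n1)
  return Math.log(t2 / t1) / Math.log(n2 / n1);
}
```

An exponent close to 2 points to quadratic behavior, while a value close to 1 indicates linear behavior. Using sizes that differ by at least a factor of two helps keep timer noise from dominating the estimate.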

Effect on real workloads

Observed during agentic coding sessions (opencode, MiniMax-M2.7 model, SWE-bench-Pro):

| Instance | Wall time | Notes |
| --- | --- | --- |
| flipt-21a935 (47 steps) | 1666 s | per-step 35.4 s |
| vuls-7e91f5 | 7200 s HARD TIMEOUT | main JS thread at 100% CPU in libc memmove |

CPU profile of the hung process (`perf record -F 99 -g -p <bun_pid>` for 60 s):

```
30.27% bun libc.so.6 [.] 0x16414c ← __memmove_avx_unaligned_erms (allocator memcpy)
+ 7× HeapHelper GC threads at 4.0-4.3% CPU each (JSC GC pinned)
+ main bun thread at 85% CPU state R, never recovering
```

This pattern is fully consistent with a quadratic string-concatenation in a hot streaming path.

Sites

  • `packages/ai/src/ui/process-ui-message-stream.ts`:
    • `case 'text-delta'`: `textPart.text += chunk.delta;`
    • `case 'reasoning-delta'`: `reasoningPart.text += chunk.delta;`
  • `packages/ai/src/generate-text/stream-text.ts` (in `DefaultStreamTextResult`):
    • `text-delta` branch: `activeText.text += part.text;`
    • `reasoning-delta` branch: `activeReasoning.text += part.text;`

A fifth site (`partialToolCall.text += chunk.inputTextDelta` in `tool-input-delta`) has the same issue, but the next line reads the cumulative text for partial-JSON parsing on every delta, so it needs a different approach (incremental partial-JSON parser). Leaving that for a follow-up.
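
To illustrate what "incremental" means for that follow-up, the sketch below keeps a small amount of scanner state across deltas so that each character is examined once, instead of re-reading the cumulative text on every delta. It is only a structural scanner (nesting depth and string state), not a partial-JSON parser that produces values, and `IncrementalJsonScanner` is a hypothetical name that does not exist in the `ai` package.

```typescript
// Hypothetical illustration: tracks JSON nesting across deltas.
// Each delta is scanned exactly once; no cumulative string is kept.
class IncrementalJsonScanner {
  private depth = 0; // current object/array nesting depth
  private inString = false; // inside a JSON string literal
  private escaped = false; // previous character was a backslash inside a string
  private sawContainer = false; // at least one '{' or '[' has been seen

  // Feeds one delta. Returns true once the top-level object or array has closed.
  push(delta: string): boolean {
    for (let i = 0; i < delta.length; i++) {
      const ch = delta[i];
      if (this.inString) {
        if (this.escaped) {
          this.escaped = false; // escaped character is consumed as-is
        } else if (ch === '\\') {
          this.escaped = true;
        } else if (ch === '"') {
          this.inString = false;
        }
      } else if (ch === '"') {
        this.inString = true;
      } else if (ch === '{' || ch === '[') {
        this.depth++;
        this.sawContainer = true;
      } else if (ch === '}' || ch === ']') {
        this.depth--;
      }
    }
    return this.sawContainer && this.depth === 0 && !this.inString;
  }
}
```

A real replacement for the partial-JSON parse would have to carry more state than this (partially built values, pending keys, number and literal tokens), but the cost profile is the point: work per delta is proportional to the delta length, not to the cumulative input.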

Proposed fix

PR #15669 replaces the `+=` accumulation with a chunks array and a lazy `.text` getter. This preserves the public API (consumers can still read `.text` mid-stream for progressive UI rendering) while keeping the accumulator O(N) total.
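
The chunks-array idea can be sketched as follows. This is a minimal illustration under stated assumptions, not the code from PR #15669: `LazyTextPart` and `append` are hypothetical names, and the real parts in the `ai` package carry more fields than `text`.

```typescript
// Hypothetical sketch of a chunks array with a lazy `.text` getter.
class LazyTextPart {
  private chunks: string[] = [];
  private cached: string | undefined = '';

  // Appending only pushes onto the array; no string concatenation happens here.
  append(delta: string): void {
    this.chunks.push(delta);
    this.cached = undefined; // invalidate the joined view
  }

  // The joined string is built only when someone actually reads `.text`.
  get text(): string {
    if (this.cached === undefined) {
      this.cached = this.chunks.join('');
      // Collapse to a single chunk so the next join starts from the flat
      // string instead of re-walking every earlier delta.
      this.chunks = [this.cached];
    }
    return this.cached;
  }
}

const part = new LazyTextPart();
part.append('Hello, ');
part.append('world');
console.log(part.text); // Hello, world
```

In this sketch the write path stays cheap regardless of how often readers look at the text; a reader that asks for `.text` after new deltas pays for one join at that moment, and repeated reads with no new deltas return the cached string.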

🤖 Generated with Claude Code
