session/processor: text/reasoning delta accumulation is O(N²); long agent loops hang at 50-80 turns #30067

@meitalbensinai

Description

When I ran opencode in headless batch mode (opencode run --format json) against MiniMax-M2.7 on a long-horizon agent loop, runs that went past ~50 turns started slowing down dramatically per step, eventually hanging at the 2-hour wall-clock cap I had set. I expected per-step time to stay roughly flat, but it was growing: 6s → 30s → 60s → 100s+ per step as the session got longer.

When I attached perf record to the running bun process I saw something weird:

  • __memmove_avx_unaligned_erms in libc.so.6 was 30.27% of CPU samples
  • All 7 JSC HeapHelper GC threads were pinned at ~4% CPU each (normally idle)
  • Main JS thread at 85% CPU, state R, but no syscalls — pure userspace burn
  • All Bun Pool and HTTP Client threads idle at 0%

That __memmove pattern is the giveaway for V8/JSC rope-flattening on str += chunk. I tracked it down to two spots in packages/opencode/src/session/processor.ts:

  • reasoning-delta case: ctx.reasoningMap[value.id].text += value.text
  • text-delta case: ctx.currentText.text += value.text

The issue is that anything reading .text between writes (the UI render loop, the NDJSON serializer, the bus event broadcaster) forces V8/JSC to flatten the rope. Then the next += re-copies all the prior text — O(N²) in cumulative length per part. For thinking-mode models emitting 1500+ small reasoning tokens per turn, this destroys the agent loop after enough turns accumulate.
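To put rough numbers on the paragraph above, here is a minimal cost-model sketch. It is not a measurement and not opencode code: it assumes that every read following an append flattens the string and copies its full current length. The function name `copiedChars` and the figure of 1500 deltas of 4 characters each are illustrative assumptions.

```typescript
// Simplified model (assumption): a read that follows an append flattens the
// accumulated string, copying every character gathered so far.
function copiedChars(deltaLengths: number[], readEvery: number): number {
  let total = 0;  // characters accumulated so far
  let copied = 0; // characters copied by flattening
  deltaLengths.forEach((len, i) => {
    total += len; // the append itself is treated as free (rope node)
    if ((i + 1) % readEvery === 0) {
      copied += total; // a reader touches .text, so the whole string is copied
    }
  });
  return copied;
}

// Hypothetical turn: 1500 reasoning deltas of 4 characters each.
const deltas = new Array<number>(1500).fill(4);
const readEachDelta = copiedChars(deltas, 1);               // reader after every delta
const readOncePerTurn = copiedChars(deltas, deltas.length); // single read at the end
console.log(readEachDelta, readOncePerTurn); // 4503000 6000
```

Under this model, 6000 characters of reasoning cost about 4.5 million copied characters when something reads `.text` after every delta, roughly 750 times the cost of reading once at the end of the part.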

This is the same shape as the bug fixed in @ai-sdk/mcp (vercel/ai#14619) and the one I just fixed in processUIMessageStream and DefaultStreamTextResult (vercel/ai#15669). But opencode's processor has its own copy of the pattern, so even with the AI SDK fix applied, opencode still hangs.
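One possible way to remove the pattern, as a sketch only: buffer the deltas in an array and join them when the text is actually needed. `DeltaBuffer` is a hypothetical helper, not the actual patch and not an opencode or AI SDK API. Whether the processor's part objects can be changed like this depends on the code that reads `.text` directly, which this issue does not show.

```typescript
// Hypothetical helper (not opencode's code): collects streamed deltas without
// copying the previously accumulated text on each append.
class DeltaBuffer {
  private chunks: string[] = []; // deltas received since the last read
  private flat = "";             // joined text as of the last read
  private size = 0;              // total length, tracked without joining

  append(delta: string): void {
    if (delta.length === 0) return;
    this.chunks.push(delta); // amortized O(1); prior text is untouched
    this.size += delta.length;
  }

  get length(): number {
    return this.size;
  }

  // Joins pending chunks onto the cached text. The cost is proportional to
  // the full length, so callers should read rarely (e.g. at part end).
  toString(): string {
    if (this.chunks.length > 0) {
      this.flat = this.flat + this.chunks.join("");
      this.chunks = [];
    }
    return this.flat;
  }
}

// Usage: append per delta, read once when the part is complete.
const reasoning = new DeltaBuffer();
for (const delta of ["Let ", "me ", "think."]) reasoning.append(delta);
const finalText = reasoning.toString();
console.log(finalText, reasoning.length); // Let me think. 13
```

The buffer only helps when reads are rarer than appends. If a consumer still asks for the full text after every delta, each read pays for the full length again and the total stays quadratic, so readers would also need to consume the deltas themselves or be throttled.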

Distinct from related issues I checked:

  • #25094: not the same bug. That issue is about the TUI not showing a "thinking…" indicator during long reasoning blocks (a UX/visibility issue) and is observed in interactive sessions. This issue is an algorithmic O(N²) cost in the processor that affects every mode of opencode (headless batch, TUI, server) and is independent of whether thinking is shown to the user.

Effect when patched (with the matching vercel/ai#15669 fix applied as well):

Run config   per-step (avg)   2h timeouts
unpatched    61 s             5 of 18 (28%)
patched      21 s             0 of 53

Plugins

@tabnine/tabnine-provider (Tabnine LLM gateway provider). The bug reproduces regardless of provider — same pattern hit on local vLLM, OpenRouter, and Anthropic direct.

OpenCode version

1.15.11 (also confirmed on 1.15.12 install path)

Steps to reproduce

  1. Run a long agent loop with a thinking-mode model that emits many small reasoning deltas (MiniMax M2.x, DeepSeek-V4-thinking, Kimi K2, GPT-o, Qwen3-thinking — Claude extended-thinking is less severe but also affected)
  2. Drive it through 50+ turns (SWE-bench batch eval is an easy way to hit this naturally)
  3. Watch per-step latency grow within the run
  4. Attach perf record -F 99 -g -p <bun_pid> -- sleep 60 to a hung process and you'll see __memmove_avx_unaligned_erms dominating

Operating system

Debian 12 (in container; symptom reproduces on host too)

Terminal

Headless — opencode run --format json invoked from a Python batch runner. Symptom does not require a TUI.
