perf(ui): render streamed markdown incrementally #11038
Merged
Code Review Summary
Status: 2 Issues Found | Recommendation: Address before merge
Issue Details: WARNING, SUGGESTION
Resolved Issues (fixed in new commits)
Files Reviewed (7 files)
Reviewed by claude-4.6-sonnet-20260217 · 588,104 tokens · Review guidance: REVIEW.md from base branch
imanolmzd-svg approved these changes on Jun 9, 2026
For long streamed responses, every visible update parsed, sanitized, rebuilt, and diffed the entire accumulated Markdown document. The cost grew with response length even though earlier top-level blocks were already complete and could not change.
This change splits streaming Markdown into stable completed blocks and one mutable tail. Completed blocks retain stable cache keys and mounted DOM nodes, while only the active tail is reparsed and replaced. Reference-style Markdown and unfinished fenced code retain the conservative existing path, and completion still performs the canonical full render. Sanitization, syntax highlighting, Mermaid rendering, copy controls, and streaming cadence remain unchanged.
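The split described above can be sketched roughly as follows. This is a minimal illustration, not the PR's implementation: `StreamBlocks` and `splitStreamingMarkdown` are hypothetical names, and treating every blank line outside a fence as a block boundary is a simplifying assumption (it ignores loose lists and indented code, which a real splitter would need to handle).

```typescript
// Hypothetical types and names for illustration; not taken from the PR.
interface StreamBlocks {
  stable: string[];  // completed top-level blocks, safe to cache by key
  tail: string;      // still-growing text, reparsed on every update
  fallback: boolean; // true when the conservative full-render path applies
}

const FENCE = /^ {0,3}(`{3,}|~{3,})/;
const REFERENCE_DEF = /^ {0,3}\[[^\]]+\]:\s*\S/m;

function splitStreamingMarkdown(source: string): StreamBlocks {
  // A reference definition can change how earlier blocks render,
  // so the whole document stays on the full-render path.
  if (REFERENCE_DEF.test(source)) {
    return { stable: [], tail: source, fallback: true };
  }

  const stable: string[] = [];
  let current: string[] = [];
  let fence: string | null = null; // opening fence marker while inside a code block

  for (const line of source.split("\n")) {
    const m = FENCE.exec(line);
    if (fence === null) {
      if (m) fence = m[1];
    } else if (
      m &&
      m[1][0] === fence[0] &&
      m[1].length >= fence.length &&
      line.trim() === m[1]
    ) {
      fence = null; // matching closing fence
    }

    // Assumption: a blank line outside a fence ends the current block.
    if (fence === null && line.trim() === "") {
      if (current.length > 0) {
        stable.push(current.join("\n"));
        current = [];
      }
      continue;
    }
    current.push(line);
  }

  // An unfinished fence means block boundaries are not yet trustworthy.
  if (fence !== null) {
    return { stable: [], tail: source, fallback: true };
  }
  return { stable, tail: current.join("\n"), fallback: false };
}
```

For the input `"# Title\n\nFirst paragraph.\n\nSecond para"`, this sketch yields two stable blocks and the tail `"Second para"`. A caller would render only the tail on each update and fall back to the existing full render whenever `fallback` is true.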
The DOM implementation and block splitting live in Kilo-owned files. Shared upstream files contain only the narrow hooks needed to return block metadata and invoke the incremental renderer, with `kilocode_change` annotations on every changed line.
Why this is worth doing
Markdown rendering becomes more important as responses and sessions grow. With the original renderer, every streamed update revisits the full accumulated response, so work increases throughout a long answer even though almost all earlier blocks are already final. In a large session, that repeated parsing, sanitization, temporary DOM creation, and reconciliation also competes with the retained transcript, tool cards, timeline, scrolling, and background session updates on the same renderer thread.
This optimization changes the growing part of that workload from the full response to the active Markdown tail. The benefit therefore compounds where it matters most: long answers, documentation-heavy work, code reviews, plans, and mature sessions with substantial transcript DOM. It is not a substitute for future transcript virtualization, but it removes avoidable per-update work before that retained-DOM cost is addressed. The reduction in transient nodes is particularly valuable because it lowers allocation and garbage-collection pressure rather than only moving CPU work elsewhere.
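To make the scaling argument concrete, here is a toy cost model that counts how many characters each strategy hands to the parser over a stream. It is only a sketch: the function names are hypothetical, the input is synthetic rather than the benchmark workload, and it assumes a block is final once a blank line follows it.

```typescript
// Toy cost model (hypothetical helpers): counts characters reprocessed,
// ignoring sanitization, DOM work, and everything else a real renderer does.

function chunkText(text: string, size: number): string[] {
  const chunks: string[] = [];
  for (let i = 0; i < text.length; i += size) {
    chunks.push(text.slice(i, i + size));
  }
  return chunks;
}

// Original behavior: every update reprocesses the whole accumulated document.
function fullReparseCost(chunks: string[]): number {
  let doc = "";
  let cost = 0;
  for (const chunk of chunks) {
    doc += chunk;
    cost += doc.length;
  }
  return cost;
}

// Incremental behavior: every update reprocesses only the active tail.
function tailOnlyCost(chunks: string[]): number {
  let tail = "";
  let cost = 0;
  for (const chunk of chunks) {
    tail += chunk;
    cost += tail.length;
    // Assumption: text before the last blank line is complete and is
    // dropped from the tail after this update.
    const cut = tail.lastIndexOf("\n\n");
    if (cut !== -1) tail = tail.slice(cut + 2);
  }
  return cost;
}

// 100 synthetic paragraphs of 50 characters each, streamed 10 characters at a time.
const doc = ("x".repeat(48) + "\n\n").repeat(100);
const chunks = chunkText(doc, 10);
console.log(fullReparseCost(chunks)); // 1252500
console.log(tailOnlyCost(chunks));    // 15000
```

In this synthetic case the full-reparse total grows quadratically with the number of chunks, while the tail-only total grows linearly with the number of blocks. Real savings depend on block sizes and on the parts of the pipeline this model leaves out.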
Profile results
The benchmark replayed the same Agent Manager workload for each implementation: 13,531 Markdown characters in 677 foreground chunks, two concurrent background sessions, and 1,353 total part updates. Every run settled to the same 540-element Markdown DOM and canonical completed HTML.
The final incremental implementation was profiled twice. Chromium script duration measured 462.7 ms and 473.4 ms, while HTML parsing measured 14.7 ms and 14.3 ms. The reported final values are the medians of those two confirmation runs.