| PR | Title | Risk | Notes |
|---|---|---|---|
| #8524 | feat(agent-manager): add GitHub PR status badge to worktree sidebar | None | Pure UI in extension layer, no session/SSE interaction |
| #8525 | fix(vscode): retry transient fetch failures in session.command and promptAsync | Low | Wraps read-only SDK calls with bounded retry (3 attempts). Does NOT touch promptAsync/session.command despite title. Minor: retry delay not abortable (~1s max) |
| #8431 | fix: remove local storage from ignored folder | None | Changes file ignore patterns only, no session/lock interaction |
| #8445 | perf(snapshot): add mutex lock, incremental add, and batched revert | Medium | ⚠️ TOP SUSPECT. Adds `Lock.write(git)` to ALL snapshot functions. `diffFullUncached` can hold the write lock for extended periods while streaming per-file git processes, blocking all other snapshot operations (`track()`, `patch()`) on the same worktree. Could cause sessions to appear stuck after tool calls complete (waiting for `track()` blocked behind a long `diffFull`). |
| #8479 | fix(core): make follow-up execution aware of the saved plan file | None | Pure prompt text change, no control flow modification |
| #8508 | fix(vscode): scope cycleAgentMode keybinding to Kilo Code panels | None | Keybinding scoping only |
| #8506 | feat(vscode): pre-release publishing support | None | CI/CD change only, no runtime code |
| #8496 | fix: local review git | None | Prompt text + one bounded read-only git command with `.nothrow()` |
| #8480 | feat(vscode): reimplement task timeline graph header | None | Pure UI visualization, read-only session data |
| #8484 | feature: glm/kimi/qwen reasoning support | None | Declarative config mapping, no stream handling changes |
| #8190 | fix(agent-manager): suppress interactive prompts during background git fetch | None | Actually reduces hang risk by using `GIT_TERMINAL_PROMPT=0` and `BatchMode=yes` |
| #8464 | fix(cli): update simple-git to fix critical RCE | None | Security hardening only; CLI uses raw git commands, not simple-git for core ops |
| #8465 | fix(cli): update hono to fix auth bypass and server vulnerabilities | Low | Hono SSE injection fix touched streaming internals. Canonical usage patterns (writeSSE/onAbort) are stable, but hono serves the SSE endpoints |
| #8466 | fix(cli): update minimatch, @modelcontextprotocol/sdk, and @aws-sdk | None | MCP SDK 1.24.3 actually fixed a hanging connection bug (`body.cancel()` → `text()`) |
| #8467 | fix: add safe overrides for transitive dependency vulnerabilities | None | Semver-compatible overrides, security fixes only |
| #8426 | fix(vscode): mode picker sync | None | UI-layer mode picker synchronization, proper error recovery |
| #8417 | fix(vscode): question recovery + sub-agent permission propagation | Low | Designed to FIX hangs. Adds question recovery on SSE reconnect and auto-adoption of child sessions. Minor: `fetchAndSendPendingQuestions` is `await`ed during SSE reconnection handler; if backend is unresponsive, could delay reconnection completion |
| #8386 | fix(vscode): recover missed child-session prompts | None | Explicitly fixes hanging sessions from missed child prompts. Defensive error handling (try/catch, fire-and-forget) |
| #8400 | fix(cli): cache diffFull and ignore legacy local storage | None | Promise-based caching reduces lock contention. Correct error eviction. |
| #8367 | Session migration improvements | None | VS Code extension migration wizard only, not runtime sessions |
| #8230 | feat(vscode): add Claude Code compatibility toggle | None | Env var at spawn time, no async flow changes |
| Plan mode commits (6 commits) | Permission propagation to sub-agents | Low | Sub-agents inherit restrictive permissions. `edit: "ask"` creates blocking permission requests that surface to UI. Theoretical: deeply nested sub-agent permission request might be less visible to user, making parent appear hung |
| #8218 | feat(cli): add org support for /kiloclaw command | None | Single parallel HTTP request with `.catch(() => null)` guard |
| CLI fix commits (5 commits) | Guard prompt injection, review scope, health logging, Docker MCP --rm, variant config | None | Prompt text changes, log filtering, Docker cleanup fix |
| UI/stability commits | Dialog escape, popover stability, JetBrains plugin | None | UI-only or isolated new package |
| Docs PRs (10 PRs) | Various documentation updates | None | No runtime code changes |
| Security PRs (#8468, #8469) | diff, dompurify, yaml, solid-js, vite, electron | None | Patch-level dependency updates, build-time tools |
| #8192, #8211, #8258 | Agent session PRs (User-Agent header, docs link, health logging) | None | Header construction, static UI, log filtering |
Summary
Users report "hanging sessions" in both the CLI and VS Code: the session appears idle, yet sending a message has no effect. This issue tracks a systematic investigation of all PRs merged between v7.1.20 and v7.2.0 to identify potential causes.
Investigation Methodology
Each of the 39 PRs merged between v7.1.20 and v7.2.0 was analyzed against the following known hanging risk vectors in the codebase:
- Queued callbacks in `prompt.ts` never being resolved: if the session loop exits without finding an assistant message and without aborting, queued promise callbacks are never resolved.
- Snapshot operations acquiring `Lock.write(git)`, serializing concurrent git operations.
- A parent session awaiting `SessionPrompt.prompt()` for a child; if the child hangs, the parent hangs.
- `await stream.writeSSE()` blocks if the client can't consume fast enough.
- An `AbortController` that is never aborted on cleanup, so retry sleeps and LLM streams run indefinitely.

PR-by-PR Analysis
Conclusion: Likely Suspects
Primary Suspect: PR #8445, snapshot mutex lock (Medium probability)

This PR adds `Lock.write(git)` to every public function in the `Snapshot` namespace. While this correctly prevents concurrent git corruption, it introduces significant lock contention that could cause perceived hanging:

- `diffFullUncached()` holds the write lock while streaming git output line-by-line and spawning up to 4 additional git processes per file. For large diffs, this could hold the lock for seconds to minutes.
- While the lock is held, every other snapshot operation on the same worktree is blocked: `track()` (called at every step-start/step-finish), `patch()`, `revert()`, `diff()`, `cleanup()`.
- The session relies on `track()` to checkpoint; if `track()` is blocked behind a long `diffFullUncached()`, the session appears stuck between steps.

Recommended investigation: Add timing instrumentation to `Lock.write(git)` acquisitions and log when wait time exceeds 5 seconds. Check if `diffFullUncached` is the primary lock holder during perceived hangs.

Secondary Suspects
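The timing instrumentation recommended for the primary suspect (PR #8445) could be sketched roughly as follows. This is a minimal illustration, not the project's code: `SimpleMutex` is a stand-in for the real `Lock.write(git)`, and `withTimedLock`, `TimingOptions`, and the log message formats are hypothetical names chosen for the example. Only the 5-second threshold comes from the recommendation itself.

```typescript
// Hypothetical sketch: SimpleMutex stands in for the real write lock,
// and withTimedLock is an illustrative wrapper, not an existing API.
type Release = () => void;

class SimpleMutex {
  private tail: Promise<void> = Promise.resolve();

  // Resolves with a release function once all earlier holders have released.
  acquire(): Promise<Release> {
    let release!: Release;
    const released = new Promise<void>((resolve) => {
      release = () => resolve();
    });
    const ready = this.tail.then(() => release);
    this.tail = this.tail.then(() => released);
    return ready;
  }
}

interface TimingOptions {
  warnAfterMs: number;            // e.g. 5000, per the recommendation
  log: (message: string) => void; // where slow acquisitions are reported
  now?: () => number;             // injectable clock, defaults to Date.now
}

async function withTimedLock<T>(
  mutex: SimpleMutex,
  label: string,
  fn: () => Promise<T>,
  opts: TimingOptions,
): Promise<T> {
  const now = opts.now ?? (() => Date.now());
  const requestedAt = now();
  const release = await mutex.acquire();
  const acquiredAt = now();

  // Time spent queued behind other holders.
  const waitedMs = acquiredAt - requestedAt;
  if (waitedMs > opts.warnAfterMs) {
    opts.log(`${label} waited ${waitedMs}ms for the lock`);
  }

  try {
    return await fn();
  } finally {
    // Time spent holding the lock, which identifies the long holder.
    const heldMs = now() - acquiredAt;
    if (heldMs > opts.warnAfterMs) {
      opts.log(`${label} held the lock for ${heldMs}ms`);
    }
    release();
  }
}
```

Logging both the wait time and the hold time, each tagged with the caller's label, would make it possible to see whether long `track()` waits line up with long `diffFullUncached` holds.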
Hono update (#8465): Low probability, but hono serves the SSE endpoints. The SSE injection fix in hono 4.12 touched streaming internals. If edge cases in SSE stream termination/cleanup changed, it could affect how clients perceive connection state.
Plan mode permission propagation: Low probability. Sub-agents that inherit restrictive `edit: "ask"` permissions create blocking permission requests. In deeply nested delegation chains, the permission UI might not be prominently visible, making the session appear hung while actually waiting for user input.

Question recovery on SSE reconnect (#8417): Low probability. `fetchAndSendPendingQuestions` is `await`ed in the SSE "connected" handler. If the backend is slow, this delays SSE reconnection completion. However, SDK calls have timeouts.

Pre-existing Risk (Not introduced in this range)
The session prompt loop in `prompt.ts` has a known risk where queued callbacks (lines 304-309, 806-816) might never be resolved if the loop exits without finding an assistant message and without the abort signal firing. This is a pre-existing architectural issue that could manifest as permanent hangs regardless of these PRs.
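To make the failure mode concrete, here is a minimal sketch of one possible mitigation: settling every queued callback on every exit path of the loop. It does not reproduce the actual code in `prompt.ts`. `PromptQueue`, `PendingCallback`, `wait`, and `run` are hypothetical names, and the real loop is assumed to be considerably more involved.

```typescript
// Hypothetical sketch: none of these names come from prompt.ts.
interface PendingCallback<T> {
  resolve: (value: T) => void;
  reject: (reason: Error) => void;
}

class PromptQueue<T> {
  private callbacks: PendingCallback<T>[] = [];

  // A caller that arrives while the loop is busy parks its promise here.
  wait(): Promise<T> {
    return new Promise<T>((resolve, reject) => {
      this.callbacks.push({ resolve, reject });
    });
  }

  // Runs one loop iteration. `step` returns the assistant message,
  // or undefined if the loop ended without finding one.
  async run(step: () => Promise<T | undefined>): Promise<void> {
    let result: T | undefined;
    try {
      result = await step();
    } finally {
      // Settle queued callbacks whether step returned a message,
      // returned undefined, or threw, so that no waiter is left pending.
      const queued = this.callbacks;
      this.callbacks = [];
      for (const cb of queued) {
        if (result !== undefined) {
          cb.resolve(result);
        } else {
          cb.reject(new Error("session loop exited without an assistant message"));
        }
      }
    }
  }
}
```

The design choice here is to reject rather than leave waiters pending, on the assumption that a visible error is easier to diagnose than a session that silently never responds.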