Status: research mode (no pushing). Last updated: 2025-12-17
- Task: "LSP Diagnostics Display Fix" marked complete; PR open (`fix/lsp-diagnostics-display`, PR #186).
- Related issue: #185 (broader NeXTSTEP tool panel renderer audit).
- Hypothesis to validate: the agent is not using LSP feedback correctly (even if diagnostics are being produced).
Note: at the time this doc was written, the working tree on master did not contain the files described in the PR (e.g. `src/tunacode/ui/renderers/tools/diagnostics.py`), so this document maps what's on disk now plus the likely integration points for the PR branch.
Tunacode’s LSP integration currently works by appending diagnostics to tool output after successful file writes/edits.
`src/tunacode/tools/update_file.py` and `src/tunacode/tools/write_file.py` are decorated with `@file_tool(writes=True)`.

- `src/tunacode/tools/decorators.py:file_tool(..., writes=True)` runs the wrapped tool, then calls `_get_lsp_diagnostics(filepath)` and appends its formatted output to the returned string.
- `_get_lsp_diagnostics()` calls:
  - `src/tunacode/lsp/__init__.py:get_diagnostics()` → orchestrates server selection + client lifecycle
  - `src/tunacode/lsp/client.py:LSPClient.get_diagnostics()` → `didOpen`, then waits for `publishDiagnostics`
  - `src/tunacode/lsp/__init__.py:format_diagnostics()` → formats as `<file_diagnostics>…</file_diagnostics>` (XML-ish wrapper, plain text lines inside)
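The shape of that flow can be sketched as follows. The function and decorator names mirror the ones above, but the bodies are illustrative stand-ins, not the real implementations in `decorators.py`:

```python
import functools


def format_diagnostics(diagnostics: list[str]) -> str:
    """Illustrative stand-in: wrap diagnostic lines in the XML-ish marker."""
    if not diagnostics:
        return ""
    body = "\n".join(diagnostics)
    return f"\n<file_diagnostics>\n{body}\n</file_diagnostics>"


def _get_lsp_diagnostics(filepath: str) -> str:
    """Stand-in for the real LSP round-trip (didOpen -> publishDiagnostics)."""
    return format_diagnostics(["Error (line 1): example diagnostic"])


def file_tool(writes: bool = False):
    """Sketch of the decorator: run the tool, then append diagnostics on writes."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(filepath: str, *args, **kwargs):
            result = fn(filepath, *args, **kwargs)
            if writes:
                result += _get_lsp_diagnostics(filepath)
            return result
        return wrapper
    return decorator


@file_tool(writes=True)
def update_file(filepath: str) -> str:
    return f"File '{filepath}' updated"


result = update_file("example.py")
# result begins with the tool output and ends with the <file_diagnostics> block
```

The key property is that diagnostics are a plain string suffix: everything downstream (message history, UI display) sees one opaque blob.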
The only intentional “agent feedback loop” is: diagnostics are part of the tool return string, so they become part of the next model request.
Concretely:
- Tool returns are stored as `tool-return` parts inside the next `node.request`.
- `src/tunacode/core/agents/agent_components/node_processor.py` appends `node.request` into `state_manager.session.messages`, so the persisted message history includes those tool returns.
- The next model call receives the tool-return parts as part of the conversation context (pydantic-ai's normal tool loop).
There is no additional core logic today that:
- parses `<file_diagnostics>` into structured data,
- promotes diagnostics to a system message,
- blocks completion until diagnostics are addressed,
- or otherwise “forces” the agent to act on diagnostics.
The UI displays tool results via `tool_result_callback`:

- `src/tunacode/ui/app.py:build_tool_result_callback()` posts `ToolResultDisplay(tool_name, status, args, result, …)`.
- In `src/tunacode/core/agents/agent_components/node_processor.py`, tool results are displayed by iterating `tool-return` parts in `node.request` and calling `tool_result_callback(...)`.
Important limitation: in that display path, `args={}` is passed because tool-return parts do not carry the original args in this implementation. That means:

- UI renderers that rely on args (e.g., `filepath`) will often fall back to `"unknown"`.
- Any "zone renderer" that wants to tie diagnostics to a file path must recover it from the tool return text (diff headers, embedded "File '…' updated" strings, etc.), or the architecture must be changed to persist args alongside tool returns.
These are ranked by likelihood / leverage.
`src/tunacode/lsp/servers.py:get_server_command()` returns `None` if the server binary is missing (`shutil.which` check). If that happens:

- `get_diagnostics()` returns `[]`
- `format_diagnostics()` returns `""`
- nothing is appended to the tool output
- the model gets no LSP feedback at all
Quick check:

```
which pyright-langserver
which pylsp
```
Observed (this machine, at time of writing):

- `pyright-langserver`: missing
- `pylsp`: missing
- `rust-analyzer`: present
On master, `src/tunacode/ui/renderers/tools/update_file.py` renders only the diff + metadata. It does not parse or show `<file_diagnostics>`.
If diagnostics are appended after the diff, they may exist in raw tool output but be effectively hidden in the TUI.
Current `format_diagnostics()` includes the raw `diag.message` and does not:

- truncate to the first line,
- clamp to a max character length,
- add a summary line,
- or limit by `settings.lsp.max_diagnostics` (it currently hard-limits to 20 inside `format_diagnostics`, ignoring config).
Failure mode: the diagnostic payload becomes large and/or repetitive; the model deprioritizes it compared to surrounding context (diff, instructions, etc.).
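A hypothetical compact formatter showing the mitigations listed above (first-line truncation, per-message clamping, a summary line, and a configurable limit in place of the hard-coded 20):

```python
def format_diagnostics_compact(messages: list[str],
                               max_diagnostics: int = 10,
                               max_chars: int = 200) -> str:
    """Sketch of a denser diagnostic format; not the current implementation."""
    if not messages:
        return ""
    shown = []
    for msg in messages[:max_diagnostics]:
        # Keep only the first line of each diagnostic, clamped in length,
        # dropping pyright's nested type explanations.
        first_line = msg.splitlines()[0]
        shown.append(first_line[:max_chars])
    summary = f"{len(messages)} diagnostic(s), showing {len(shown)}"
    body = "\n".join([summary, *shown])
    return f"<file_diagnostics>\n{body}\n</file_diagnostics>"
```

The summary line gives the model a cheap signal of total error volume even when the list is clipped.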
Potential protocol issues to validate:

- `LSPClient.open_file()` always sends `textDocument/didOpen` with `version=1` and never sends `didChange` or `didClose`.
- Some servers may ignore repeated `didOpen` for an already-open URI, or rely on `didChange` for updates.
- The orchestrator roots the server at `root=path.parent`; that can prevent pyright from finding project-level config (e.g. `pyproject.toml`) depending on repository layout.
If diagnostics are inconsistent, the agent can’t rely on them.
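If the `root=path.parent` issue is confirmed, one possible fix is walking up to a project marker instead; the marker set below is an assumption, not the project's actual convention:

```python
from pathlib import Path

# Assumed project-root markers; adjust to the repo's real conventions.
MARKERS = ("pyproject.toml", "setup.cfg", ".git")


def find_project_root(path: Path) -> Path:
    """Walk upward from the edited file looking for a project marker,
    so the LSP server sees project-level config like pyproject.toml."""
    for candidate in [path.parent, *path.parent.parents]:
        if any((candidate / m).exists() for m in MARKERS):
            return candidate
    return path.parent  # fall back to the current behavior
```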
Tool prompts are loaded from `src/tunacode/tools/prompts/*_prompt.xml` and assigned to tool functions via `decorators.base_tool()`.
At time of writing, at least:

- `src/tunacode/tools/prompts/update_file_prompt.xml` describes args that do not match `update_file(filepath, target, patch)`.
- `src/tunacode/tools/prompts/write_file_prompt.xml` claims overwrite behavior that contradicts `write_file` (it fails if the file exists).
Even if unrelated to diagnostics directly, prompt mismatch increases the chance the model does the wrong thing after receiving diagnostics (or fails to interpret tool results correctly).
`src/tunacode/tools/decorators.py:_get_lsp_config()` reads `DEFAULT_USER_CONFIG` rather than the active `state_manager.session.user_config`.
Implications:
- User-level toggles may not apply.
- Tuning knobs like `max_diagnostics` aren't actually controlling output.
This can create a gap between “what we think we configured” and what’s actually happening at runtime.
- Confirm whether diagnostics are appended at all:
  - Trigger a known type error via `update_file` on a `.py` file.
  - Inspect the raw tool return string for `<file_diagnostics>`.
- Confirm whether the model is receiving them:
  - Add temporary logging around tool-return parts in `node_processor.py` (or inspect `state_manager.session.messages`) to verify diagnostics survive into the next request.
- Confirm LSP server availability + stability:
  - Ensure `pyright-langserver` or `pylsp` exists, and validate that repeated edits still yield fresh diagnostics.
- Decide on a "forcing function" if needed:
  - If we want the agent to always react to diagnostics, we likely need core logic that detects `<file_diagnostics>` in the latest tool return and injects a short system-level instruction like: "Fix listed diagnostics before proceeding."
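That forcing function could look roughly like the following; the message shape is illustrative, not the real pydantic-ai message types:

```python
import re

DIAG_RE = re.compile(r"<file_diagnostics>(.*?)</file_diagnostics>", re.DOTALL)


def forcing_messages(last_tool_return: str) -> list[dict]:
    """Sketch: if the latest tool return carries non-empty diagnostics,
    emit a short system-level instruction to inject into the next request."""
    match = DIAG_RE.search(last_tool_return)
    if not match or not match.group(1).strip():
        return []  # no diagnostics -> nothing to inject
    return [{"role": "system",
             "content": "Fix listed diagnostics before proceeding."}]
```

Keeping the injected instruction short matters: the diagnostics themselves are already in context, so the system message only needs to raise their priority.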
- LSP injection: `src/tunacode/tools/decorators.py`
- Orchestrator: `src/tunacode/lsp/__init__.py`
- Client protocol: `src/tunacode/lsp/client.py`
- Server mapping: `src/tunacode/lsp/servers.py`
- Tool loop + UI display: `src/tunacode/core/agents/agent_components/node_processor.py`, `src/tunacode/ui/app.py`
- Context pruning: `src/tunacode/core/compaction.py`
- Tool docs/prompts: `src/tunacode/tools/prompts/update_file_prompt.xml`, `src/tunacode/tools/prompts/write_file_prompt.xml`
Contrary to the earlier check, `pyright-langserver` WAS present and working:
- UI status bar showed "LSP: pyright"
- Diagnostics were being produced and displayed
The `update_file` renderer shows diagnostics as raw text at the bottom of the diff panel:

```
<file_diagnostics>
Error (line 160): No overloads for "push_screen" match the provided arguments
Error (line 160): Argument of type "(completed: bool) -> None" cannot be assigned...
</file_diagnostics>
```

Diagnostics are not styled or parsed; they appear as plain XML text within the tool output. But they ARE visible.
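If we ever want styled rendering, a hypothetical parser for the observed line format (`Error (line N): message`) could look like this; nested continuation lines from verbose pyright output would simply be dropped:

```python
import re

# Matches the observed top-level diagnostic lines, e.g.
#   Error (line 160): No overloads for "push_screen" match ...
LINE_RE = re.compile(r"^(?P<severity>\w+) \(line (?P<line>\d+)\): (?P<msg>.+)$")


def parse_diagnostics(block: str) -> list[dict]:
    """Parse a <file_diagnostics> block into structured records a
    renderer could style (severity badge, line-number link, message)."""
    inner = block.strip()
    inner = inner.removeprefix("<file_diagnostics>").removesuffix("</file_diagnostics>")
    records = []
    for line in inner.strip().splitlines():
        m = LINE_RE.match(line.strip())
        if m:
            records.append({"severity": m["severity"],
                            "line": int(m["line"]),
                            "message": m["msg"]})
    return records
```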
Observed pyright errors include full type explanations:

```
Error (line 160): Argument of type "(completed: bool) -> None" cannot be assigned to parameter "callback" of type "ScreenResultCallbackType[ScreenResultType@push_screen] | None" in function "push_screen"
  Type "(completed: bool) -> None" is not assignable to type "((bool | None) -> None) | ((bool | None) -> Awaitable[None]) | None"
    Type "(completed: bool) -> None" is not assignable to type "(bool | None) -> None"
      Parameter 1: type "bool | None" is incompatible with type "bool"
        Type "bool | None" is not assignable to type "bool"
          "None" is not assignable to type "bool"
```
This verbosity likely causes the model to deprioritize or skim the diagnostic content.
Traced the full flow:

- `update_file` returns diff string
- `@file_tool(writes=True)` decorator appends `<file_diagnostics>` block via `_get_lsp_diagnostics()`
- Combined string stored in `tool-return` part content
- LLM receives the full string in the next request via pydantic-ai's tool loop
- UI callback also receives the same string for display
The agent DOES receive all LSP diagnostics. The full verbose output goes into the context window. Whether it acts on them is a model behavior issue, not architecture.
| Tool | Type | Output Style |
|---|---|---|
| Pyright | Type checker | Verbose nested type explanations |
| Ruff | Linter | Clean, actionable lint errors |
Running `ruff check` on the same file with pyright type errors:

```
All checks passed!
```
Ruff doesn't do type checking - it only does linting. The pyright errors were type signature mismatches, not lint issues.
Rationale:
- Ruff is the modern standard for Python tooling
- Cleaner, less verbose output
- Focuses on actionable issues the agent can fix
- Type errors that don't break runtime are lower priority than clear feedback
Change made: `src/tunacode/lsp/servers.py`. Python files now use `ruff server --stdio` only (removed the pyright and pylsp fallbacks).

```python
".py": (
    "python",
    [
        ["ruff", "server", "--stdio"],
    ],
),
```

- Will agents act on ruff feedback better than pyright? Needs testing.
- Should we add type checking back as optional? Maybe as a user config toggle.
- UI styling for diagnostics: lower priority, but would improve UX.