feat: built-in terminal output filtering to reduce LLM token usage #11472

Draft
roomote[bot] wants to merge 1 commit into main from feature/terminal-output-filter

@roomote roomote bot commented Feb 14, 2026

Related GitHub Issue

Closes: #11459

Description

This PR attempts to address Issue #11459 by implementing built-in CLI output filtering/compression to reduce token usage. Feedback and guidance are welcome.

Key implementation details:

  • New TerminalOutputFilter module (src/integrations/terminal/TerminalOutputFilter.ts): A filter engine with pluggable, command-aware rules. Each rule matches against the command string and transforms the output into a compact summary while preserving actionable information (errors, failures).

  • Built-in filters for common commands:

    • Test runners (jest, vitest, mocha, pytest, cargo test, go test): Extract pass/fail summary + failure details, strip passing test lines
    • git status: Compact file-change summary (staged/unstaged/untracked counts)
    • git log: One-line-per-commit format
    • Package managers (npm, yarn, pnpm, pip): Strip progress bars and download noise, keep warnings + final summary
    • Build tools (tsc, cargo build, webpack): Strip progress indicators, keep errors/warnings
  • Integration in ExecuteCommandTool.ts: Filter is applied at the point where output is formatted for the LLM tool result, for both inline (small) and persisted (large/truncated) output paths. A filter indicator is appended telling the LLM that filtering occurred and how to access full output via read_command_output.

  • New setting terminalOutputFilterEnabled (default: true): Added to GlobalSettings, ExtensionState, and the Terminal Settings UI as a checkbox toggle.

  • Safety mechanisms:

    • Only filters outputs >= 5 lines (very small outputs are not worth filtering)
    • Only applies filtering when it achieves >= 20% reduction (avoids mangling output for marginal gains)
    • Full output is always preserved via OutputInterceptor/read_command_output -- filtering only affects what the LLM sees
    • First matching rule wins (rules ordered by specificity)
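The rule engine and safety checks described above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `FilterRule`, `FilterResult`, `applyFilter`, and `testRunnerRule` are hypothetical names, and measuring the reduction in characters (rather than lines or tokens) is an assumption.

```typescript
// Hypothetical types and names; the PR's actual signatures may differ.
interface FilterRule {
  name: string
  // Decides whether this rule applies to the command that was run.
  matches: (command: string) => boolean
  // Turns raw output into a compact summary.
  transform: (output: string) => string
}

interface FilterResult {
  output: string
  filtered: boolean
  ruleName?: string
}

const MIN_LINES = 5 // outputs shorter than this are passed through unchanged
const MIN_REDUCTION = 0.2 // require at least a 20% reduction (in characters, by assumption)

function applyFilter(command: string, output: string, rules: FilterRule[]): FilterResult {
  const passthrough: FilterResult = { output, filtered: false }
  if (output.split("\n").length < MIN_LINES) return passthrough

  // First matching rule wins, so rules are ordered from most to least specific.
  const rule = rules.find((r) => r.matches(command))
  if (!rule) return passthrough

  const compact = rule.transform(output)
  const reduction = 1 - compact.length / output.length
  if (reduction < MIN_REDUCTION) return passthrough

  return { output: compact, filtered: true, ruleName: rule.name }
}

// Example rule: keep only lines that look like failures or summaries.
const testRunnerRule: FilterRule = {
  name: "test-runner",
  matches: (command) => /\b(jest|vitest|mocha|pytest)\b|\b(cargo|go) test\b/.test(command),
  transform: (output) =>
    output
      .split("\n")
      .filter((line) => /^\s*(FAIL|✕|Tests?:|Test Suites:)/.test(line))
      .join("\n"),
}
```

Returning the untouched output whenever a check fails keeps the fallback path trivial: a rule that matches but saves little never alters what the LLM sees.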

Test Procedure

  • Added comprehensive test suite (src/integrations/terminal/__tests__/TerminalOutputFilter.spec.ts) with 20 test cases covering:

    • Small output bypass
    • Unrecognized command passthrough
    • Insufficient reduction passthrough
    • Jest/Vitest output with passing tests
    • Failure detail preservation
    • Various test runner command matching
    • Pytest output format
    • Cargo test output format
    • Git status (short format, clean working dir)
    • Git log compaction
    • npm install progress filtering
    • Package manager command matching
    • Build output error preservation
    • Build command matching
    • Filter indicator formatting
    • Built-in rule coverage
  • All 20 new tests pass

  • Existing OutputInterceptor tests (25) still pass

  • All monorepo lint and type-check tasks pass

Pre-Submission Checklist

  • Issue Linked: This PR is linked to an approved GitHub Issue (see "Related GitHub Issue" above).
  • Scope: My changes are focused on the linked issue (one major feature/fix per PR).
  • Self-Review: I have performed a thorough self-review of my code.
  • Testing: New and/or updated tests have been added to cover my changes.
  • Documentation Impact: No user-facing documentation updates required beyond the settings UI.
  • Contribution Guidelines: I have read and agree to the Contributor Guidelines.

Documentation Updates

  • No external documentation updates are required. The new setting is self-documented in the Terminal Settings UI with a label and description.

Additional Notes

This is Phase 1 of the implementation plan outlined in #11459. Phase 2 (user-defined custom rules, per-session toggle, more filter patterns) can follow in a future PR.

The filter only affects what the LLM sees in its context window -- the UI display and full persisted output are unchanged. Users can always toggle the feature off via the new "Smart output filtering" checkbox in Terminal Settings.
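One way to picture the separation between the full output and what the LLM sees is the sketch below. Everything in it is hypothetical: `CommandResult`, `buildResult`, the `filter` callback, and the indicator wording are invented for illustration and are not taken from the PR.

```typescript
// Hypothetical shape; not the PR's actual API.
interface CommandResult {
  fullOutput: string // shown in the UI and persisted, never modified
  llmOutput: string // placed in the tool result sent to the LLM
}

function buildResult(
  fullOutput: string,
  filterEnabled: boolean,
  // Returns a compact version, or null when no filtering should apply.
  filter: (output: string) => string | null,
): CommandResult {
  if (!filterEnabled) return { fullOutput, llmOutput: fullOutput }

  const compact = filter(fullOutput)
  if (compact === null) return { fullOutput, llmOutput: fullOutput }

  // Indicator text is an assumption; the PR only says it points at read_command_output.
  const indicator = "[Output filtered. Use read_command_output to see the full output.]"
  return { fullOutput, llmOutput: `${compact}\n\n${indicator}` }
}
```

With the setting off, or when the filter declines, both fields carry the same text, which matches the toggle behavior described above.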

Introduces a TerminalOutputFilter module with command-aware output
filtering. Built-in filters for test runners, git, package managers,
and build tools strip noise while preserving actionable information.

New setting terminalOutputFilterEnabled (default: true) with toggle
in Terminal Settings UI.

Closes #11459

roomote bot commented Feb 14, 2026

Reviewed the terminal output filter implementation. Found 2 issues in TerminalOutputFilter.ts that should be addressed before merging:

  • Jest failure pattern is dead code -- isTestFailureLine receives trimmed input, but the regex requires leading whitespace (^(\s+●\s)), so it never matches. Jest failure block headers are silently dropped from filtered output.
  • git status long format misclassifies staged files as unstaged -- The long-format parser always pushes to unstaged[] without tracking the current section header. Since git status defaults to long format, staged files get misreported to the LLM.


function isTestFailureLine(line: string): boolean {
if (/^(FAIL|✕|✗|×|✘)\s/.test(line)) return true
if (/^(\s+●\s)/.test(line)) return true // Jest failure indicator

isTestFailureLine is always called with the trimmed string (line 67), but this regex ^(\s+●\s) requires leading whitespace to match. A trimmed string will never have leading whitespace, so this pattern is dead code. Jest failure blocks like ● TestSuite > test name won't be detected as failure block starts, which means the test name context line and its indented block contents are lost from the filtered output. The fix is to match against ^\u25cf\s instead (no leading whitespace requirement).

Suggested change
if (/^(\s+●\s)/.test(line)) return true // Jest failure indicator
if (/^\u25cf\s/.test(line)) return true // Jest failure indicator
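The mismatch can be checked in isolation. The snippet below is a standalone sketch: the sample line and variable names are made up, and it assumes, as the comment states, that the caller trims each line before testing it.

```typescript
// A Jest failure header as it might appear in raw output (indented),
// and the same line after trimming.
const rawLine = "  ● TestSuite > test name"
const trimmedLine = rawLine.trim()

const originalPattern = /^(\s+●\s)/ // requires leading whitespace
const suggestedPattern = /^\u25cf\s/ // U+25CF is ●; no leading whitespace required

const checks = {
  originalOnRaw: originalPattern.test(rawLine), // true
  originalOnTrimmed: originalPattern.test(trimmedLine), // false: the dead-code case
  suggestedOnTrimmed: suggestedPattern.test(trimmedLine), // true
}
console.log(checks)
```

Note that the suggested pattern only matches trimmed input, so it relies on the caller continuing to trim.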

Fix it with Roo Code or mention @roomote and request a fix.

Comment on lines +163 to +167
// Long format patterns
if (/^\s*(modified|new file|deleted|renamed|copied):\s+/.test(trimmed)) {
// Determine context from previous section headers
unstaged.push(trimmed)
}

The long-format parser always pushes to unstaged[] regardless of which section the line appears under. git status without flags (the default) uses long format with distinct sections: "Changes to be committed:" (staged) vs "Changes not staged for commit:" (unstaged). Since the code doesn't track which section header was last seen, staged files in long-format output get misreported as unstaged to the LLM. This could cause the model to issue incorrect git operations (e.g., re-staging files that are already staged). Consider tracking a currentSection variable by matching section header lines like /Changes to be committed/ and /Changes not staged/.
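A section-tracking parser along the lines suggested here might look like the sketch below. It is illustrative only: `parseLongStatus`, `StatusSummary`, and `Section` are hypothetical names, and treating untracked entries as the tab-indented lines under "Untracked files:" is an assumption about git's long-format layout.

```typescript
type Section = "staged" | "unstaged" | "untracked" | null

interface StatusSummary {
  staged: string[]
  unstaged: string[]
  untracked: string[]
}

function parseLongStatus(output: string): StatusSummary {
  const summary: StatusSummary = { staged: [], unstaged: [], untracked: [] }
  let current: Section = null

  for (const line of output.split("\n")) {
    const trimmed = line.trim()

    // Section headers switch the bucket that later entries go into.
    if (/^Changes to be committed:/.test(trimmed)) { current = "staged"; continue }
    if (/^Changes not staged for commit:/.test(trimmed)) { current = "unstaged"; continue }
    if (/^Untracked files:/.test(trimmed)) { current = "untracked"; continue }

    // Skip blank lines, hint lines such as (use "git add ..."), and anything before a header.
    if (current === null || trimmed === "" || trimmed.startsWith("(")) continue

    if (current === "untracked") {
      // Assumption: untracked entries are tab-indented; trailing notes are not.
      if (line.startsWith("\t")) summary.untracked.push(trimmed)
      continue
    }

    if (/^(modified|new file|deleted|renamed|copied):\s+/.test(trimmed)) {
      summary[current].push(trimmed)
    }
  }
  return summary
}
```

Keeping the entry regex identical to the one in the quoted hunk means the only behavioral change is which array each match lands in.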

Development

Successfully merging this pull request may close these issues.

[ENHANCEMENT] I saved 10M tokens (89%) on my Claude Code sessions with a CLI proxy