feat(cli): support XLSX text extraction in read tool#10740

Merged: marius-kilocode merged 3 commits into main from feature-read-xlsx-extraction on May 29, 2026

@marius-kilocode marius-kilocode commented May 29, 2026

Collaborator

Spreadsheet files are still classified as binary by the read tool, so agents cannot inspect workbook contents without an external conversion step.

This adds .xlsx extraction for explicit read calls only. Visible worksheets are surfaced as labelled tab-separated text with readable formulas, formatted values, dates, hyperlinks, and errors. Hidden sheets are omitted, workbook inputs above 50 MB are rejected before parsing, worksheet extraction is bounded, and the existing read output limits continue to apply. Native PDF and image attachment behavior, along with rejection of unsupported binary spreadsheet formats, stays unchanged.
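The extraction rules above can be sketched roughly as follows. This is a minimal illustration and not the code in `xlsx.ts`: `SheetModel`, `checkWorkbookSize`, `renderWorkbook`, `cleanCell`, the `# Sheet:` label format, and the truncation notice are all hypothetical, and the 50 MB cap is assumed to mean 50 × 1024 × 1024 bytes. The real extractor operates on SheetJS workbook objects rather than this simplified in-memory model.

```typescript
// Hypothetical simplified model of a parsed workbook sheet.
interface SheetModel {
  name: string;
  hidden: boolean;
  rows: string[][]; // cell text, already formatted for display
}

// Assumed interpretation of the 50 MB cap from the description.
const MAX_WORKBOOK_BYTES = 50 * 1024 * 1024;

// Reject oversized inputs before any parsing work is done.
function checkWorkbookSize(sizeBytes: number): void {
  if (sizeBytes > MAX_WORKBOOK_BYTES) {
    throw new Error(`XLSX file is too large to read (${sizeBytes} bytes)`);
  }
}

// Render visible sheets as labelled tab-separated text, bounded per sheet.
function renderWorkbook(sheets: SheetModel[], rowLimit: number): string {
  const out: string[] = [];
  for (const sheet of sheets) {
    if (sheet.hidden) continue; // hidden sheets are omitted entirely
    out.push(`# Sheet: ${sheet.name}`);
    for (const row of sheet.rows.slice(0, rowLimit)) {
      out.push(row.map(cleanCell).join("\t"));
    }
    if (sheet.rows.length > rowLimit) {
      out.push(`... ${sheet.rows.length - rowLimit} more rows omitted`);
    }
  }
  return out.join("\n");
}

// Keep each cell on one line and in one column by collapsing tabs and newlines.
function cleanCell(text: string): string {
  return text.replace(/[\t\r\n]+/g, " ");
}
```

Collapsing tabs and newlines inside cells is a choice made for this sketch so that the tab-separated output stays unambiguous; the PR does not state how the real extractor handles such cells.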

With notebook and DOCX reads now present in main, format selection is consolidated behind a Kilo-owned read extraction router rather than adding another format-specific control-flow branch to the shared read tool. The existing notebook and DOCX extractors continue to supply their behavior through the same narrow hook.
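One way to picture the consolidated router is a lookup from file extension to extractor, with the shared read tool calling a single hook. The sketch below is an assumption about the shape only: `Extractor`, `extractors`, `extensionOf`, and `routeExtraction` are hypothetical names, the extractor bodies are stubs, and case-insensitive extension matching is assumed rather than confirmed by the PR.

```typescript
// Hypothetical extractor signature: takes a path, returns extracted text.
type Extractor = (path: string) => string;

// Stub extractors standing in for the notebook, DOCX, and XLSX implementations.
const extractors = new Map<string, Extractor>([
  [".ipynb", (p) => `notebook text for ${p}`],
  [".docx", (p) => `docx text for ${p}`],
  [".xlsx", (p) => `xlsx text for ${p}`],
]);

// Return the lowercased extension of the final path segment, or "" if none.
function extensionOf(path: string): string {
  const base = path.slice(path.lastIndexOf("/") + 1);
  const dot = base.lastIndexOf(".");
  return dot > 0 ? base.slice(dot).toLowerCase() : "";
}

// The single hook: undefined means "no extractor, use the normal read path".
function routeExtraction(path: string): string | undefined {
  const extractor = extractors.get(extensionOf(path));
  return extractor ? extractor(path) : undefined;
}
```

With this shape, an unsupported format such as `.xls` yields `undefined`, so the existing binary-file rejection in the shared tool would still apply to it.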

The parser uses official SheetJS CE 0.20.3 from its pinned distribution tarball because the public npm registry release is outdated and affected by known vulnerabilities. It adds no transitive runtime dependencies. Compared with current main, the current-platform compiled CLI artifact increases from 105,737,122 bytes to 106,678,306 bytes, an increase of 941,184 bytes (approximately 0.90 MiB).
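The size figures can be checked with a few lines of arithmetic. The variable names here are illustrative only; the byte counts are the ones quoted above.

```typescript
// Compiled CLI artifact sizes quoted in the description, in bytes.
const beforeBytes = 105737122;
const afterBytes = 106678306;

// Absolute increase and its size in MiB (1 MiB = 1024 * 1024 bytes).
const deltaBytes = afterBytes - beforeBytes;
const deltaMiB = deltaBytes / (1024 * 1024);

const summary = `${deltaBytes} bytes (${deltaMiB.toFixed(2)} MiB)`;
console.log(summary);
```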

@marius-kilocode marius-kilocode enabled auto-merge May 29, 2026 14:50
Comment thread packages/opencode/src/kilocode/tool/xlsx.ts Outdated
Comment thread packages/opencode/src/kilocode/tool/xlsx.ts
Comment thread packages/opencode/src/tool/read.ts Outdated

kilo-code-bot Bot commented May 29, 2026

Contributor

Code Review Summary

Status: 3 Issues Found | Recommendation: Address before merge

Overview

| Severity | Count |
| --- | --- |
| CRITICAL | 0 |
| WARNING | 1 |
| SUGGESTION | 2 |

Issue Details

WARNING

| File | Line | Issue |
| --- | --- | --- |
| packages/opencode/src/kilocode/tool/xlsx.ts | 12 | No file-size guard before reading the entire XLSX into memory; a large workbook bypasses the 50 KB cap that exists for text files |

SUGGESTION

| File | Line | Issue |
| --- | --- | --- |
| packages/opencode/src/kilocode/tool/xlsx.ts | 65 | `[...rows.entries()].sort(...)` allocates all sorted row tuples before iterating; minor, but scales to 50k rows |
| packages/opencode/src/tool/read.ts | 308 | `readLines` drains the full XLSX generator even after the line limit is hit (to count lines), wasting work for sheets near `ROW_LIMIT` |
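To illustrate the second suggestion, the sketch below shows a consumer that stops pulling from a lazy line source as soon as it knows the limit has been exceeded. It is not the `readLines` in `read.ts`: the signature, the `counter` object, and `sheetLines` are hypothetical and exist only to make the amount of work visible. The trade-off is that the total line count is no longer known, only whether the output was truncated.

```typescript
// Hypothetical lazy source; counter.produced records how many rows were generated.
function* sheetLines(total: number, counter: { produced: number }): Generator<string> {
  for (let i = 0; i < total; i++) {
    counter.produced++;
    yield `row ${i}`;
  }
}

// Collect at most `limit` lines, then stop instead of draining the source.
function readLines(
  source: Iterable<string>,
  limit: number,
): { lines: string[]; truncated: boolean } {
  const lines: string[] = [];
  for (const line of source) {
    if (lines.length === limit) {
      // One extra line was pulled, which proves the output is truncated.
      // Returning from for...of also closes the generator.
      return { lines, truncated: true };
    }
    lines.push(line);
  }
  return { lines, truncated: false };
}

const counter = { produced: 0 };
const result = readLines(sheetLines(50000, counter), 3);
console.log(result.lines.length, result.truncated, counter.produced);
```

Here only four rows are generated for a limit of three, whereas draining the source to count lines would generate all 50,000.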
Other Observations (not in diff)
  • The isBinaryFile switch still has case ".xlsx" (line 129 of read.ts), which is now dead code for the XLSX path since !xlsx && isBinaryFile(...) short-circuits it. Harmless, but could be confusing.
  • readSample (up to 4 KB) is still fetched for XLSX files at line 278 even though the sample is only used for MIME sniffing and binary detection — both of which are bypassed for XLSX. Minor unnecessary I/O.
Files Reviewed (5 files)
  • packages/opencode/src/kilocode/tool/xlsx.ts - 2 issues
  • packages/opencode/src/tool/read.ts - 1 issue
  • packages/opencode/test/kilocode/read-xlsx.test.ts - clean
  • .changeset/read-xlsx-spreadsheets.md - clean
  • packages/opencode/package.json - clean


Reviewed by claude-4.6-sonnet-20260217 · 1,208,954 tokens

Review guidance: REVIEW.md from base branch main

@marius-kilocode marius-kilocode disabled auto-merge May 29, 2026 14:55
@marius-kilocode marius-kilocode enabled auto-merge May 29, 2026 15:00
…raction

# Conflicts:
#	packages/opencode/src/tool/read.ts
@marius-kilocode marius-kilocode disabled auto-merge May 29, 2026 15:35
@marius-kilocode marius-kilocode merged commit 979cb7e into main May 29, 2026
17 checks passed
@marius-kilocode marius-kilocode deleted the feature-read-xlsx-extraction branch May 29, 2026 16:02
2 participants