fix(file-entry-cache): restore v8 reconcile and change-detection semantics (#1648)#1649
fix(file-entry-cache): restore v8 reconcile and change-detection semantics (#1648)#1649jaredwray wants to merge 3 commits into
Conversation
…ntics (#1648) file-entry-cache v11 diverged from v8 in three ways that broke ESLint's use case: 1. reconcile() updated cache entries for *every* tracked file, not just those inspected via getFileDescriptor(). This revalidated files that were never looked at this run, so a file that changed on disk but was not part of the run kept its stale (valid) cache entry. reconcile() now only updates files that were visited during the session and drops ones that were deleted. 2. getFileDescriptor() returned `changed: false` on repeated calls for the same file because it overwrote the persisted meta on every read. Change detection now compares against a baseline snapshot taken as of the last load/reconcile, so a file keeps reporting `changed: true` until the cache is reconciled. 3. create() threw when the cache file contained invalid JSON. It now starts with an empty cache and overwrites the file on the next reconcile(), matching v8. Adds regression tests covering all three scenarios and updates existing tests that relied on the previous (incorrect) same-session dedup behavior to reconcile before re-checking.
There was a problem hiding this comment.
Code Review
This pull request addresses issue #1648 by introducing a baseline snapshot (_originalMeta) to track the persisted state of files as of the last load or reconcile. This ensures that repeated getFileDescriptor() calls consistently report a file as changed until reconcile() is executed, and that reconcile() only updates files visited during the current session. Additionally, it adds error handling to prevent throwing when loading invalid or corrupted cache files. The reviewer identified a correctness and performance issue in the reconcile() method, where updating meta.size and meta.mtime using fs.statSync can lead to silent cache invalidation and inconsistent cache states when checksums are used, and provided a code suggestion to only check for file existence.
| try { | ||
| const absolutePath = this.getAbsolutePath(key); | ||
| const fstat = fs.statSync(absolutePath); | ||
| // Refresh the stats to reflect the state at reconcile time. | ||
| meta.size = fstat.size; | ||
| meta.mtime = fstat.mtime.getTime(); | ||
| this._cache.setKey(key, meta); | ||
| // Update the baseline so subsequent checks see the file as unchanged. | ||
| this._originalMeta.set(key, { ...meta }); | ||
| } catch { | ||
| // The file no longer exists; drop it from the cache. | ||
| this._cache.removeKey(key); | ||
| this._originalMeta.delete(key); | ||
| } |
There was a problem hiding this comment.
Correctness & Performance Issue: Redundant and Buggy Re-stating in reconcile()
Updating meta.size and meta.mtime during reconcile() introduces a severe correctness bug and a performance bottleneck:
- Silent Cache Invalidation (Correctness Bug): If a file is modified after
getFileDescriptor()is called but beforereconcile(),reconcile()will overwrite the cachedmtimeandsizewith the new values. In the next session,getFileDescriptor()will compare the disk stats with these new cached stats, see they match, and reportchanged: false. The modification will be silently ignored, and the file will never be re-processed/re-linted. - Inconsistent Cache State: If
useCheckSumis enabled,reconcile()updatesmtimeandsizebut does not update thehash(since it doesn't re-calculate the checksum). This leaves the cache entry in an inconsistent state. - Performance Overhead: Calling
fs.statSyncon every visited file duringreconcile()introduces redundant synchronous I/O, which can significantly slow down execution for large codebases.
Solution: Do not update the file stats in reconcile(). Instead, just check if the file still exists (to satisfy the deleted files cleanup test) and update the baseline _originalMeta with the existing meta from the working cache.
| try { | |
| const absolutePath = this.getAbsolutePath(key); | |
| const fstat = fs.statSync(absolutePath); | |
| // Refresh the stats to reflect the state at reconcile time. | |
| meta.size = fstat.size; | |
| meta.mtime = fstat.mtime.getTime(); | |
| this._cache.setKey(key, meta); | |
| // Update the baseline so subsequent checks see the file as unchanged. | |
| this._originalMeta.set(key, { ...meta }); | |
| } catch { | |
| // The file no longer exists; drop it from the cache. | |
| this._cache.removeKey(key); | |
| this._originalMeta.delete(key); | |
| } | |
| try { | |
| const absolutePath = this.getAbsolutePath(key); | |
| if (!fs.existsSync(absolutePath)) { | |
| this._cache.removeKey(key); | |
| this._originalMeta.delete(key); | |
| } else { | |
| // Update the baseline so subsequent checks see the file as unchanged. | |
| this._originalMeta.set(key, { ...meta }); | |
| } | |
| } catch { | |
| // The file no longer exists; drop it from the cache. | |
| this._cache.removeKey(key); | |
| this._originalMeta.delete(key); | |
| } |
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #1649 +/- ##
=========================================
Coverage 100.00% 100.00%
=========================================
Files 26 26
Lines 2688 2706 +18
Branches 593 598 +5
=========================================
+ Hits 2688 2706 +18 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: d1ef69fd6d
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| meta.size = fstat.size; | ||
| meta.mtime = fstat.mtime.getTime(); |
There was a problem hiding this comment.
Recompute checksum when refreshing reconcile metadata
When useCheckSum is enabled and a file changes after getFileDescriptor() reads it but before reconcile() runs, this refreshes only size/mtime and then records that mixed metadata as the new baseline. The cached hash remains from the earlier contents, so the next getFileDescriptor() compares the current hash against the stale one and reports changed: true even though reconcile() just updated the baseline; either recompute meta.hash here or avoid refreshing stats independently of the descriptor metadata.
Useful? React with 👍 / 👎.
| const absolutePath = this.getAbsolutePath(key); | ||
| const fstat = fs.statSync(absolutePath); | ||
| // Refresh the stats to reflect the state at reconcile time. | ||
| meta.size = fstat.size; | ||
| meta.mtime = fstat.mtime.getTime(); |
There was a problem hiding this comment.
Preserve descriptor-time stats during reconcile
For callers that do not use checksums, if an inspected file is modified after getFileDescriptor() returns but before reconcile() runs, this re-stats the file and advances the saved baseline to content that was never analyzed. The following run then sees the file as unchanged and can reuse stale results; reconcile() should persist the metadata captured when the descriptor was inspected rather than refreshing size/mtime at save time.
Useful? React with 👍 / 👎.
…d entries Covers the defensive branch in reconcile() that skips a session-tracked key when it no longer exists in the underlying cache, restoring 100% patch coverage.
- create(): only treat parse errors (SyntaxError) as an invalid cache and fall back to an empty cache; re-throw unexpected IO/permission errors so a valid cache is not silently discarded and overwritten. - reconcile(): persist the consistent size/mtime/hash snapshot captured when the file was inspected instead of re-stat'ing only size/mtime (which left the hash stale and could mask a change made between getFileDescriptor() and reconcile() in useModifiedTime mode). statSync is now used only to detect deletion. Adds regression tests for both behaviors.
Summary
Fixes the three behavioral regressions reported in #1648, where
file-entry-cachev11 diverged from v8 in ways that broke ESLint's use case. All three share a root cause:getFileDescriptor()was overwriting the persisted cache meta on every read, andreconcile()was re-validating every tracked entry.Bug 1 —
reconcile()updated cache entries for all filesPreviously
reconcile()looped over every item in the cache and calledgetFileDescriptor()on it, which re-stat'd and refreshed the stored meta. This effectively marked every cached entry as valid, even files that were never inspected during the run (and may have changed on disk). ESLint isn't always run on the same set of files, so this caused stale results to be treated as valid.Fix:
reconcile()now only updates entries for files that were inspected viagetFileDescriptor()during the current session, and drops entries whose files were deleted. Entries that were never looked at keep their previously persisted meta untouched.Bug 2 —
getFileDescriptor()returnedchanged: falseon repeated callsBecause the working cache was updated on every read, a second call for the same file compared against the just-written value and reported
changed: false.Fix: change detection now compares against a baseline snapshot taken as of the last load/reconcile (not the working cache). A file keeps reporting
changed: trueon subsequent calls until the cache is reconciled — matching v8.Bug 3 —
create()threw on invalid cache file contentA corrupted / non-JSON cache file caused
create()(andcreateFromFile()) to throw.Fix: invalid content is now caught; the cache starts empty and the file is overwritten on the next
reconcile(), matching v8's silent-overwrite behavior.Behavioral change note
Fixing bug 2 changes the same-session contract: repeated
getFileDescriptor()calls no longer dedup tochanged: falsewithout an interveningreconcile(). Existing tests that relied on the old behavior were updated toreconcile()before re-checking (which is the correct usage pattern), and the change is the explicitly requested v8 behavior from the issue.Tests
test/issue-1648.test.tswith regression coverage for all three scenarios (includinguseCheckSumandcreateFromFilevariants).test/index.test.tsthat encoded the previous same-session dedup behavior.should return a file descriptor with checksum and error,should return all entries) are pre-existing and unrelated to this change — they depend onchmod 000to simulate a read error, which is bypassed when the test runner executes as root. They fail identically onmainin a root environment and pass under a normal user in CI.Closes #1648
https://claude.ai/code/session_015KvtfccNcUBP7TZ5DZE9x6
Generated by Claude Code