entry: fix fscache stale directory listing in parallel checkout pathspec#917

Draft
tyrielv wants to merge 2 commits into microsoft:vfs-2.53.0 from tyrielv:tyrielv/fscache-parallel-checkout-fix

tyrielv commented May 15, 2026

Problem

git checkout <tree> -- <pathspec> with checkout.workers > 1 and core.fscache=true (Windows) fails with:

```
fatal: cannot create directory at '...': Directory not empty
```

when restoring files into directories that do not yet exist on disk.

Root Cause

create_directories() creates parent directories via mkdir(), but the Windows fscache (which caches directory listings) is not invalidated. Subsequent has_dirs_only_path() calls for the same parent directory return a stale ENOENT from the cached listing. The recovery path then tries to unlink+recreate the directory, which fails because earlier mkdir() calls already created child directories inside it.

With workers=1, write_entry() calls flush_fscache() after each file, keeping the cache in sync. With workers>1, enqueue_checkout() defers the write (and the flush), leaving the cache stale for the next entry.
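
A toy model of the stale listing may make this sequence easier to follow. It is not Git's fscache code: the `toy_*` names and the flat list of names standing in for one directory's contents are assumptions made purely for illustration. The sketch shows a lookup that is answered from a cached snapshot reporting ENOENT for a directory created after the snapshot was taken, until the snapshot is dropped.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <string.h>

#define TOY_MAX_ENTRIES 16
#define TOY_NAME_LEN 64

/* A flat list of names stands in for one directory's contents. */
struct toy_listing {
    char names[TOY_MAX_ENTRIES][TOY_NAME_LEN];
    int count;
};

/* Hypothetical model: the real disk state plus a cached copy of it. */
struct toy_fs {
    struct toy_listing disk;     /* what is really on disk */
    struct toy_listing snapshot; /* cached copy of the listing */
    bool snapshot_valid;
};

static bool toy_contains(const struct toy_listing *l, const char *name)
{
    for (int i = 0; i < l->count; i++)
        if (!strcmp(l->names[i], name))
            return true;
    return false;
}

/* Creates a directory on the toy disk; the snapshot is left untouched. */
static int toy_mkdir(struct toy_fs *fs, const char *name)
{
    assert(strlen(name) < TOY_NAME_LEN);
    if (toy_contains(&fs->disk, name)) {
        errno = EEXIST;
        return -1;
    }
    assert(fs->disk.count < TOY_MAX_ENTRIES);
    strcpy(fs->disk.names[fs->disk.count++], name);
    return 0;
}

/* Drops the snapshot so the next lookup re-reads the toy disk. */
static void toy_flush(struct toy_fs *fs)
{
    fs->snapshot_valid = false;
}

/* lstat()-like lookup that is answered from the snapshot only. */
static int toy_cached_lookup(struct toy_fs *fs, const char *name)
{
    if (!fs->snapshot_valid) {
        fs->snapshot = fs->disk; /* take a fresh snapshot */
        fs->snapshot_valid = true;
    }
    if (!toy_contains(&fs->snapshot, name)) {
        errno = ENOENT;
        return -1;
    }
    return 0;
}
```

With a zero-initialized `struct toy_fs`, calling `toy_cached_lookup()`, then `toy_mkdir()`, then `toy_cached_lookup()` again for the same name yields the stale ENOENT; a `toy_flush()` in between clears it, which mirrors the role this description assigns to flush_fscache().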

This PR

Adds a regression test (currently expected-failure) that deterministically reproduces the bug. Fix commit incoming.

Bug: https://dev.azure.com/microsoft/OS/_workitems/edit/62260193

When checkout.workers > 1 and core.fscache is enabled on Windows,
'git checkout <tree> -- <pathspec>' fails with 'cannot create directory:
Directory not empty' when restoring files into directories that do not
yet exist on disk.

Root cause: create_directories() creates parent directories via mkdir(),
but the Windows fscache (which caches directory listings) is not
invalidated. Subsequent has_dirs_only_path() calls for the same parent
directory return a stale ENOENT from the cached listing. The recovery
path then tries to unlink+recreate the directory, which fails because
earlier mkdir() calls already created child directories inside it.

With workers=1, write_entry() calls flush_fscache() after each file,
keeping the cache in sync. With workers>1, enqueue_checkout() defers
the write (and the flush), leaving the cache stale for the next entry.

Add a test that reproduces this deterministically: create two files
sharing a nested parent directory, delete them in a second commit, then
restore both via 'git checkout <tree> -- <pathspec>' with workers>1.

Bug: https://dev.azure.com/microsoft/OS/_workitems/edit/62260193

Assisted-by: Claude Opus 4.6
Signed-off-by: Tyrie Vella <tyrielv@gmail.com>

tyrielv force-pushed the tyrielv/fscache-parallel-checkout-fix branch from c4db787 to a3c1aff on May 15, 2026 18:24

When checkout.workers > 1 and core.fscache is enabled on Windows,
the fscache caches directory listings that become stale when
create_directories() creates new parent directories via mkdir() or
when write_pc_item() writes new files. Subsequent lstat() calls
through the fscache return ENOENT for these just-created filesystem
entries, causing two failure modes:

1. create_directories(): has_dirs_only_path() reports a just-created
   directory as non-existent, triggering the unlink+mkdir recovery
   path which fails with 'Directory not empty' because the directory
   already has children from earlier mkdir() calls.

2. write_pc_item(): after writing and closing a file, lstat() cannot
   see it through the stale parent directory listing cache, failing
   with 'unable to stat just-written file'.

With workers=1 these do not occur because write_entry() calls
flush_fscache() after each file, keeping the cache in sync. With
workers>1, enqueue_checkout() defers the write (and the flush),
leaving the cache stale for subsequent entries.

Fix both by adding flush_fscache() calls:
- In create_directories() after each successful mkdir()
- In write_pc_item() before lstat() of the just-written file

On non-Windows platforms flush_fscache() is a no-op, so there is
no behavioral change.
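
The first of the two flush points can be sketched as a loop over the leading directories of a path. This is a simplified illustration, not the code in create_directories(): every `toy_*` name is hypothetical, the mkdir step is a callback that only records the paths it receives, and the has_dirs_only_path() check and error recovery described above are left out. It shows only the ordering the commit message describes: one flush after each successful mkdir.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_DIRS 8
#define TOY_PATH_LEN 256

/* Hypothetical bookkeeping so the order of calls can be observed. */
static char toy_seen[TOY_MAX_DIRS][TOY_PATH_LEN];
static int toy_seen_count;
static int toy_flush_count;

/* Stand-in for the cache flush; it only counts how often it runs. */
static void toy_flush_cache(void)
{
    toy_flush_count++;
}

/* Stand-in for mkdir(): records the path and reports success. */
static int toy_record_mkdir(const char *path)
{
    assert(toy_seen_count < TOY_MAX_DIRS);
    assert(strlen(path) < TOY_PATH_LEN);
    strcpy(toy_seen[toy_seen_count++], path);
    return 0;
}

typedef int (*toy_mkdir_fn)(const char *path);

/*
 * Creates every leading directory of `path` (everything before the
 * final '/'), flushing the cache after each successful mkdir.
 * Assumes a normalized path without repeated slashes.
 * Returns the number of directories created, or -1 on failure.
 */
static int toy_create_leading_dirs(const char *path, toy_mkdir_fn do_mkdir)
{
    char buf[TOY_PATH_LEN];
    size_t len = strlen(path);
    int created = 0;

    if (len >= sizeof(buf))
        return -1;
    /* Start at 1 so a leading '/' is not treated as a directory. */
    for (size_t i = 1; i < len; i++) {
        if (path[i] != '/')
            continue;
        memcpy(buf, path, i);
        buf[i] = '\0';
        if (do_mkdir(buf) != 0)
            return -1; /* no flush after a failed mkdir */
        toy_flush_cache();
        created++;
    }
    return created;
}
```

Calling `toy_create_leading_dirs("a/b/c/file.txt", toy_record_mkdir)` passes `a`, `a/b`, and `a/b/c` to the callback in that order and flushes once after each, so a cached lookup of a parent never runs against a listing older than the most recent mkdir.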

Assisted-by: Claude Opus 4.6
Signed-off-by: Tyrie Vella <tyrielv@gmail.com>

tyrielv force-pushed the tyrielv/fscache-parallel-checkout-fix branch from c4db787 to 8b819f9 on May 15, 2026 22:37