Skip to content

perf(WinFSP): Phase 33 Windows Async FilePointer Resolution#389

Merged
FSM1 merged 16 commits into
mainfrom
phase-33-windows-async-filepointer-resolution
Mar 29, 2026
Merged

perf(WinFSP): Phase 33 Windows Async FilePointer Resolution#389
FSM1 merged 16 commits into
mainfrom
phase-33-windows-async-filepointer-resolution

Conversation

@FSM1

@FSM1 FSM1 commented Mar 28, 2026

Copy link
Copy Markdown
Owner

Summary

Phase 33: Windows Async FilePointer Resolution
Goal: WinFsp FilePointer resolution no longer blocks the filesystem thread, eliminating Explorer hangs during metadata refresh on Windows
Status: Verified ✓ (9/9 must-haves + runtime verified on Windows 11 + WinFsp)

Replaces the blocking O(N × NETWORK_TIMEOUT) FilePointer resolution loop with async task spawning via tokio. Each unresolved FilePointer now resolves in a background task with 3-retry exponential backoff, while WinFsp callbacks drain completed results on entry. If a read hits an in-flight resolution, it polls for up to 5s before returning STATUS_DEVICE_NOT_READY (0xC00000A3) — a transient NTSTATUS that Explorer retries automatically.

Changes

Plan 33-01: Async FilePointer Resolution Infrastructure

Shared infrastructure used by both macOS and Windows platforms.

Key files modified:

  • crates/fuse/src/lib.rsPendingFilePointer struct, resolve_single_file_pointer async fn (3-retry exponential backoff), drain_file_pointer_completions() method, non-blocking drain_refresh_completions() with async task spawning, dedup guard via resolving_file_pointers HashSet
  • apps/desktop/src-tauri/src/fuse/windows/mod.rs — Windows constructor channel initialization
  • apps/desktop/src-tauri/src/fuse/mod.rs — macOS constructor channel initialization

Plan 33-02: Windows WinFsp Callback Wiring

Wires the async resolution into Windows WinFsp callback entry points.

Key files modified:

  • crates/fuse/src/platform/windows/read_ops.rsdrain_file_pointer_completions() in handle_open and handle_read, FilePointer resolution poll loop with 5s timeout, status_device_not_ready return on timeout
  • crates/fuse/src/platform/windows/dir_ops.rsdrain_file_pointer_completions() in handle_read_directory
  • crates/fuse/src/platform/windows/operations.rsstatus_device_not_ready() NTSTATUS helper (0xC00000A3)

Verification

  • Automated verification: 9/9 must-haves verified (code-level)
  • cargo test -p cipherbox-fuse --features winfsp: 24/24 tests passed
  • cargo check -p cipherbox-desktop --features winfsp: compiled (0 errors)
  • Runtime verification on Windows 11 + WinFsp: desktop app mounted, 9/9 E2E FUSE tests passed
  • Code review via /simplify: collapsed redundant pattern match, added ipns_name to drain logs

Key Decisions

  • STATUS_DEVICE_NOT_READY (0xC00000A3) chosen over STATUS_IO_DEVICE_ERROR — Explorer treats it as transient and retries automatically
  • Scoped FilePointer resolution via get_unresolved_file_pointers_for_parent(refresh.ino) to avoid wrong-folder-key decryption
  • 100ms poll interval with mutex drop/reacquire for cooperative waiting (matches existing content prefetch pattern)
  • Exponential backoff: 500ms base × 2^attempt (1s, 2s, 4s delays) with 3 retries

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • Implemented asynchronous FilePointer resolution on Windows with read-time polling support, enabling files to be read while metadata is being resolved in the background.
  • Bug Fixes

    • Eliminated blocking behavior during FilePointer resolution, improving Windows file system responsiveness.
  • Documentation

    • Added comprehensive planning and validation documentation for Phase 33 Windows async resolution infrastructure.

FSM1 and others added 12 commits March 28, 2026 21:23
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add PendingFilePointer struct for channel-based async results
- Add resolve_single_file_pointer async fn with 3-retry exponential backoff
- Add file_pointer_tx/rx channel and resolving_file_pointers dedup guard to CipherBoxFS
- Add drain_file_pointer_completions() method to consume resolved results
- Replace blocking block_with_timeout() loop in drain_refresh_completions() with async task spawning
- Scope FilePointer resolution to parent folder via get_unresolved_file_pointers_for_parent()
- Re-export PendingFilePointer in desktop fuse module

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…constructors

- Add file_pointer_tx/rx channel pair creation in Windows mount_filesystem()
- Add file_pointer_tx/rx channel pair creation in macOS mount_filesystem()
- Initialize resolving_file_pointers HashSet in both CipherBoxFS constructors
- Both cargo check -p cipherbox-fuse and cargo check -p cipherbox-desktop pass

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add 33-01-SUMMARY.md with execution results
- Update STATE.md with plan progress and decisions
- Update ROADMAP.md with plan 01 completion

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…lepointer-resolution

# Conflicts:
#	.planning/ROADMAP.md
…llbacks

- Add status_device_not_ready() NTSTATUS helper (0xC00000A3) to operations.rs
- Add drain_file_pointer_completions() calls to handle_open, handle_read, handle_read_directory
- Add read-while-resolving poll loop in handle_read with 5s timeout for in-flight FilePointer resolution
- Return STATUS_DEVICE_NOT_READY on poll timeout (Explorer retries automatically)
- Use mutable cid/key/iv/mode locals for in-place update after resolution completes
- Add 33-02-SUMMARY.md with execution results
- Update STATE.md with metrics and progress
- Update ROADMAP.md with phase 33 completion
… drain logs

Collapse redundant double InodeKind::File pattern match in FilePointer
resolution poll loop into a single match with all fields extracted at once.
Use the previously-unused ipns_name field in drain_file_pointer_completions
log messages for better debuggability.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@coderabbitai

coderabbitai Bot commented Mar 28, 2026

Copy link
Copy Markdown

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: b09bb5c0-dd55-4fbd-b03f-0295933d9a34

📥 Commits

Reviewing files that changed from the base of the PR and between 8cddb05 and 7fdfc11.

⛔ Files ignored due to path filters (1)
  • Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (12)
  • .planning/PROJECT.md
  • .planning/ROADMAP.md
  • .planning/STATE.md
  • .planning/phases/33-windows-async-filepointer-resolution/33-01-PLAN.md
  • .planning/phases/33-windows-async-filepointer-resolution/33-01-SUMMARY.md
  • .planning/phases/33-windows-async-filepointer-resolution/33-02-PLAN.md
  • .planning/phases/33-windows-async-filepointer-resolution/33-02-SUMMARY.md
  • .planning/phases/33-windows-async-filepointer-resolution/33-RESEARCH.md
  • .planning/phases/33-windows-async-filepointer-resolution/33-VALIDATION.md
  • .planning/phases/33-windows-async-filepointer-resolution/33-VERIFICATION.md
  • crates/fuse/src/platform/windows/operations.rs
  • crates/fuse/src/platform/windows/read_ops.rs

Walkthrough

This PR documents and implements Phase 33 (Windows Async FilePointer Resolution), completing project planning documents while adding Windows-specific async FilePointer resolution with polling-based retry behavior in read operations.

Changes

Cohort / File(s) Summary
Project Tracking
.planning/PROJECT.md, .planning/ROADMAP.md, .planning/STATE.md
Updated project milestone and progress tracking, transitioning Phase 33 status to complete and advancing overall progress counters (17/70 to 18/72 phases/plans). Added Phase 33-specific context notes on FilePointer resolution behavior.
Phase 33 Planning Documentation
.planning/phases/33-windows-async-filepointer-resolution/33-*.md
Added comprehensive phase documentation including research (design rationale), two execution plans (shared async infrastructure and Windows WinFsp integration), summaries, validation strategy, and verification report, documenting async FilePointer resolution architecture and Windows callback wiring.
Windows Operations Helper
crates/fuse/src/platform/windows/operations.rs
Added new public status helper function status_device_not_ready() returning NTSTATUS 0xC00000A3 for transient retry signaling.
Windows Read Handler
crates/fuse/src/platform/windows/read_ops.rs
Modified handle_read() to support read-while-resolving: introduced mutable inode metadata bindings, added polling logic (up to 5 seconds) that drains FilePointer completions when an inode's cid is empty and resolution is in-flight, updates resolved metadata upon completion, or returns status_device_not_ready() on timeout.

Sequence Diagram(s)

sequenceDiagram
    participant Client as WinFsp Client
    participant Handler as handle_read()
    participant Drain as drain_file_pointer_completions()
    participant Tasks as Async Resolution<br/>Tasks
    participant Inode as Inode Table

    Client->>Handler: read() with unresolved cid
    Handler->>Inode: check metadata (cid empty,<br/>resolution in-flight)
    alt Resolution in progress
        Handler->>Handler: start 5s polling loop
        loop Every ~100ms iteration
            Handler->>Drain: drain pending completions
            Drain->>Tasks: receive resolved FilePointers
            Tasks-->>Inode: update file_meta_resolved
            Handler->>Inode: check file_meta_resolved
            alt Resolved
                Inode-->>Handler: return resolved cid/keys
                Handler->>Handler: update mutable bindings
                Handler->>Handler: break loop
            else Still resolving
                Handler->>Handler: continue polling
            end
        end
        alt Timeout elapsed
            Handler-->>Client: return status_device_not_ready()
        else Resolution completed
            Handler->>Handler: decrypt & read with<br/>resolved metadata
            Handler-->>Client: return data
        end
    else No in-flight resolution
        Handler->>Handler: proceed with normal flow
    end
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~18 minutes

Possibly related PRs

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 20.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately and specifically describes the main change: implementing Phase 33 Windows Async FilePointer Resolution with a performance focus. It is concise, clear, and directly related to the primary objective documented in the PR.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch phase-33-windows-async-filepointer-resolution

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@FSM1 FSM1 changed the title feat: Phase 33 Windows Async FilePointer Resolution perf(winSMB): Phase 33 Windows Async FilePointer Resolution Mar 28, 2026
@FSM1 FSM1 changed the title perf(winSMB): Phase 33 Windows Async FilePointer Resolution perf(WinFSP): Phase 33 Windows Async FilePointer Resolution Mar 28, 2026
@FSM1 FSM1 requested a review from Copilot March 28, 2026 22:47
@FSM1

FSM1 commented Mar 28, 2026

Copy link
Copy Markdown
Owner Author

@coderabbitai please review

@coderabbitai

coderabbitai Bot commented Mar 28, 2026

Copy link
Copy Markdown

@FSM1, I'll start a review of PR #389 right away!

A couple of notes based on long-term memory going into this review:

  • Learning superseded: The learning (671b7989) that flagged the synchronous FilePointer resolution loop in drain_refresh_completions() as intentional pre-existing design (deferred restructuring) is now being addressed by this PR itself — this is the right place to do it.
  • Vec<u8> key copies (d5f3bd14): If resolve_single_file_pointer passes folder_key/ipns_private_key into spawned tokio tasks, I'll check whether those copies remain short-lived as per the existing deferral rationale.

🧠 Learnings used
Learnt from: FSM1
Repo: FSM1/cipher-box PR: 352
File: crates/fuse/src/lib.rs:617-642
Timestamp: 2026-03-24T19:11:36.150Z
Learning: In `crates/fuse/src/lib.rs` (previously `apps/desktop/src-tauri/src/fuse/mod.rs`) `drain_refresh_completions()`: The synchronous file pointer resolution loop (calling `resolve_ipns`/`fetch_content` via `block_with_timeout`) inside `drain_refresh_completions()` is intentional pre-existing design. It runs during `readdir()`, which is already a blocking FUSE callback. Each resolution is bounded by `NETWORK_TIMEOUT` (currently 10s in cipherbox-fuse, 3s in the old operations.rs) and only triggers for newly-discovered FilePointer inodes. Restructuring this to use a background channel/async path would require redesigning the entire FUSE callback model and is deferred. Do not flag this as a blocking issue in future PRs.

Learnt from: CR
Repo: FSM1/cipher-box PR: 0
File: apps/desktop/CLAUDE.md:0-0
Timestamp: 2026-02-21T01:23:36.268Z
Learning: Windows FUSE implementation must use WinFSP or Dokan as filesystem driver API (fuser not used). Use file IDs instead of inodes with same stability requirement. Filter platform special files: `desktop.ini`, `Thumbs.db`, `$RECYCLE.BIN`, Zone.Identifier ADS. Use Windows Credential Manager via `keyring` crate

Learnt from: FSM1
Repo: FSM1/cipher-box PR: 253
File: tests/e2e-desktop/scripts/test-conflict-detection.ps1:156-157
Timestamp: 2026-03-03T14:55:05.884Z
Learning: In `tests/e2e-desktop/scripts/*.{sh,ps1}` (FUSE E2E tests): Fixed sleep-based waits after file mutations are an inherent limitation of FUSE-mount testing. The FUSE daemon provides no IPC signal or status endpoint to indicate when a debounced IPNS metadata publish completes. Tests use conservative fixed waits (e.g., 8s for initial publish accounting for 1.5s debounce + 10s safety valve, 15s for conflict resolution) and can be increased if CI proves flaky. Eliminating sleep-based waits would require exposing an IPC readiness signal from the FUSE daemon.

Learnt from: FSM1
Repo: FSM1/cipher-box PR: 249
File: apps/desktop/src-tauri/src/fuse/read_ops.rs:476-501
Timestamp: 2026-03-02T19:49:35.507Z
Learning: In `apps/desktop/src-tauri/src/fuse/read_ops.rs` (macOS FUSE): The `handle_read` function includes a deliberate 3-second poll loop waiting for async prefetched content via the mpsc channel before falling back to EIO. This design choice is specific to the FUSE-T SMB backend to avoid surfacing error dialogs to users on cache misses, as SMB will retry on EIO. The prefetch is already fired in `handle_open`, so the brief blocking wait allows it to complete without direct network I/O in the read callback.

Learnt from: CR
Repo: FSM1/cipher-box PR: 0
File: apps/desktop/CLAUDE.md:0-0
Timestamp: 2026-02-21T01:23:36.268Z
Learning: Cross-platform FUSE architecture must include platform-agnostic data structures (`InodeTable`, `MetadataCache`, `ContentCache`), channel-based async prefetch pattern, debounced publish queue with per-folder coalescing, `PublishCoordinator` for IPNS sequence management, and crypto functions (`encrypt_metadata_to_json()`, `decrypt_metadata_from_ipfs_public()`) with no OS dependencies

Learnt from: FSM1
Repo: FSM1/cipher-box PR: 352
File: crates/fuse/src/lib.rs:235-243
Timestamp: 2026-03-24T19:11:15.856Z
Learning: In `crates/fuse/src/lib.rs` `spawn_metadata_publish()` and `build_folder_metadata()`: The `folder_key` and `ipns_private_key` parameters are plain `Vec<u8>` rather than `Zeroizing<Vec<u8>>`. This is intentional for v1 (Phase 23 PR `#352`) — the copies are short-lived (<2s) on a background thread, and the primary zeroization defense is on the source `InodeKind` fields. Wrapping publish-path copies in `Zeroizing` would touch 15+ call sites across all platforms and is deferred to a follow-up PR. Do not re-flag this as a regression.

Learnt from: FSM1
Repo: FSM1/cipher-box PR: 249
File: apps/desktop/src-tauri/src/fuse/read_ops.rs:237-249
Timestamp: 2026-03-02T19:49:20.690Z
Learning: In apps/desktop/src-tauri/src/fuse/read_ops.rs (and previously operations.rs): The write-open path in `handle_open` intentionally uses synchronous `fetch_and_decrypt_file_content()` as a fallback after cache-check, because `OpenFileHandle::new_write()` needs existing file content at creation time to support append/modify operations. This is a documented design tradeoff in CLAUDE.md. Prefetch+cache-check happens first; synchronous fallback only triggers on cache miss.

Learnt from: CR
Repo: FSM1/cipher-box PR: 0
File: apps/desktop/CLAUDE.md:0-0
Timestamp: 2026-02-21T01:23:36.268Z
Learning: Applies to apps/desktop/src-tauri/src/fuse/*.rs : ALL FUSE callbacks must run on a single thread. Any blocking call stalls the entire filesystem. Background tasks must communicate via `mpsc` channels, with channels drained in `readdir()`

Learnt from: CR
Repo: FSM1/cipher-box PR: 0
File: apps/desktop/CLAUDE.md:0-0
Timestamp: 2026-02-21T01:23:36.268Z
Learning: Applies to apps/desktop/src-tauri/src/fuse/*.rs : FUSE callbacks must never perform network I/O except in `release()` which spawns a background task. `open()` fires async prefetch; `read()` checks cache and returns EIO on miss. `write()` writes to local temp file; encrypt+upload happens on `release()`

Learnt from: FSM1
Repo: FSM1/cipher-box PR: 253
File: apps/desktop/src-tauri/src/fuse/mod.rs:359-359
Timestamp: 2026-03-03T15:26:30.522Z
Learning: In `apps/desktop/src-tauri/src/fuse/mod.rs` `spawn_metadata_publish()`: Orphaned metadata CIDs (< 1KB JSON payloads) from failed IPNS publish attempts are acceptable for v1. On non-409 publish errors, the function returns early without unpinning the uploaded metadata CID. The debounced publish queue retries with fresh metadata, and Pinata's garbage collection eventually unpins orphaned CIDs. Explicit cleanup on every error path was deferred to v2 as the complexity vs. benefit tradeoff is not favorable for small payloads.
<!-- [add_learning]
For the cipher-box desktop FUSE implementation: The project philosophy accepts small orphaned IPFS CIDs (metadata payloads < 1KB) in error paths when adding explicit cleanup would significantly increase code complexity. Pinata's garbage collection provides the backstop for eventual cleanup.
</learning>
✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR implements Phase 33’s goal of preventing Windows Explorer hangs by moving FilePointer resolution off the WinFsp filesystem callback thread, using tokio-spawned background tasks and draining completions on subsequent callback entry.

Changes:

  • Replaces the blocking FilePointer resolution loop in drain_refresh_completions() with async task spawning + 3-retry exponential backoff, delivering results via an mpsc channel.
  • Wires Windows WinFsp callbacks (open/read/readdir) to drain FilePointer completions; adds a bounded read-while-resolving poll that returns STATUS_DEVICE_NOT_READY after 5s.
  • Updates desktop mount constructors to initialize the new FilePointer channel + dedup state; updates planning/state docs and workspace lockfile.

Reviewed changes

Copilot reviewed 16 out of 17 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
crates/fuse/src/lib.rs Adds PendingFilePointer channel + dedup set; async resolver with retry/backoff; spawns per-FilePointer tasks during refresh; drains completions into inode table.
crates/fuse/src/platform/windows/read_ops.rs Drains FilePointer completions in open/read; polls up to 5s for in-flight resolution before returning STATUS_DEVICE_NOT_READY.
crates/fuse/src/platform/windows/dir_ops.rs Drains FilePointer completions on read_directory entry.
crates/fuse/src/platform/windows/operations.rs Adds status_device_not_ready() helper (0xC00000A3).
apps/desktop/src-tauri/src/fuse/windows/mod.rs Initializes FilePointer channel + resolving set in Windows FS constructor.
apps/desktop/src-tauri/src/fuse/mod.rs Re-exports PendingFilePointer; initializes FilePointer channel + resolving set in mount setup.
Cargo.lock Bumps internal workspace crate versions to 0.4.0.
.planning/phases/33-windows-async-filepointer-resolution/* Adds Phase 33 research/plan/summary/verification/validation artifacts.
.planning/STATE.md / .planning/ROADMAP.md / .planning/PROJECT.md Marks Phase 33 complete and updates project state metadata.

Comment thread crates/fuse/src/lib.rs Outdated
Comment thread crates/fuse/src/lib.rs Outdated
FSM1 and others added 2 commits March 29, 2026 00:01
- Lower FilePointer resolution success log from info to debug to avoid
  log spam in folders with many files (Copilot review thread #2)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Phase 32 (main) introduced shared async FilePointer infrastructure with
different naming (filepointer_rx/tx, drain_filepointer_completions, enum
PendingFilePointer) and additional features (concurrency cap, timeout).
Phase 33's Windows-specific code adapted to use Phase 32's conventions.

Resolved:
- PendingFilePointer: keep Phase 32's enum (Success/Failure variants)
- Field names: filepointer_rx/tx (Phase 32) not file_pointer_rx/tx (Phase 33)
- Drain function: drain_filepointer_completions (Phase 32 naming)
- Spawn logic: Phase 32's concurrency cap + timeout + better error handling
- Removed Phase 33's resolve_single_file_pointer (superseded by Phase 32 inline)
- Planning docs: merged both phases' completion status

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 12 out of 13 changed files in this pull request and generated 1 comment.

Comment thread crates/fuse/src/platform/windows/read_ops.rs
@codecov

codecov Bot commented Mar 28, 2026

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 71.17%. Comparing base (8cddb05) to head (7fdfc11).
⚠️ Report is 1 commits behind head on main.

Additional details and impacted files

Impacted file tree graph

@@             Coverage Diff             @@
##             main     #389       +/-   ##
===========================================
+ Coverage   61.15%   71.17%   +10.01%     
===========================================
  Files         132      111       -21     
  Lines        9750     6783     -2967     
  Branches      976      976               
===========================================
- Hits         5963     4828     -1135     
+ Misses       3573     1741     -1832     
  Partials      214      214               
Flag Coverage Δ
desktop 14.95% <ø> (-16.27%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.
see 34 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

If async resolution fails quickly, drain_filepointer_completions removes
the ino from resolving_file_pointers but the poll loop previously kept
sleeping until the 5s timeout. Now breaks immediately when the ino is no
longer in-flight, falling through to the existing empty-cid handler.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@FSM1

FSM1 commented Mar 29, 2026

Copy link
Copy Markdown
Owner Author

@coderabbitai please review

@coderabbitai

coderabbitai Bot commented Mar 29, 2026

Copy link
Copy Markdown

@FSM1, kicking off a review of PR #389 now!

One long-term memory housekeeping note: learning 671b7989 (which flagged the synchronous drain_refresh_completions() loop as intentional pre-existing design and asked not to flag it in future PRs) is now superseded by this PR itself — the async restructuring it deferred has been implemented here. Removing it to avoid false suppression in any future PRs that touch this area.


⛔ Learnings removed
Learnt from: FSM1
Repo: FSM1/cipher-box PR: 352
File: crates/fuse/src/lib.rs:617-642
Timestamp: 2026-03-24T19:11:36.150Z
Learning: In `crates/fuse/src/lib.rs` (previously `apps/desktop/src-tauri/src/fuse/mod.rs`) `drain_refresh_completions()`: The synchronous file pointer resolution loop (calling `resolve_ipns`/`fetch_content` via `block_with_timeout`) inside `drain_refresh_completions()` is intentional pre-existing design. It runs during `readdir()`, which is already a blocking FUSE callback. Each resolution is bounded by `NETWORK_TIMEOUT` (currently 10s in cipherbox-fuse, 3s in the old operations.rs) and only triggers for newly-discovered FilePointer inodes. Restructuring this to use a background channel/async path would require redesigning the entire FUSE callback model and is deferred. Do not flag this as a blocking issue in future PRs.

🧠 Learnings used
Learnt from: FSM1
Repo: FSM1/cipher-box PR: 352
File: crates/fuse/src/lib.rs:617-642
Timestamp: 2026-03-24T19:11:36.150Z
Learning: In `crates/fuse/src/lib.rs` (previously `apps/desktop/src-tauri/src/fuse/mod.rs`) `drain_refresh_completions()`: The synchronous file pointer resolution loop (calling `resolve_ipns`/`fetch_content` via `block_with_timeout`) inside `drain_refresh_completions()` is intentional pre-existing design. It runs during `readdir()`, which is already a blocking FUSE callback. Each resolution is bounded by `NETWORK_TIMEOUT` (currently 10s in cipherbox-fuse, 3s in the old operations.rs) and only triggers for newly-discovered FilePointer inodes. Restructuring this to use a background channel/async path would require redesigning the entire FUSE callback model and is deferred. Do not flag this as a blocking issue in future PRs.

Learnt from: FSM1
Repo: FSM1/cipher-box PR: 352
File: crates/fuse/src/lib.rs:235-243
Timestamp: 2026-03-24T19:11:15.856Z
Learning: In `crates/fuse/src/lib.rs` `spawn_metadata_publish()` and `build_folder_metadata()`: The `folder_key` and `ipns_private_key` parameters are plain `Vec<u8>` rather than `Zeroizing<Vec<u8>>`. This is intentional for v1 (Phase 23 PR `#352`) — the copies are short-lived (<2s) on a background thread, and the primary zeroization defense is on the source `InodeKind` fields. Wrapping publish-path copies in `Zeroizing` would touch 15+ call sites across all platforms and is deferred to a follow-up PR. Do not re-flag this as a regression.

Learnt from: FSM1
Repo: FSM1/cipher-box PR: 253
File: tests/e2e-desktop/scripts/test-conflict-detection.ps1:156-157
Timestamp: 2026-03-03T14:55:05.884Z
Learning: In `tests/e2e-desktop/scripts/*.{sh,ps1}` (FUSE E2E tests): Fixed sleep-based waits after file mutations are an inherent limitation of FUSE-mount testing. The FUSE daemon provides no IPC signal or status endpoint to indicate when a debounced IPNS metadata publish completes. Tests use conservative fixed waits (e.g., 8s for initial publish accounting for 1.5s debounce + 10s safety valve, 15s for conflict resolution) and can be increased if CI proves flaky. Eliminating sleep-based waits would require exposing an IPC readiness signal from the FUSE daemon.

Learnt from: FSM1
Repo: FSM1/cipher-box PR: 253
File: apps/desktop/src-tauri/src/fuse/mod.rs:359-359
Timestamp: 2026-03-03T15:26:30.522Z
Learning: In `apps/desktop/src-tauri/src/fuse/mod.rs` `spawn_metadata_publish()`: Orphaned metadata CIDs (< 1KB JSON payloads) from failed IPNS publish attempts are acceptable for v1. On non-409 publish errors, the function returns early without unpinning the uploaded metadata CID. The debounced publish queue retries with fresh metadata, and Pinata's garbage collection eventually unpins orphaned CIDs. Explicit cleanup on every error path was deferred to v2 as the complexity vs. benefit tradeoff is not favorable for small payloads.
<!-- [add_learning]
For the cipher-box desktop FUSE implementation: The project philosophy accepts small orphaned IPFS CIDs (metadata payloads < 1KB) in error paths when adding explicit cleanup would significantly increase code complexity. Pinata's garbage collection provides the backstop for eventual cleanup.
</learning>

Learnt from: FSM1
Repo: FSM1/cipher-box PR: 262
File: apps/desktop/src-tauri/src/fuse/mod.rs:544-585
Timestamp: 2026-03-04T03:09:12.729Z
Learning: In `apps/desktop/src-tauri/src/fuse/mod.rs` `spawn_bin_entry_publish()`: The bin IPNS publish conflict handling (line ~574) is intentionally fire-and-forget. On a publish conflict, the new bin entry is dropped from the IPNS record but the deleted file/folder CID remains pinned (no data loss). A retry-with-merge strategy (similar to `spawn_metadata_publish`) is a known enhancement deferred to v2 due to engineering complexity for a background non-blocking path. The web app bin UI is expected to be able to reconstruct state from pinned CIDs if needed.

Learnt from: FSM1
Repo: FSM1/cipher-box PR: 342
File: docker/docker-compose.staging.yml:66-69
Timestamp: 2026-03-23T23:48:45.109Z
Learning: In FSM1/cipher-box, the `ipfs_staging` volume migration from flatfs to pebbleds (introduced in PR `#342`, `IPFS_PROFILE=server,pebbleds`) is intentionally a manual operation — NOT automated in the deploy workflow. The cutover plan is: spin up a new pebbleds Kubo container alongside the existing flatfs one, validate pinned CIDs against the database, re-pin only active CIDs to the new node, then swap. The inline comment in `docker/docker-compose.staging.yml` (around the `IPFS_PROFILE` env var) documents the volume reset requirement for operators. Do not flag the missing `down -v` step in the deploy workflow as a regression.

Learnt from: FSM1
Repo: FSM1/cipher-box PR: 296
File: packages/sdk/src/client.ts:286-360
Timestamp: 2026-03-20T23:50:55.576Z
Learning: In `packages/sdk/src/client.ts` (`CipherBoxClient`): Concurrent mutations to the same folder (e.g., `createFolder`, `renameItem`, `moveItem`, `deleteItem`, `deleteToBin`) are intentionally not serialized with a per-folder mutex in the SDK. Concurrent safety is provided at the UI layer — buttons/actions are disabled while an operation is in flight. A keyed async mutex/queue per folder IPNS name (and a separate one for bin state) is a known deferred improvement and should not be re-flagged as a regression in this or future PRs (first flagged in PR `#296`).

Learnt from: CR
Repo: FSM1/cipher-box PR: 0
File: apps/desktop/CLAUDE.md:0-0
Timestamp: 2026-02-21T01:23:36.268Z
Learning: Cross-platform FUSE architecture must include platform-agnostic data structures (`InodeTable`, `MetadataCache`, `ContentCache`), channel-based async prefetch pattern, debounced publish queue with per-folder coalescing, `PublishCoordinator` for IPNS sequence management, and crypto functions (`encrypt_metadata_to_json()`, `decrypt_metadata_from_ipfs_public()`) with no OS dependencies

Learnt from: CR
Repo: FSM1/cipher-box PR: 0
File: apps/desktop/CLAUDE.md:0-0
Timestamp: 2026-02-21T01:23:36.268Z
Learning: Applies to apps/desktop/src-tauri/vendor/fuser/src/channel.rs : The vendored fuser crate at `src-tauri/vendor/fuser/` must include a critical patch to `channel.rs:receive()` to handle FUSE-T Unix domain socket fragmentation for messages >256KB. Use `recv(MSG_PEEK)` to read the FUSE header length, then loop-read exactly that many bytes

Learnt from: CR
Repo: FSM1/cipher-box PR: 0
File: apps/desktop/CLAUDE.md:0-0
Timestamp: 2026-02-21T01:23:36.268Z
Learning: Applies to apps/desktop/src-tauri/src/fuse/*.rs : ALL FUSE callbacks must run on a single thread. Any blocking call stalls the entire filesystem. Background tasks must communicate via `mpsc` channels, with channels drained in `readdir()`

Learnt from: CR
Repo: FSM1/cipher-box PR: 0
File: apps/desktop/CLAUDE.md:0-0
Timestamp: 2026-02-21T01:23:36.268Z
Learning: Windows FUSE implementation must use WinFSP or Dokan as filesystem driver API (fuser not used). Use file IDs instead of inodes with same stability requirement. Filter platform special files: `desktop.ini`, `Thumbs.db`, `$RECYCLE.BIN`, Zone.Identifier ADS. Use Windows Credential Manager via `keyring` crate

Learnt from: FSM1
Repo: FSM1/cipher-box PR: 249
File: apps/desktop/src-tauri/src/fuse/read_ops.rs:476-501
Timestamp: 2026-03-02T19:49:35.507Z
Learning: In `apps/desktop/src-tauri/src/fuse/read_ops.rs` (macOS FUSE): The `handle_read` function includes a deliberate 3-second poll loop waiting for async prefetched content via the mpsc channel before falling back to EIO. This design choice is specific to the FUSE-T SMB backend to avoid surfacing error dialogs to users on cache misses, as SMB will retry on EIO. The prefetch is already fired in `handle_open`, so the brief blocking wait allows it to complete without direct network I/O in the read callback.

Learnt from: FSM1
Repo: FSM1/cipher-box PR: 249
File: apps/desktop/src-tauri/src/fuse/read_ops.rs:237-249
Timestamp: 2026-03-02T19:49:20.690Z
Learning: In apps/desktop/src-tauri/src/fuse/read_ops.rs (and previously operations.rs): The write-open path in `handle_open` intentionally uses synchronous `fetch_and_decrypt_file_content()` as a fallback after cache-check, because `OpenFileHandle::new_write()` needs existing file content at creation time to support append/modify operations. This is a documented design tradeoff in CLAUDE.md. Prefetch+cache-check happens first; synchronous fallback only triggers on cache miss.
✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@FSM1 FSM1 merged commit b2f6572 into main Mar 29, 2026
26 checks passed
@FSM1 FSM1 deleted the phase-33-windows-async-filepointer-resolution branch March 29, 2026 00:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants