feat: desktop FUSE data-loss bugs and replay hardening#493
Conversation
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
- Add pub use reply::ReplySender so cipherbox-fuse test code can implement a capturing sender - mod reply stays private; only the trait name is surfaced - channel.rs FUSE-T patch untouched Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
- New cfg(all(test, feature = fuse)) test_support module - make_test_fs / make_test_fs_with_keypair build the full CipherBoxFS literal - Per-test isolated journal dir keyed by process id plus monotonic counter - CaptureSender implements fuser::ReplySender; reply_error_code decodes out-header Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
- getattr_returns_ok_for_root and flush_returns_ok prove CaptureSender captures the out-header error code - build_upload_journal_entry round-trip asserts ciphertext is journalled and the file key is ECIES-wrapped - build_mkdir_journal_entry round-trip asserts child and parent IPNS keys are ECIES-wrapped hex - builder tests use a real secp256k1 keypair; no raw keys or plaintext enter the journal Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
|
Warning Review limit reached
More reviews will be available in 32 minutes and 40 seconds. Learn how PR review limits work. Your organization has used up its prepaid credits, and credit purchases are no longer available. Enable the review add-on in the billing tab to keep reviews running — you're only billed for reviews past your plan's rate limits ($0.25/file). ⌛ How to resolve this issue?After more reviews become available, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans include higher PR review limits than trial, open-source, and free plans. In all cases, reviews become available again over time. During sustained high-volume PR review activity, CodeRabbit may temporarily slow when the next review becomes available. Please see our Fair Usage Limits Policy for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (2)
WalkthroughPhase 46 implements four plans for the desktop FUSE subsystem: a test harness built on a vendored ChangesPhase 46: Desktop FUSE Data-Loss & Replay Hardening
Sequence Diagram(s)sequenceDiagram
rect rgba(200, 220, 255, 0.5)
Note over mount_filesystem,fusermount3: Linux stale-mount recovery path
end
participant mount_filesystem
participant recover_stale_mount
participant mountinfo_contains_mountpoint
participant fusermount3
participant create_mount_point_dir
mount_filesystem->>recover_stale_mount: recover_stale_mount(&mount_path)
recover_stale_mount->>mountinfo_contains_mountpoint: parse /proc/self/mountinfo
mountinfo_contains_mountpoint-->>recover_stale_mount: mount found (stale)
recover_stale_mount->>fusermount3: fusermount3 -u <path>
fusermount3-->>recover_stale_mount: error
recover_stale_mount->>fusermount3: fusermount3 -z -u <path>
fusermount3-->>recover_stale_mount: ok
mount_filesystem->>create_mount_point_dir: create_mount_point_dir(&mount_path)
create_mount_point_dir->>create_mount_point_dir: create_dir_all → EEXIST → recover + retry once
create_mount_point_dir-->>mount_filesystem: Ok(())
sequenceDiagram
rect rgba(255, 220, 200, 0.5)
Note over replay_upload_entry,record_failure: Replay hardening paths
end
participant replay_upload_entry
participant resolve_ipns_for_replay
participant resolve_sequence_strict
participant ApiClient
participant record_failure
replay_upload_entry->>replay_upload_entry: file_meta_ipns_name is None?
alt entry has no per-file IPNS name
replay_upload_entry->>record_failure: Err → park entry (legacy guard)
else entry has per-file IPNS name
replay_upload_entry->>resolve_ipns_for_replay: classify outcome
resolve_ipns_for_replay->>resolve_sequence_strict: strict resolve (no cache fallback)
resolve_sequence_strict->>ApiClient: IPNS resolve request
ApiClient-->>resolve_sequence_strict: transient error
resolve_sequence_strict-->>resolve_ipns_for_replay: Err (no cache fallback)
resolve_ipns_for_replay-->>replay_upload_entry: IpnsResolveOutcome::Error
replay_upload_entry->>record_failure: park entry for later retry
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Suggested labels
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Release Preview
|
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #493 +/- ##
==========================================
+ Coverage 62.85% 66.83% +3.98%
==========================================
Files 139 155 +16
Lines 10359 16843 +6484
Branches 1125 1127 +2
==========================================
+ Hits 6511 11257 +4746
- Misses 3612 5350 +1738
Partials 236 236
Flags with carried forward coverage won't be shown. Click here to find out more.
🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
apps/desktop/src-tauri/src/fuse/mod.rs (1)
101-124:⚠️ Potential issue | 🟠 Major | ⚡ Quick winRecovered
!existspath skips stale mount-directory cleanupAfter Linux recovery/retry succeeds from the
!mount_path.exists()branch, stale mountpoint contents are not cleaned because cleanup only exists in theelsebranch. This leaves crash residue on the recover path.Proposed fix
- if !mount_path.exists() { + if !mount_path.exists() { // On Linux this recovers once from a stale FUSE mount whose dirent still // exists (surfaces as EEXIST even though `exists()` returned false); on // other platforms it is a plain `create_dir_all`. #[cfg(target_os = "linux")] let create_result = cipherbox_fuse::platform::linux::create_mount_point_dir(&mount_path); #[cfg(not(target_os = "linux"))] let create_result = std::fs::create_dir_all(&mount_path); create_result.map_err(|e| format!("Failed to create mount point: {}", e))?; #[cfg(unix)] { use std::os::unix::fs::PermissionsExt; let _ = std::fs::set_permissions(&mount_path, std::fs::Permissions::from_mode(0o700)); } - } else { - if let Ok(entries) = std::fs::read_dir(&mount_path) { - for entry in entries.flatten() { - let path = entry.path(); - if path.is_dir() { let _ = std::fs::remove_dir_all(&path); } - else { let _ = std::fs::remove_file(&path); } - } - log::info!("Cleaned stale mount point: {}", mount_path.display()); - } } + + if let Ok(entries) = std::fs::read_dir(&mount_path) { + for entry in entries.flatten() { + let path = entry.path(); + if path.is_dir() { let _ = std::fs::remove_dir_all(&path); } + else { let _ = std::fs::remove_file(&path); } + } + log::info!("Cleaned stale mount point: {}", mount_path.display()); + }As per coding guidelines, “App should clean stale mount contents before mounting FUSE vault at
~/CipherBoxto handle.DS_Storefiles from previous crashes”.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@apps/desktop/src-tauri/src/fuse/mod.rs` around lines 101 - 124, The stale mount point cleanup code that removes old entries is only executed in the else branch when mount_path exists, but it should also run after successfully creating the mount point directory in the !mount_path.exists() branch to handle crash residue on the Linux recovery path. After the create_result.map_err call succeeds in the !mount_path.exists() block, add the same cleanup logic that currently exists in the else branch (the read_dir loop that removes stale files and directories, followed by the log message "Cleaned stale mount point") to ensure stale mount contents are cleaned regardless of which path is taken.Source: Coding guidelines
🧹 Nitpick comments (1)
crates/fuse/src/lib.rs (1)
2982-2985: ⚡ Quick winAvoid fixed sleep in this async durability test to reduce flakiness.
Line 2984 uses a fixed 200ms sleep before draining completions; slower CI runners can miss the window. Prefer bounded polling until the expected condition or timeout.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/fuse/src/lib.rs` around lines 2982 - 2985, Replace the fixed std::thread::sleep call with bounded polling to eliminate test flakiness. Instead of sleeping for a fixed 200 milliseconds before calling fs.drain_upload_completions(), implement a loop that repeatedly checks for the expected condition (that the entry has been retained and is ready to be drained) with a reasonable interval between checks, continuing until either the condition is met or a timeout is reached. This approach is more robust across CI environments with varying performance characteristics.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@crates/fuse/src/lib.rs`:
- Around line 305-321: The resolve_sequence_strict method uses unwrap_or(0) when
parsing the sequence_number, which silently treats malformed payloads as 0 and
returns success instead of failing. In strict replay mode, this must fail
closed. Replace the unwrap_or(0) call on the parse::<u64>() operation with
proper error handling that converts any parse failure into a Result error and
propagates it up, ensuring that invalid sequence numbers are rejected rather
than silently converted to 0.
---
Outside diff comments:
In `@apps/desktop/src-tauri/src/fuse/mod.rs`:
- Around line 101-124: The stale mount point cleanup code that removes old
entries is only executed in the else branch when mount_path exists, but it
should also run after successfully creating the mount point directory in the
!mount_path.exists() branch to handle crash residue on the Linux recovery path.
After the create_result.map_err call succeeds in the !mount_path.exists() block,
add the same cleanup logic that currently exists in the else branch (the
read_dir loop that removes stale files and directories, followed by the log
message "Cleaned stale mount point") to ensure stale mount contents are cleaned
regardless of which path is taken.
---
Nitpick comments:
In `@crates/fuse/src/lib.rs`:
- Around line 2982-2985: Replace the fixed std::thread::sleep call with bounded
polling to eliminate test flakiness. Instead of sleeping for a fixed 200
milliseconds before calling fs.drain_upload_completions(), implement a loop that
repeatedly checks for the expected condition (that the entry has been retained
and is ready to be drained) with a reasonable interval between checks,
continuing until either the condition is met or a timeout is reached. This
approach is more robust across CI environments with varying performance
characteristics.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 0e76f919-9a41-4b2c-9d0e-efa19c3086b3
📒 Files selected for processing (20)
.planning/ROADMAP.md.planning/STATE.md.planning/phases/46-desktop-fuse-data-loss-bugs-replay-hardening/46-01-PLAN.md.planning/phases/46-desktop-fuse-data-loss-bugs-replay-hardening/46-01-SUMMARY.md.planning/phases/46-desktop-fuse-data-loss-bugs-replay-hardening/46-02-PLAN.md.planning/phases/46-desktop-fuse-data-loss-bugs-replay-hardening/46-02-SUMMARY.md.planning/phases/46-desktop-fuse-data-loss-bugs-replay-hardening/46-03-PLAN.md.planning/phases/46-desktop-fuse-data-loss-bugs-replay-hardening/46-03-SUMMARY.md.planning/phases/46-desktop-fuse-data-loss-bugs-replay-hardening/46-04-PLAN.md.planning/phases/46-desktop-fuse-data-loss-bugs-replay-hardening/46-04-SUMMARY.md.planning/phases/46-desktop-fuse-data-loss-bugs-replay-hardening/46-RESEARCH.md.planning/phases/46-desktop-fuse-data-loss-bugs-replay-hardening/46-SECURITY.md.planning/phases/46-desktop-fuse-data-loss-bugs-replay-hardening/46-VALIDATION.md.planning/phases/46-desktop-fuse-data-loss-bugs-replay-hardening/46-VERIFICATION.mdapps/desktop/src-tauri/src/fuse/mod.rsapps/desktop/src-tauri/vendor/fuser/src/lib_impl.rscrates/fuse/src/journal_helpers.rscrates/fuse/src/lib.rscrates/fuse/src/platform/linux.rscrates/fuse/src/test_support.rs
Mirror the Linux recover_stale_mount on macOS. A crashed FUSE-T session can leave ~/CipherBox as a stale smbfs mount backed by a dead in-process server, blocking remount until force-unmounted. Detect via mount(8) and clear via umount, then diskutil unmount force as a fallback. Best-effort and run before the exists()/create_dir_all decision, matching Linux. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
|
Addressed the outside-diff-range finding from the CodeRabbit review in |
|
Also addressed the nitpick from the same review (fixed |
Phase 46 — Desktop FUSE data-loss bugs + replay hardening
Closes the desktop FUSE write-durability work deferred by Phase 45: three data-loss/durability fixes, two replay-path hardening follow-ups from PR #491, the remaining Rust test coverage (Phase 45 Tier 2), plus cross-platform parity for stale-mount recovery (macOS). Behavior-changing correctness/durability work.
Changes
Stale-mount recovery (Linux + macOS)
stat()returns ENOTCONN (soexists()lies) andcreate_dir_allthen fails with EEXIST. Detect authoritatively via/proc/self/mountinfoand clear viafusermount3 -u(then lazy-z -u); plus acreate_mount_point_dirEEXIST recover-then-retry belt-and-suspenders.~/CipherBoxas a stalesmbfsmount backed by a dead in-process server. Detect viamount(8)and clear viaumount, thendiskutil unmount force. Windows (WinFsp) self-heals — its FSD evicts the volume on host death — so no recovery is needed there.Replay-path hardening (PR #491 follow-ups)
resolve_sequence_strictnow returnsErron an unparseablesequence_numberinstead ofunwrap_or(0), so a bad payload retains the entry (addresses CodeRabbit review).Durability characterization + Tier-2 tests
release()durability (behavior already correct from PR feat(fuse): durable write journal with crash-recovery replay #487 / PR refactor(fuse): desktop FUSE write-durability cleanup #491).ReplySenderlimitation) plusjournal_helpersbuilder tests.Verification
cargo test -p cipherbox-fuse --features fuse→ 60 passed;cargo clippy --features fuse --testsclean;cargo checkfor the fuse + desktop crates clean.46-SECURITY.md).46-VERIFICATION.md).Crash-recovery validation (both platforms confirmed)
umountreturnsResource busy, and confirmedrecover_stale_mountcleared it via thediskutil unmount forcefallback. Themount(8)parser is also unit-tested against thesmbfsline format.kill -9→ restart → clean auto-recovery), validating the ENOTCONN /exists()-lies / EEXIST signature andfusermount3recovery the design hinges on.Notes
crates/fuse,apps/desktop); disjoint from Phase 47 (PR refactor: SDK folder-state and publish-path consolidation #494)..planning/STATE.md; expect a trivial conflict when landing the second.🤖 Generated with Claude Code