feat(fuse): durable write journal with crash-recovery replay#487
Conversation
Entire-Checkpoint: ae7bbe1126cc
Entire-Checkpoint: 8f94afd7fcd8
Entire-Checkpoint: f3c45f8a62fe
Entire-Checkpoint: d5bddac316f1
Entire-Checkpoint: 9cbc6d7a27e3
Finish the interrupted planner run: add plans 42-04..42-08 (live migration gate, controller wiring + api:generate, BullMQ outbox drain + drift report, non-BYO backfill script, Grafana cross-user-attempt alert). Correct stale cross-plan number references in 42-01/42-03 and finalize the ROADMAP entry. Covers all 13 CONTEXT decisions (D-01..D-13) and both requirement IDs across the 8-plan set; every new plan carries a STRIDE threat_model. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> Entire-Checkpoint: 951500ea043f
Entire-Checkpoint: 0a2deed2f95d
…e registration - PendingUnpin entity with unique cid index, no userId per D-05 outbox design - Migration 1749000000000 creates pending_unpins table with idx_pending_unpins_cid - Migration 1749100000000 adds idx_pinned_cids_cid to pinned_cids for refcount queries - PendingUnpin registered in app.module entities array per DATABASE_EVOLUTION_PROTOCOL §4.2
- Assert deleteFile calls unpinFromIpfs, removeUsage, then fetchQuota in order - Assert fetchQuota rejection does not reject deleteFile and logs a warning - Assert both removeUsage and fetchQuota are invoked before resolve
…eleteFile - Fire fetchQuota() as fire-and-forget after removeUsage (D-12) - Rejection is swallowed via .catch(logger.warn) to keep delete path non-blocking - No new imports; logger already imported on line 3
- RED/GREEN TDD gates verified, 3 specs passing - One-line fire-and-forget fetchQuota after removeUsage (D-12)
…vice - unpinCrossUserAttempts counter for D-02 cross-user unpin audit - driftOrphanedPinsTotal counter for D-06 Kubo drift report - pendingUnpinsGauge for D-05 outbox depth monitoring
Entire-Checkpoint: db0759160182
- Applied AddPendingUnpins1749000000000 and AddPinnedCidCidIndex1749100000000 - Both migrations confirmed [X] in migration:show (entries 18 and 19) - pending_unpins table + idx_pending_unpins_cid + idx_pinned_cids_cid verified via to_regclass() - No source files modified; halted at checkpoint:human-verify for DB confirmation
- Provisions CipherBox Security rule group alert on cipherbox_unpin_cross_user_attempts_total - Fires on any non-zero rate(counter[5m]) — gt 0 threshold (D-10, D-02) - Uses GRAFANA_ALERTS_FOLDER_UID and GRAFANA_CLOUD_DATASOURCE_UID placeholder UIDs - noDataState/execErrState OK; for 5m; severity warning, operation unpin-security
…rovider mocks - six guardedUnpin cases: no-row-unknown, no-row-cross-user, refcount>0, refcount===0 success, refcount===0 Kubo-fail, advisory-lock-ordering - DataSource mock with transaction callback invoking manager mock - manager.getRepository returns per-entity mocks for PinnedCid and PendingUnpin - IPFS_PROVIDER mock and MetricsService mock injected into TestingModule
- SUMMARY.md for CipherBox Security alert rule provisioning
…nd refcount outbox - advisory xact lock as first statement via manager.query pg_advisory_xact_lock (D-04) - ownership check with cross-user detection, logger.warn, and metric inc (D-01, D-02) - in-transaction row delete as quota decrement (D-03) - refcount gate with orIgnore outbox insert when count hits zero (D-05) - post-commit best-effort Kubo unpin with outbox row cleanup on success (D-03 ordering) - Kubo failure leaves outbox row for BullMQ retry worker without rejecting the request - vault.module.ts adds PendingUnpin to forFeature and provides IPFS_PROVIDER locally without importing IpfsModule to avoid circular dependency
Entire-Checkpoint: c6c34f9e6764
…sation - Expand unpin describe to pass req with user.id; assert guardedUnpin called and ipfsProvider.unpinFile NOT called - Add Test 4: compensation routes through guardedUnpin on recordPin failure, not raw unpinFile - Add Test 5: compensation is best-effort; guardedUnpin rejection swallowed, original recordPin error rethrown - Add Test 3: upload happy path does not call guardedUnpin or unpinFile
RED phase: selectRowsToDelete D-09 predicate (BYO exclusion, CID presence check) and parseKuboPinLs NDJSON parser. Five behaviors per the plan spec. No implementation yet. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
- Seven cases: drain success, drain not-pinned=success, drain failure-leaves-row - Gauge publish after drain pass - Drift orphan detected increments counter without deleting - Drift all-accounted does not increment counter - Dispatch routing to drain/drift/no-op for unknown jobs - RED: fails because PendingUnpinProcessor does not exist yet
… guardedUnpin - Replace ipfsProvider.unpinFile in unpin() with vaultService.guardedUnpin(req.user.id, dto.cid) - Add @request() req to unpin() signature; remove controller-level fileUnpins.inc (now inside guardedUnpin per 42-03) - Replace ipfsProvider.unpinFile in upload compensation with vaultService.guardedUnpin; add D-13 race window comment - Regenerate api-client via pnpm api:generate; openapi.json formatting-only diff confirms D-11 no schema change
Release Preview
Cascade Details
|
Mark phase 43 (fuse-write-durability) as shipped in STATE.md. Verification passed (18/18 automated + human UAT on all 3 platforms); PR #487 opened. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Entire-Checkpoint: d7898f1e63e5
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #487 +/- ##
==========================================
- Coverage 62.57% 61.94% -0.63%
==========================================
Files 137 151 +14
Lines 10276 15241 +4965
Branches 1108 1108
==========================================
+ Hits 6430 9441 +3011
- Misses 3612 5566 +1954
Partials 234 234
Flags with carried forward coverage won't be shown. Click here to find out more.
🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
apps/desktop/src-tauri/src/commands/sync.rs (1)
28-36:⚠️ Potential issue | 🟠 Major | ⚡ Quick winPrevent duplicate sync-daemon spawns (idempotency gap).
Line 28 spawns unconditionally, and Lines 33-36 replace
sync_triggerwithout shutting down or reusing an existing daemon. Re-entering auth/setup or calling the IPC command again can leave multiple daemon loops running concurrently (duplicate polling/status updates/notifications).Suggested fix
pub fn spawn_sync_daemon(app: tauri::AppHandle, state: &AppState) -> Result<(), String> { + // Idempotent start: if already started, do nothing. + if let Ok(guard) = state.sync_trigger.read() { + if guard.is_some() { + log::debug!("Sync daemon already running; skipping spawn"); + return Ok(()); + } + } + log::info!("Starting background sync daemon");Also applies to: 51-60
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@apps/desktop/src-tauri/src/commands/sync.rs` around lines 28 - 36, The spawn_sync_daemon function unconditionally spawns a new daemon without checking if one is already running, which can result in multiple concurrent daemon instances. Before creating a new channel and spawning the daemon, check if state.sync_trigger already contains an existing sender. If it does, either return early since a daemon is already active, or first shut down the existing daemon by sending a termination signal before creating and storing the new sender. This ensures idempotency and prevents duplicate polling and status updates.
🧹 Nitpick comments (2)
crates/fuse/src/lib.rs (1)
1038-1058: 💤 Low valueMinor:
depthvariable counts nodes visited, not tree depth.The variable
depthis incremented per node popped from the queue (line 1058), so it actually counts total nodes visited rather than tree depth. For BFS, this means a wide tree with many folders at level 1 could exhaust the cap before reaching a deeper nested folder.The cap still effectively bounds network round trips (32 resolve+fetch operations), so it's functionally correct for its purpose. The comment at line 1039 ("limits network round trips") is accurate, but the variable name
depthis slightly misleading.This is minor and doesn't affect correctness for typical folder hierarchies.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/fuse/src/lib.rs` around lines 1038 - 1058, The variable `depth` in the BFS loop is misleading because it counts nodes visited (incremented per queue.pop_front() operation), not actual tree depth. This means a wide tree with many folders at the same level could hit the cap before reaching deeper nested folders. Rename the `depth` variable to a more accurate name like `nodes_visited` or `operations_count` throughout the function to better reflect that it's tracking the number of BFS operations/nodes processed rather than tree depth. Update the variable name in its initialization near the queue setup, the comparison in the if statement checking MAX_RESOLVE_DEPTH, and the increment operation.crates/sdk/src/queue.rs (1)
434-453: 💤 Low valueTest
journal_no_plaintextdoesn't actually test the probe string.The
plaintext_probevariable is defined at line 437 but never inserted into the test entry. The assertion at lines 441-444 will always pass trivially since the probe was never part of the data. The test still validates that certain forbidden keys ("plaintext","parent_ino") don't appear, which is useful, but the probe-based assertion is effectively a no-op.Consider either removing the unused probe or constructing a test entry that would contain plaintext if the schema were wrong.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/sdk/src/queue.rs` around lines 434 - 453, The plaintext_probe variable defined in the journal_no_plaintext test is never actually inserted into the test entry, making the assertion that checks for its presence a no-op that will always pass trivially. Remove the unused plaintext_probe variable and its corresponding assertion (the first assert block checking !json_str.contains(plaintext_probe)), since the test still effectively validates that forbidden keys like "plaintext" and "parent_ino" are absent from the journal. If you want to make the probe check meaningful, you would need to modify how make_upload_entry is called or constructed to include plaintext that should not leak, but removing the ineffective assertion is the simpler fix.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@apps/desktop/src-tauri/src/tray/status.rs`:
- Around line 47-50: The is_connected method in the TrayStatus implementation
currently treats WriteParked as a disconnected state by not including it in the
matches pattern. Add TrayStatus::WriteParked to the matches pattern in the
is_connected method so it returns true for WriteParked state, allowing the user
to access recovery actions. Additionally, update the menu gating logic in
apps/desktop/src-tauri/src/tray/mod.rs at lines 76-77 to ensure that recovery
actions (open, sync, logout) remain enabled when the status is WriteParked, even
though it was previously treated as disconnected.
In `@crates/fuse/src/lib.rs`:
- Around line 1394-1404: The `resolve_sequence` function currently converts all
errors to `String` format, forcing the caller to use fragile string matching
with `to_lowercase().contains("not found")`. Refactor the `resolve_sequence`
function to return `Result<u64, ApiError>` instead of `Result<u64, String>`,
ensuring errors are returned as the typed `ApiError` enum rather than being
converted to strings. Update the error handling at the call site in the match
statement to use pattern matching on `ApiError::IpnsNotFound` instead of
substring checking, replacing the string matching logic with direct pattern
matching against the specific error variant.
---
Outside diff comments:
In `@apps/desktop/src-tauri/src/commands/sync.rs`:
- Around line 28-36: The spawn_sync_daemon function unconditionally spawns a new
daemon without checking if one is already running, which can result in multiple
concurrent daemon instances. Before creating a new channel and spawning the
daemon, check if state.sync_trigger already contains an existing sender. If it
does, either return early since a daemon is already active, or first shut down
the existing daemon by sending a termination signal before creating and storing
the new sender. This ensures idempotency and prevents duplicate polling and
status updates.
---
Nitpick comments:
In `@crates/fuse/src/lib.rs`:
- Around line 1038-1058: The variable `depth` in the BFS loop is misleading
because it counts nodes visited (incremented per queue.pop_front() operation),
not actual tree depth. This means a wide tree with many folders at the same
level could hit the cap before reaching deeper nested folders. Rename the
`depth` variable to a more accurate name like `nodes_visited` or
`operations_count` throughout the function to better reflect that it's tracking
the number of BFS operations/nodes processed rather than tree depth. Update the
variable name in its initialization near the queue setup, the comparison in the
if statement checking MAX_RESOLVE_DEPTH, and the increment operation.
In `@crates/sdk/src/queue.rs`:
- Around line 434-453: The plaintext_probe variable defined in the
journal_no_plaintext test is never actually inserted into the test entry, making
the assertion that checks for its presence a no-op that will always pass
trivially. Remove the unused plaintext_probe variable and its corresponding
assertion (the first assert block checking !json_str.contains(plaintext_probe)),
since the test still effectively validates that forbidden keys like "plaintext"
and "parent_ino" are absent from the journal. If you want to make the probe
check meaningful, you would need to modify how make_upload_entry is called or
constructed to include plaintext that should not leak, but removing the
ineffective assertion is the simpler fix.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 952ce003-d38c-4239-b1f0-3e54cdd3a241
⛔ Files ignored due to path filters (1)
Cargo.lockis excluded by!**/*.lock
📒 Files selected for processing (51)
.planning/ROADMAP.md.planning/STATE.md.planning/phases/43-fuse-write-durability/43-01-PLAN.md.planning/phases/43-fuse-write-durability/43-01-SUMMARY.md.planning/phases/43-fuse-write-durability/43-02-PLAN.md.planning/phases/43-fuse-write-durability/43-02-SUMMARY.md.planning/phases/43-fuse-write-durability/43-03-PLAN.md.planning/phases/43-fuse-write-durability/43-03-SUMMARY.md.planning/phases/43-fuse-write-durability/43-04-PLAN.md.planning/phases/43-fuse-write-durability/43-04-SUMMARY.md.planning/phases/43-fuse-write-durability/43-05-PLAN.md.planning/phases/43-fuse-write-durability/43-05-SUMMARY.md.planning/phases/43-fuse-write-durability/43-06-PLAN.md.planning/phases/43-fuse-write-durability/43-06-SUMMARY.md.planning/phases/43-fuse-write-durability/43-07-PLAN.md.planning/phases/43-fuse-write-durability/43-07-SUMMARY.md.planning/phases/43-fuse-write-durability/43-08-PLAN.md.planning/phases/43-fuse-write-durability/43-08-SUMMARY.md.planning/phases/43-fuse-write-durability/43-CONTEXT.md.planning/phases/43-fuse-write-durability/43-DISCUSSION-LOG.md.planning/phases/43-fuse-write-durability/43-RESEARCH.md.planning/phases/43-fuse-write-durability/43-REVIEW.md.planning/phases/43-fuse-write-durability/43-UAT.md.planning/phases/43-fuse-write-durability/43-VALIDATION.md.planning/phases/43-fuse-write-durability/43-VERIFICATION.md.planning/todos/pending/2026-06-14-consolidate-fuser-and-winfsp-journal-write-paths.md.planning/todos/pending/2026-06-14-extract-shared-journal-dir-and-retries-helper.md.planning/todos/pending/2026-06-14-memoize-resolve-folder-key-during-replay.md.planning/todos/pending/2026-06-14-recover-stale-fuse-mount-after-crash-on-linux-startup.md.planning/todos/pending/2026-06-14-replace-empty-string-journal-key-sentinel-with-option.md.planning/todos/pending/2026-06-14-replace-not-found-string-match-with-typed-error.md.planning/todos/pending/2026-06-14-reuse-publish-file-metadata-and-cas-publish-in-replay.mdapps/api/.env.exampleapps/desktop/src-tauri/src/commands/auth.rsapps/desktop/src-tauri/src/commands/sync.rsapps/desktop/src-tauri/src/fuse/mod.rsapps/desktop/src-tauri/src/fuse/windows/mod.rsapps/desktop/src-tauri/src/sync/mod.rsapps/desktop/src-tauri/src/tray/mod.rsapps/desktop/src-tauri/src/tray/status.rscrates/fuse/Cargo.tomlcrates/fuse/src/lib.rscrates/fuse/src/platform/windows/write_ops.rscrates/fuse/src/read_ops.rscrates/fuse/src/write_ops.rscrates/sdk/src/client.rscrates/sdk/src/lib.rscrates/sdk/src/queue.rscrates/sdk/src/state.rscrates/sdk/src/sync.rsrelease-please-config.json
Pull the SyncStatus -> TrayStatus mapping and the D-10 park-notification edge-trigger decision out of the create_sync_daemon closure into a pure bridge_status() returning a TrayBridgeOutcome (tray status, watermark, whether to notify). The closure now just applies the decision and performs the OS side effects. Behavior is unchanged (load+store is equivalent to the previous swap on the serially-invoked callback). Adds 8 unit tests covering the subtle anti-spam logic that previously had zero coverage: notify once per newly-parked failure, stay silent at the same level, re-notify only when the count rises, re-arm after the journal clears, and critically that Syncing preserves the watermark (resetting it would re-arm the toast every poll — the exact regression commit 5b6455c fixed). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Entire-Checkpoint: be30a7d8f988
The cargo-linux job already runs `cargo llvm-cov --workspace`, so desktop-lcov.info already contains coverage for all crates/** (incl. the cipherbox-fuse and cipherbox-sdk unit tests). It was only uploaded under the `desktop` flag, whose path filter (apps/desktop/src-tauri/src/**) discards every crate line — so the journal/replay/queue logic and its tests were invisible to codecov. Add a `rust` flag (paths: crates/**) and re-upload the same lcov under it, plus an informational project status. No new test run; this just stops dropping coverage we already collect. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Entire-Checkpoint: 9f3624db943f
Addresses a CodeRabbit review on PR #487. is_connected() gates the tray Logout item; treating WriteParked as disconnected locked an authenticated user with parked uploads out of Logout. Include WriteParked in is_connected() and update the unit test. Open/Sync stay gated by is_mounted/is_syncable (unchanged). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Entire-Checkpoint: 9396fb86eeb7
Addresses a CodeRabbit review on PR #487. spawn_sync_daemon overwrote state.sync_trigger and tokio::spawn'd unconditionally, so calling it twice (re-auth, or the start_sync_daemon IPC command) left two daemon loops polling — doubling every tray status update and WriteParked notification, which undermines the D-10 anti-spam work in this phase. Add an idempotency guard: if a daemon is already running, skip. The daemon reads auth/vault state from the shared KeyState each poll, so a single long-lived daemon adapts across re-auth without a restart. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Entire-Checkpoint: cee7722d6b23
Two CodeRabbit nitpicks on PR #487: - crates/fuse/src/lib.rs: rename the BFS counter `depth`/`MAX_RESOLVE_DEPTH` to `nodes_visited`/`MAX_RESOLVE_NODES`. It counts nodes visited, not tree depth; the cap bounds total network round trips. No behavior change. - crates/sdk/src/queue.rs: the journal_no_plaintext test defined a plaintext probe that was never inserted, making that assertion a no-op. Replace it with a meaningful check that the raw key bytes (b"wrappedkey") never appear — verifying keys are stored hex-encoded, matching the test's "no raw key bytes" intent. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Entire-Checkpoint: d3f60ddea1d6
Addressed CodeRabbit review (#pullrequestreview-4492966857)All three out-of-diff / nitpick comments handled: Outside-diff range (🟠 Major) — duplicate sync-daemon spawns → fixed in 29be9d1. Nitpick — Nitpick — Verified: |
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (1)
codecov.yml (1)
65-72: ⚡ Quick winAdd
rustto the default aggregate flagsYou added a dedicated
rustflag/status, butcoverage.status.project.default.flagsstill excludes it. Includerustthere so the aggregate project signal reflects crates coverage too.Proposed fix
default: target: auto threshold: 6% flags: - api - crypto - core - api-client - sdk-core - sdk + - rustBased on learnings, this repo’s aggregate Codecov check should include all relevant flags while retaining per-component statuses.
Also applies to: 108-111
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@codecov.yml` around lines 65 - 72, The rust flag has been added to the crates configuration section, but it is not included in the coverage.status.project.default.flags list, which means the aggregate project coverage signal does not include rust crate coverage. Add rust to the default.flags array in the coverage.status.project section to ensure the aggregate codecov check reflects coverage from all relevant components including the rust crates.Source: Learnings
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In @.github/workflows/ci.yml:
- Around line 670-676: The codecov/codecov-action references in the workflow are
using an unpinned version tag (`@v7`) which violates immutable action policy. Pin
all three instances of codecov/codecov-action@v7 (at lines 335, 660, and 671) to
a specific commit SHA instead of the version tag. Replace `@v7` with
@[full-commit-sha] for each occurrence, ensuring all three use the same SHA
value for consistency and to reduce supply-chain risk.
In `@apps/desktop/src-tauri/src/commands/sync.rs`:
- Around line 33-38: The current implementation has a race condition where the
check for whether the sync daemon is already running uses a read lock, allowing
multiple callers to pass the check before any can set the value, potentially
spawning multiple daemons. Additionally, poisoned lock errors are being silently
ignored. Replace the read lock check with a single write lock section that
atomically checks if the sync daemon sender exists and sets it in one operation;
if the write lock acquisition fails (including poisoned lock cases), return an
error rather than ignoring it. This ensures only one daemon is spawned across
concurrent calls to the sync command function.
---
Nitpick comments:
In `@codecov.yml`:
- Around line 65-72: The rust flag has been added to the crates configuration
section, but it is not included in the coverage.status.project.default.flags
list, which means the aggregate project coverage signal does not include rust
crate coverage. Add rust to the default.flags array in the
coverage.status.project section to ensure the aggregate codecov check reflects
coverage from all relevant components including the rust crates.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 26757bdb-79a4-4e2f-bfa5-6ab192489873
📒 Files selected for processing (7)
.github/workflows/ci.ymlapps/desktop/src-tauri/src/commands/sync.rsapps/desktop/src-tauri/src/sync/mod.rsapps/desktop/src-tauri/src/tray/status.rscodecov.ymlcrates/fuse/src/lib.rscrates/sdk/src/queue.rs
🚧 Files skipped from review as they are similar to previous changes (2)
- apps/desktop/src-tauri/src/sync/mod.rs
- crates/sdk/src/queue.rs
PR #487 codecov reports 50.1% patch coverage (638 lines missing, but the patch status is informational/non-blocking). Capture a tiered follow-up: Tier 1 quick unit-test wins (sdk sync.rs redaction helpers, queue.rs and sync/mod.rs residual branches), Tier 2 trait-seam harness for the FUSE callback + replay paths (blocked on the two write-path consolidation todos), Tier 3 the Tauri glue that is realistically UAT-only. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Entire-Checkpoint: fd2e0cd57614
The idempotency guard added in 29be9d1 checked sync_trigger under a read lock and then stored the sender under a separate write lock — a TOCTOU race where two concurrent callers could both pass the check and each spawn a daemon (the second overwriting the first's sender). It also silently ignored a poisoned lock, which would spawn a daemon whose sender the tray "Sync Now" button could never reach. Collapse check-and-set into one write-lock critical section and surface a poisoned lock as an error. Move the "Starting" info log after the guard so it only fires when a daemon is actually spawned. Addresses CodeRabbit on PR #487. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Entire-Checkpoint: 970bd0c631a8
Deferred from PR #487 (CodeRabbit/zizmor flagged the codecov refs). Capture the repo-wide follow-up: pin all 111 third-party action refs across the 14 workflow files to immutable commit SHAs, paired with Dependabot github-actions updates so pins don't go stale, plus an optional zizmor CI gate covering the related excessive-permissions finding. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Entire-Checkpoint: 244dacd08641
Summary
Phase 43: FUSE Write Durability — Verified ✓ (18/18 automated must-haves; human UAT completed on macOS, Linux, and Windows).
Makes FUSE writes durable.
release()/ WinFspcleanup()now journal each pending upload to a crash-safe, fsync-before-ack on-disk journal (cb-journal) before acking the OS, so a crash or app quit no longer silently loses a write. On mount,replay_for_vaultreplays journaled entries — re-uploading content and signing/publishing the parent IPNS record from a journaled user-ECIES-wrapped key. mkdir parent-publish conflicts now enqueue a retry instead of orphaning the child folder, and exhausted retries park the entry asFailedand surface a tray + OS notification.Changes
crates/sdk/src/queue.rs): persistentWriteQueuewith fsync-before-return,0o600atomic perms + parent-dir fsync, vault scoping, skip-on-malformed, and arecord_failureretry/park lifecycle.crates/fuse/src/lib.rs):replay_for_vault+replay_mkdir_entry/replay_upload_entry; CAS parent-IPNS publish from journaled user-wrapped keys; BFSresolve_folder_key; ordered replay (mkdir-before-upload, thencreated_at_ms).read_ops.rs,write_ops.rs, winfspwrite_ops.rs): journal the entry before reply; in-memory mutations deferred until after the journal commit, so a prepare/journal failure leaves no partial state.crates/sdk/src/sync.rs,apps/desktop/src-tauri/src/sync,tray): the sync daemon reads the realcb-journal, emitsWriteParked, bridged (edge-triggered) to tray status + an OS notification.replay_for_vaultwith a seededPublishCoordinator; journal removal is replay-only.Verification
cargo check/testgreen forcipherbox-fuseandcipherbox-sdk; CI compiles + tests the workspace for both--features winfspand--features fuse.43-REVIEW.mdverified resolved via code cross-check + a CodeRabbit re-review.Key Decisions & Notes
341252d74,a1ec69f1b)..planning/todos/pending/(commit46b3e1326): fuser↔winfsp consolidation,Option<String>journal keys, shared journal-dir helper, replay delegation topublish_file_metadata,resolve_folder_keymemoization, typed not-found error.Merge note
The branch predates the squash-merge of phase 42 (PR #485, already in
main), so the commit list includes 22 redundant phase-42 commits that contribute zero net diff — the "Files changed" view is phase-43-only. Recommend squash-merge.🤖 Generated with Claude Code
Summary by CodeRabbit
Release Notes