Skip to content

feat(fuse): durable write journal with crash-recovery replay#487

Merged
FSM1 merged 116 commits into
mainfrom
feat/fuse-write-durability
Jun 14, 2026
Merged

feat(fuse): durable write journal with crash-recovery replay#487
FSM1 merged 116 commits into
mainfrom
feat/fuse-write-durability

Conversation

@FSM1

@FSM1 FSM1 commented Jun 14, 2026

Copy link
Copy Markdown
Owner

Summary

Phase 43: FUSE Write Durability — Verified ✓ (18/18 automated must-haves; human UAT completed on macOS, Linux, and Windows).

Makes FUSE writes durable. release() / WinFsp cleanup() now journal each pending upload to a crash-safe, fsync-before-ack on-disk journal (cb-journal) before acking the OS, so a crash or app quit no longer silently loses a write. On mount, replay_for_vault replays journaled entries — re-uploading content and signing/publishing the parent IPNS record from a journaled user-ECIES-wrapped key. mkdir parent-publish conflicts now enqueue a retry instead of orphaning the child folder, and exhausted retries park the entry as Failed and surface a tray + OS notification.

Changes

  • Durable write journal (crates/sdk/src/queue.rs): persistent WriteQueue with fsync-before-return, 0o600 atomic perms + parent-dir fsync, vault scoping, skip-on-malformed, and a record_failure retry/park lifecycle.
  • Crash-recovery replay (crates/fuse/src/lib.rs): replay_for_vault + replay_mkdir_entry / replay_upload_entry; CAS parent-IPNS publish from journaled user-wrapped keys; BFS resolve_folder_key; ordered replay (mkdir-before-upload, then created_at_ms).
  • fsync-before-ack write paths (read_ops.rs, write_ops.rs, winfsp write_ops.rs): journal the entry before reply; in-memory mutations deferred until after the journal commit, so a prepare/journal failure leaves no partial state.
  • Park → notify pipeline (crates/sdk/src/sync.rs, apps/desktop/src-tauri/src/sync, tray): the sync daemon reads the real cb-journal, emits WriteParked, bridged (edge-triggered) to tray status + an OS notification.
  • Windows / winfsp parity: mount calls replay_for_vault with a seeded PublishCoordinator; journal removal is replay-only.

Verification

  • Automated: 18/18 must-haves verified. cargo check/test green for cipherbox-fuse and cipherbox-sdk; CI compiles + tests the workspace for both --features winfsp and --features fuse.
  • Human UAT (macOS, Linux, Windows — headless UAT passed on all): SIGKILL replay, park-notification render, mkdir-orphan survival, ciphertext-only journal.
  • Post-review: all 8 critical findings (CR-01–CR-08) from 43-REVIEW.md verified resolved via code cross-check + a CodeRabbit re-review.

Key Decisions & Notes

  • CR-03 residual + journal key-wrap hardening, the CR-04 deferred-mutation fix, and 3 cleanups landed post-review (commits 341252d74, a1ec69f1b).
  • 6 deferred quality follow-ups captured as todos under .planning/todos/pending/ (commit 46b3e1326): fuser↔winfsp consolidation, Option<String> journal keys, shared journal-dir helper, replay delegation to publish_file_metadata, resolve_folder_key memoization, typed not-found error.
  • Zero-knowledge preserved: the journal stores only ciphertext, user-ECIES-wrapped keys, IVs, and IPNS names — never plaintext.

Merge note

The branch predates the squash-merge of phase 42 (PR #485, already in main), so the commit list includes 22 redundant phase-42 commits that contribute zero net diff — the "Files changed" view is phase-43-only. Recommend squash-merge.

🤖 Generated with Claude Code

Summary by CodeRabbit

Release Notes

  • New Features
    • Added durable write journaling for FUSE/WinFsp uploads: data is persisted before acknowledgement.
    • Added “Upload Failed” (WriteParked) tray/status state and “CipherBox Upload Failed” system notifications for permanently failed writes.
    • Improved directory creation resilience by journaling and signaling parent publish conflicts for automatic retry.
    • Added mount-time journal replay to recover and re-apply pending operations in the correct order.
  • Bug Fixes
    • Ensured safer retry/cleanup behavior so failed operations are parked and surfaced instead of silently disappearing.

FSM1 and others added 30 commits June 12, 2026 13:02
Entire-Checkpoint: ae7bbe1126cc
Entire-Checkpoint: 8f94afd7fcd8
Entire-Checkpoint: f3c45f8a62fe
Entire-Checkpoint: d5bddac316f1
Finish the interrupted planner run: add plans 42-04..42-08 (live migration
gate, controller wiring + api:generate, BullMQ outbox drain + drift report,
non-BYO backfill script, Grafana cross-user-attempt alert). Correct stale
cross-plan number references in 42-01/42-03 and finalize the ROADMAP entry.

Covers all 13 CONTEXT decisions (D-01..D-13) and both requirement IDs across
the 8-plan set; every new plan carries a STRIDE threat_model.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Entire-Checkpoint: 951500ea043f
…e registration

- PendingUnpin entity with unique cid index, no userId per D-05 outbox design
- Migration 1749000000000 creates pending_unpins table with idx_pending_unpins_cid
- Migration 1749100000000 adds idx_pinned_cids_cid to pinned_cids for refcount queries
- PendingUnpin registered in app.module entities array per DATABASE_EVOLUTION_PROTOCOL §4.2
- Assert deleteFile calls unpinFromIpfs, removeUsage, then fetchQuota in order
- Assert fetchQuota rejection does not reject deleteFile and logs a warning
- Assert both removeUsage and fetchQuota are invoked before resolve
…eleteFile

- Fire fetchQuota() as fire-and-forget after removeUsage (D-12)
- Rejection is swallowed via .catch(logger.warn) to keep delete path non-blocking
- No new imports; logger already imported on line 3
- RED/GREEN TDD gates verified, 3 specs passing
- One-line fire-and-forget fetchQuota after removeUsage (D-12)
…vice

- unpinCrossUserAttempts counter for D-02 cross-user unpin audit
- driftOrphanedPinsTotal counter for D-06 Kubo drift report
- pendingUnpinsGauge for D-05 outbox depth monitoring
Entire-Checkpoint: db0759160182
- Applied AddPendingUnpins1749000000000 and AddPinnedCidCidIndex1749100000000
- Both migrations confirmed [X] in migration:show (entries 18 and 19)
- pending_unpins table + idx_pending_unpins_cid + idx_pinned_cids_cid verified via to_regclass()
- No source files modified; halted at checkpoint:human-verify for DB confirmation
- Provisions CipherBox Security rule group alert on cipherbox_unpin_cross_user_attempts_total
- Fires on any non-zero rate(counter[5m]) — gt 0 threshold (D-10, D-02)
- Uses GRAFANA_ALERTS_FOLDER_UID and GRAFANA_CLOUD_DATASOURCE_UID placeholder UIDs
- noDataState/execErrState OK; for 5m; severity warning, operation unpin-security
…rovider mocks

- six guardedUnpin cases: no-row-unknown, no-row-cross-user, refcount>0,
  refcount===0 success, refcount===0 Kubo-fail, advisory-lock-ordering
- DataSource mock with transaction callback invoking manager mock
- manager.getRepository returns per-entity mocks for PinnedCid and PendingUnpin
- IPFS_PROVIDER mock and MetricsService mock injected into TestingModule
- SUMMARY.md for CipherBox Security alert rule provisioning
…nd refcount outbox

- advisory xact lock as first statement via manager.query pg_advisory_xact_lock (D-04)
- ownership check with cross-user detection, logger.warn, and metric inc (D-01, D-02)
- in-transaction row delete as quota decrement (D-03)
- refcount gate with orIgnore outbox insert when count hits zero (D-05)
- post-commit best-effort Kubo unpin with outbox row cleanup on success (D-03 ordering)
- Kubo failure leaves outbox row for BullMQ retry worker without rejecting the request
- vault.module.ts adds PendingUnpin to forFeature and provides IPFS_PROVIDER locally
  without importing IpfsModule to avoid circular dependency
Entire-Checkpoint: c6c34f9e6764
…sation

- Expand unpin describe to pass req with user.id; assert guardedUnpin called and ipfsProvider.unpinFile NOT called
- Add Test 4: compensation routes through guardedUnpin on recordPin failure, not raw unpinFile
- Add Test 5: compensation is best-effort; guardedUnpin rejection swallowed, original recordPin error rethrown
- Add Test 3: upload happy path does not call guardedUnpin or unpinFile
RED phase: selectRowsToDelete D-09 predicate (BYO exclusion, CID
presence check) and parseKuboPinLs NDJSON parser. Five behaviors per
the plan spec. No implementation yet.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
- Seven cases: drain success, drain not-pinned=success, drain failure-leaves-row
- Gauge publish after drain pass
- Drift orphan detected increments counter without deleting
- Drift all-accounted does not increment counter
- Dispatch routing to drain/drift/no-op for unknown jobs
- RED: fails because PendingUnpinProcessor does not exist yet
… guardedUnpin

- Replace ipfsProvider.unpinFile in unpin() with vaultService.guardedUnpin(req.user.id, dto.cid)
- Add @request() req to unpin() signature; remove controller-level fileUnpins.inc (now inside guardedUnpin per 42-03)
- Replace ipfsProvider.unpinFile in upload compensation with vaultService.guardedUnpin; add D-13 race window comment
- Regenerate api-client via pnpm api:generate; openapi.json formatting-only diff confirms D-11 no schema change
@github-actions github-actions Bot added release:cipherbox-sdk:feat Minor version bump (new feature) for cipherbox-sdk release:desktop:feat Minor version bump (new feature) for desktop release:cipherbox-fuse:feat Minor version bump (new feature) for cipherbox-fuse release:sdk-core:fix Patch version bump (bug fix) for sdk-core release:sdk:fix Patch version bump (bug fix) for sdk release:tee-worker:fix Patch version bump (bug fix) for tee-worker labels Jun 14, 2026
@github-actions

Copy link
Copy Markdown
Contributor

Release Preview

Package Bump Label Source
api minor release:api:feat Direct (feat commit)
cipherbox-fuse minor release:cipherbox-fuse:feat Direct (feat commit)
cipherbox-sdk minor release:cipherbox-sdk:feat Direct (feat commit)
desktop minor release:desktop:feat Direct (feat commit)
sdk patch release:sdk:fix Cascade (api minor)
sdk-core patch release:sdk-core:fix Cascade (api minor)
tee-worker patch release:tee-worker:fix Cascade (sdk-core patch)
web minor release:web:feat Direct (feat commit)

Cascade Details

  • api minor -> sdk-core patch (direct dependency)
  • api minor -> sdk patch (direct dependency)
  • sdk-core patch -> tee-worker patch (direct dependency)

github-actions Bot and others added 2 commits June 14, 2026 13:03
Mark phase 43 (fuse-write-durability) as shipped in STATE.md. Verification
passed (18/18 automated + human UAT on all 3 platforms); PR #487 opened.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Entire-Checkpoint: d7898f1e63e5
@codecov

codecov Bot commented Jun 14, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 50.00000% with 641 lines in your changes missing coverage. Please review.
✅ Project coverage is 61.94%. Comparing base (158addc) to head (610be5d).

Files with missing lines Patch % Lines
crates/fuse/src/lib.rs 28.75% 342 Missing ⚠️
crates/fuse/src/read_ops.rs 0.00% 141 Missing ⚠️
crates/fuse/src/write_ops.rs 0.00% 39 Missing ⚠️
apps/desktop/src-tauri/src/fuse/mod.rs 0.00% 35 Missing ⚠️
crates/sdk/src/sync.rs 0.00% 24 Missing ⚠️
apps/desktop/src-tauri/src/commands/sync.rs 0.00% 21 Missing ⚠️
apps/desktop/src-tauri/src/sync/mod.rs 87.38% 14 Missing ⚠️
crates/sdk/src/queue.rs 97.51% 10 Missing ⚠️
apps/desktop/src-tauri/src/tray/mod.rs 0.00% 9 Missing ⚠️
apps/desktop/src-tauri/src/commands/auth.rs 0.00% 3 Missing ⚠️
... and 1 more
Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main     #487      +/-   ##
==========================================
- Coverage   62.57%   61.94%   -0.63%     
==========================================
  Files         137      151      +14     
  Lines       10276    15241    +4965     
  Branches     1108     1108              
==========================================
+ Hits         6430     9441    +3011     
- Misses       3612     5566    +1954     
  Partials      234      234              
Flag Coverage Δ
api 84.62% <ø> (ø)
api-client 84.62% <ø> (ø)
core 84.62% <ø> (ø)
crypto 84.62% <ø> (ø)
desktop 17.54% <55.43%> (-13.71%) ⬇️
rust 53.64% <49.08%> (?)
sdk 84.62% <ø> (ø)
sdk-core 84.62% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines Coverage Δ
apps/desktop/src-tauri/src/tray/status.rs 100.00% <100.00%> (ø)
crates/sdk/src/state.rs 100.00% <100.00%> (ø)
apps/desktop/src-tauri/src/commands/auth.rs 0.00% <0.00%> (ø)
crates/sdk/src/client.rs 0.00% <0.00%> (ø)
apps/desktop/src-tauri/src/tray/mod.rs 0.00% <0.00%> (ø)
crates/sdk/src/queue.rs 97.66% <97.51%> (ø)
apps/desktop/src-tauri/src/sync/mod.rs 80.83% <87.38%> (+80.83%) ⬆️
apps/desktop/src-tauri/src/commands/sync.rs 0.00% <0.00%> (ø)
crates/sdk/src/sync.rs 0.00% <0.00%> (ø)
apps/desktop/src-tauri/src/fuse/mod.rs 11.31% <0.00%> (+11.31%) ⬆️
... and 3 more

... and 58 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
apps/desktop/src-tauri/src/commands/sync.rs (1)

28-36: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Prevent duplicate sync-daemon spawns (idempotency gap).

Line 28 spawns unconditionally, and Lines 33-36 replace sync_trigger without shutting down or reusing an existing daemon. Re-entering auth/setup or calling the IPC command again can leave multiple daemon loops running concurrently (duplicate polling/status updates/notifications).

Suggested fix
 pub fn spawn_sync_daemon(app: tauri::AppHandle, state: &AppState) -> Result<(), String> {
+    // Idempotent start: if already started, do nothing.
+    if let Ok(guard) = state.sync_trigger.read() {
+        if guard.is_some() {
+            log::debug!("Sync daemon already running; skipping spawn");
+            return Ok(());
+        }
+    }
+
     log::info!("Starting background sync daemon");

Also applies to: 51-60

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@apps/desktop/src-tauri/src/commands/sync.rs` around lines 28 - 36, The
spawn_sync_daemon function unconditionally spawns a new daemon without checking
if one is already running, which can result in multiple concurrent daemon
instances. Before creating a new channel and spawning the daemon, check if
state.sync_trigger already contains an existing sender. If it does, either
return early since a daemon is already active, or first shut down the existing
daemon by sending a termination signal before creating and storing the new
sender. This ensures idempotency and prevents duplicate polling and status
updates.
🧹 Nitpick comments (2)
crates/fuse/src/lib.rs (1)

1038-1058: 💤 Low value

Minor: depth variable counts nodes visited, not tree depth.

The variable depth is incremented per node popped from the queue (line 1058), so it actually counts total nodes visited rather than tree depth. For BFS, this means a wide tree with many folders at level 1 could exhaust the cap before reaching a deeper nested folder.

The cap still effectively bounds network round trips (32 resolve+fetch operations), so it's functionally correct for its purpose. The comment at line 1039 ("limits network round trips") is accurate, but the variable name depth is slightly misleading.

This is minor and doesn't affect correctness for typical folder hierarchies.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/fuse/src/lib.rs` around lines 1038 - 1058, The variable `depth` in the
BFS loop is misleading because it counts nodes visited (incremented per
queue.pop_front() operation), not actual tree depth. This means a wide tree with
many folders at the same level could hit the cap before reaching deeper nested
folders. Rename the `depth` variable to a more accurate name like
`nodes_visited` or `operations_count` throughout the function to better reflect
that it's tracking the number of BFS operations/nodes processed rather than tree
depth. Update the variable name in its initialization near the queue setup, the
comparison in the if statement checking MAX_RESOLVE_DEPTH, and the increment
operation.
crates/sdk/src/queue.rs (1)

434-453: 💤 Low value

Test journal_no_plaintext doesn't actually test the probe string.

The plaintext_probe variable is defined at line 437 but never inserted into the test entry. The assertion at lines 441-444 will always pass trivially since the probe was never part of the data. The test still validates that certain forbidden keys ("plaintext", "parent_ino") don't appear, which is useful, but the probe-based assertion is effectively a no-op.

Consider either removing the unused probe or constructing a test entry that would contain plaintext if the schema were wrong.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/sdk/src/queue.rs` around lines 434 - 453, The plaintext_probe variable
defined in the journal_no_plaintext test is never actually inserted into the
test entry, making the assertion that checks for its presence a no-op that will
always pass trivially. Remove the unused plaintext_probe variable and its
corresponding assertion (the first assert block checking
!json_str.contains(plaintext_probe)), since the test still effectively validates
that forbidden keys like "plaintext" and "parent_ino" are absent from the
journal. If you want to make the probe check meaningful, you would need to
modify how make_upload_entry is called or constructed to include plaintext that
should not leak, but removing the ineffective assertion is the simpler fix.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@apps/desktop/src-tauri/src/tray/status.rs`:
- Around line 47-50: The is_connected method in the TrayStatus implementation
currently treats WriteParked as a disconnected state by not including it in the
matches pattern. Add TrayStatus::WriteParked to the matches pattern in the
is_connected method so it returns true for WriteParked state, allowing the user
to access recovery actions. Additionally, update the menu gating logic in
apps/desktop/src-tauri/src/tray/mod.rs at lines 76-77 to ensure that recovery
actions (open, sync, logout) remain enabled when the status is WriteParked, even
though it was previously treated as disconnected.

In `@crates/fuse/src/lib.rs`:
- Around line 1394-1404: The `resolve_sequence` function currently converts all
errors to `String` format, forcing the caller to use fragile string matching
with `to_lowercase().contains("not found")`. Refactor the `resolve_sequence`
function to return `Result<u64, ApiError>` instead of `Result<u64, String>`,
ensuring errors are returned as the typed `ApiError` enum rather than being
converted to strings. Update the error handling at the call site in the match
statement to use pattern matching on `ApiError::IpnsNotFound` instead of
substring checking, replacing the string matching logic with direct pattern
matching against the specific error variant.

---

Outside diff comments:
In `@apps/desktop/src-tauri/src/commands/sync.rs`:
- Around line 28-36: The spawn_sync_daemon function unconditionally spawns a new
daemon without checking if one is already running, which can result in multiple
concurrent daemon instances. Before creating a new channel and spawning the
daemon, check if state.sync_trigger already contains an existing sender. If it
does, either return early since a daemon is already active, or first shut down
the existing daemon by sending a termination signal before creating and storing
the new sender. This ensures idempotency and prevents duplicate polling and
status updates.

---

Nitpick comments:
In `@crates/fuse/src/lib.rs`:
- Around line 1038-1058: The variable `depth` in the BFS loop is misleading
because it counts nodes visited (incremented per queue.pop_front() operation),
not actual tree depth. This means a wide tree with many folders at the same
level could hit the cap before reaching deeper nested folders. Rename the
`depth` variable to a more accurate name like `nodes_visited` or
`operations_count` throughout the function to better reflect that it's tracking
the number of BFS operations/nodes processed rather than tree depth. Update the
variable name in its initialization near the queue setup, the comparison in the
if statement checking MAX_RESOLVE_DEPTH, and the increment operation.

In `@crates/sdk/src/queue.rs`:
- Around line 434-453: The plaintext_probe variable defined in the
journal_no_plaintext test is never actually inserted into the test entry, making
the assertion that checks for its presence a no-op that will always pass
trivially. Remove the unused plaintext_probe variable and its corresponding
assertion (the first assert block checking !json_str.contains(plaintext_probe)),
since the test still effectively validates that forbidden keys like "plaintext"
and "parent_ino" are absent from the journal. If you want to make the probe
check meaningful, you would need to modify how make_upload_entry is called or
constructed to include plaintext that should not leak, but removing the
ineffective assertion is the simpler fix.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 952ce003-d38c-4239-b1f0-3e54cdd3a241

📥 Commits

Reviewing files that changed from the base of the PR and between 158addc and c8137ed.

⛔ Files ignored due to path filters (1)
  • Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (51)
  • .planning/ROADMAP.md
  • .planning/STATE.md
  • .planning/phases/43-fuse-write-durability/43-01-PLAN.md
  • .planning/phases/43-fuse-write-durability/43-01-SUMMARY.md
  • .planning/phases/43-fuse-write-durability/43-02-PLAN.md
  • .planning/phases/43-fuse-write-durability/43-02-SUMMARY.md
  • .planning/phases/43-fuse-write-durability/43-03-PLAN.md
  • .planning/phases/43-fuse-write-durability/43-03-SUMMARY.md
  • .planning/phases/43-fuse-write-durability/43-04-PLAN.md
  • .planning/phases/43-fuse-write-durability/43-04-SUMMARY.md
  • .planning/phases/43-fuse-write-durability/43-05-PLAN.md
  • .planning/phases/43-fuse-write-durability/43-05-SUMMARY.md
  • .planning/phases/43-fuse-write-durability/43-06-PLAN.md
  • .planning/phases/43-fuse-write-durability/43-06-SUMMARY.md
  • .planning/phases/43-fuse-write-durability/43-07-PLAN.md
  • .planning/phases/43-fuse-write-durability/43-07-SUMMARY.md
  • .planning/phases/43-fuse-write-durability/43-08-PLAN.md
  • .planning/phases/43-fuse-write-durability/43-08-SUMMARY.md
  • .planning/phases/43-fuse-write-durability/43-CONTEXT.md
  • .planning/phases/43-fuse-write-durability/43-DISCUSSION-LOG.md
  • .planning/phases/43-fuse-write-durability/43-RESEARCH.md
  • .planning/phases/43-fuse-write-durability/43-REVIEW.md
  • .planning/phases/43-fuse-write-durability/43-UAT.md
  • .planning/phases/43-fuse-write-durability/43-VALIDATION.md
  • .planning/phases/43-fuse-write-durability/43-VERIFICATION.md
  • .planning/todos/pending/2026-06-14-consolidate-fuser-and-winfsp-journal-write-paths.md
  • .planning/todos/pending/2026-06-14-extract-shared-journal-dir-and-retries-helper.md
  • .planning/todos/pending/2026-06-14-memoize-resolve-folder-key-during-replay.md
  • .planning/todos/pending/2026-06-14-recover-stale-fuse-mount-after-crash-on-linux-startup.md
  • .planning/todos/pending/2026-06-14-replace-empty-string-journal-key-sentinel-with-option.md
  • .planning/todos/pending/2026-06-14-replace-not-found-string-match-with-typed-error.md
  • .planning/todos/pending/2026-06-14-reuse-publish-file-metadata-and-cas-publish-in-replay.md
  • apps/api/.env.example
  • apps/desktop/src-tauri/src/commands/auth.rs
  • apps/desktop/src-tauri/src/commands/sync.rs
  • apps/desktop/src-tauri/src/fuse/mod.rs
  • apps/desktop/src-tauri/src/fuse/windows/mod.rs
  • apps/desktop/src-tauri/src/sync/mod.rs
  • apps/desktop/src-tauri/src/tray/mod.rs
  • apps/desktop/src-tauri/src/tray/status.rs
  • crates/fuse/Cargo.toml
  • crates/fuse/src/lib.rs
  • crates/fuse/src/platform/windows/write_ops.rs
  • crates/fuse/src/read_ops.rs
  • crates/fuse/src/write_ops.rs
  • crates/sdk/src/client.rs
  • crates/sdk/src/lib.rs
  • crates/sdk/src/queue.rs
  • crates/sdk/src/state.rs
  • crates/sdk/src/sync.rs
  • release-please-config.json

Comment thread apps/desktop/src-tauri/src/tray/status.rs Outdated
Comment thread crates/fuse/src/lib.rs
FSM1 and others added 5 commits June 14, 2026 15:34
Pull the SyncStatus -> TrayStatus mapping and the D-10 park-notification
edge-trigger decision out of the create_sync_daemon closure into a pure
bridge_status() returning a TrayBridgeOutcome (tray status, watermark, whether
to notify). The closure now just applies the decision and performs the OS side
effects. Behavior is unchanged (load+store is equivalent to the previous swap on
the serially-invoked callback).

Adds 8 unit tests covering the subtle anti-spam logic that previously had zero
coverage: notify once per newly-parked failure, stay silent at the same level,
re-notify only when the count rises, re-arm after the journal clears, and
critically that Syncing preserves the watermark (resetting it would re-arm the
toast every poll — the exact regression commit 5b6455c fixed).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Entire-Checkpoint: be30a7d8f988
The cargo-linux job already runs `cargo llvm-cov --workspace`, so desktop-lcov.info
already contains coverage for all crates/** (incl. the cipherbox-fuse and
cipherbox-sdk unit tests). It was only uploaded under the `desktop` flag, whose
path filter (apps/desktop/src-tauri/src/**) discards every crate line — so the
journal/replay/queue logic and its tests were invisible to codecov.

Add a `rust` flag (paths: crates/**) and re-upload the same lcov under it, plus an
informational project status. No new test run; this just stops dropping coverage
we already collect.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Entire-Checkpoint: 9f3624db943f
Addresses a CodeRabbit review on PR #487. is_connected() gates the tray Logout
item; treating WriteParked as disconnected locked an authenticated user with
parked uploads out of Logout. Include WriteParked in is_connected() and update
the unit test. Open/Sync stay gated by is_mounted/is_syncable (unchanged).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Entire-Checkpoint: 9396fb86eeb7
Addresses a CodeRabbit review on PR #487. spawn_sync_daemon overwrote
state.sync_trigger and tokio::spawn'd unconditionally, so calling it twice
(re-auth, or the start_sync_daemon IPC command) left two daemon loops polling —
doubling every tray status update and WriteParked notification, which undermines
the D-10 anti-spam work in this phase. Add an idempotency guard: if a daemon is
already running, skip. The daemon reads auth/vault state from the shared KeyState
each poll, so a single long-lived daemon adapts across re-auth without a restart.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Entire-Checkpoint: cee7722d6b23
Two CodeRabbit nitpicks on PR #487:

- crates/fuse/src/lib.rs: rename the BFS counter `depth`/`MAX_RESOLVE_DEPTH` to
  `nodes_visited`/`MAX_RESOLVE_NODES`. It counts nodes visited, not tree depth;
  the cap bounds total network round trips. No behavior change.
- crates/sdk/src/queue.rs: the journal_no_plaintext test defined a plaintext
  probe that was never inserted, making that assertion a no-op. Replace it with a
  meaningful check that the raw key bytes (b"wrappedkey") never appear — verifying
  keys are stored hex-encoded, matching the test's "no raw key bytes" intent.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Entire-Checkpoint: d3f60ddea1d6
@FSM1

FSM1 commented Jun 14, 2026

Copy link
Copy Markdown
Owner Author

Addressed CodeRabbit review (#pullrequestreview-4492966857)

All three out-of-diff / nitpick comments handled:

Outside-diff range (🟠 Major) — duplicate sync-daemon spawns → fixed in 29be9d1.
spawn_sync_daemon now returns early if a daemon is already running (idempotency guard). The daemon reads auth/vault state from the shared KeyState each poll, so a single long-lived daemon adapts across re-auth without a restart — avoiding two loops that would double the very WriteParked notifications this phase works to dedupe (D-10).

Nitpick — depth misnamed in resolve_folder_key → fixed in c65b797.
Renamed depth / MAX_RESOLVE_DEPTHnodes_visited / MAX_RESOLVE_NODES. It counts nodes visited across the BFS (bounding total network round trips), not tree depth. No behavior change.

Nitpick — journal_no_plaintext no-op probe → fixed in c65b797.
The probe string was never inserted into the entry, so that assertion was trivially true. Replaced it with a meaningful check that the raw key bytes (b"wrappedkey") never appear in the serialized journal — verifying keys are stored hex-encoded, matching the test's "no raw key bytes" intent.

Verified: cargo test -p cipherbox-fuse -p cipherbox-sdk (40 + 43 pass) and cargo check -p cipherbox-desktop clean.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (1)
codecov.yml (1)

65-72: ⚡ Quick win

Add rust to the default aggregate flags

You added a dedicated rust flag/status, but coverage.status.project.default.flags still excludes it. Include rust there so the aggregate project signal reflects crates coverage too.

Proposed fix
       default:
         target: auto
         threshold: 6%
         flags:
           - api
           - crypto
           - core
           - api-client
           - sdk-core
           - sdk
+          - rust

Based on learnings, this repo’s aggregate Codecov check should include all relevant flags while retaining per-component statuses.

Also applies to: 108-111

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@codecov.yml` around lines 65 - 72, The rust flag has been added to the crates
configuration section, but it is not included in the
coverage.status.project.default.flags list, which means the aggregate project
coverage signal does not include rust crate coverage. Add rust to the
default.flags array in the coverage.status.project section to ensure the
aggregate codecov check reflects coverage from all relevant components including
the rust crates.

Source: Learnings

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.github/workflows/ci.yml:
- Around line 670-676: The codecov/codecov-action references in the workflow are
using an unpinned version tag (`@v7`) which violates immutable action policy. Pin
all three instances of codecov/codecov-action@v7 (at lines 335, 660, and 671) to
a specific commit SHA instead of the version tag. Replace `@v7` with
@[full-commit-sha] for each occurrence, ensuring all three use the same SHA
value for consistency and to reduce supply-chain risk.

In `@apps/desktop/src-tauri/src/commands/sync.rs`:
- Around line 33-38: The current implementation has a race condition where the
check for whether the sync daemon is already running uses a read lock, allowing
multiple callers to pass the check before any can set the value, potentially
spawning multiple daemons. Additionally, poisoned lock errors are being silently
ignored. Replace the read lock check with a single write lock section that
atomically checks if the sync daemon sender exists and sets it in one operation;
if the write lock acquisition fails (including poisoned lock cases), return an
error rather than ignoring it. This ensures only one daemon is spawned across
concurrent calls to the sync command function.

---

Nitpick comments:
In `@codecov.yml`:
- Around line 65-72: The rust flag has been added to the crates configuration
section, but it is not included in the coverage.status.project.default.flags
list, which means the aggregate project coverage signal does not include rust
crate coverage. Add rust to the default.flags array in the
coverage.status.project section to ensure the aggregate codecov check reflects
coverage from all relevant components including the rust crates.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 26757bdb-79a4-4e2f-bfa5-6ab192489873

📥 Commits

Reviewing files that changed from the base of the PR and between c8137ed and c65b797.

📒 Files selected for processing (7)
  • .github/workflows/ci.yml
  • apps/desktop/src-tauri/src/commands/sync.rs
  • apps/desktop/src-tauri/src/sync/mod.rs
  • apps/desktop/src-tauri/src/tray/status.rs
  • codecov.yml
  • crates/fuse/src/lib.rs
  • crates/sdk/src/queue.rs
🚧 Files skipped from review as they are similar to previous changes (2)
  • apps/desktop/src-tauri/src/sync/mod.rs
  • crates/sdk/src/queue.rs

Comment thread .github/workflows/ci.yml
Comment thread apps/desktop/src-tauri/src/commands/sync.rs Outdated
FSM1 and others added 3 commits June 14, 2026 16:27
PR #487 codecov reports 50.1% patch coverage (638 lines missing, but the
patch status is informational/non-blocking). Capture a tiered follow-up:
Tier 1 quick unit-test wins (sdk sync.rs redaction helpers, queue.rs and
sync/mod.rs residual branches), Tier 2 trait-seam harness for the FUSE
callback + replay paths (blocked on the two write-path consolidation
todos), Tier 3 the Tauri glue that is realistically UAT-only.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Entire-Checkpoint: fd2e0cd57614
The idempotency guard added in 29be9d1 checked sync_trigger under a read
lock and then stored the sender under a separate write lock — a TOCTOU race
where two concurrent callers could both pass the check and each spawn a
daemon (the second overwriting the first's sender). It also silently ignored
a poisoned lock, which would spawn a daemon whose sender the tray "Sync Now"
button could never reach.

Collapse check-and-set into one write-lock critical section and surface a
poisoned lock as an error. Move the "Starting" info log after the guard so it
only fires when a daemon is actually spawned. Addresses CodeRabbit on PR #487.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Entire-Checkpoint: 970bd0c631a8
Deferred from PR #487 (CodeRabbit/zizmor flagged the codecov refs). Capture
the repo-wide follow-up: pin all 111 third-party action refs across the 14
workflow files to immutable commit SHAs, paired with Dependabot
github-actions updates so pins don't go stale, plus an optional zizmor CI
gate covering the related excessive-permissions finding.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Entire-Checkpoint: 244dacd08641
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

release:api:feat Minor version bump (new feature) for api release:cipherbox-fuse:feat Minor version bump (new feature) for cipherbox-fuse release:cipherbox-sdk:feat Minor version bump (new feature) for cipherbox-sdk release:desktop:feat Minor version bump (new feature) for desktop release:sdk:fix Patch version bump (bug fix) for sdk release:sdk-core:fix Patch version bump (bug fix) for sdk-core release:tee-worker:fix Patch version bump (bug fix) for tee-worker release:web:feat Minor version bump (new feature) for web

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant