Skip to content

Feat/storage#4

Merged
PhantomInTheWire merged 4 commits into
masterfrom
feat/storage
Mar 19, 2026
Merged

Feat/storage#4
PhantomInTheWire merged 4 commits into
masterfrom
feat/storage

Conversation

@PhantomInTheWire

@PhantomInTheWire PhantomInTheWire commented Mar 18, 2026

Copy link
Copy Markdown
Owner

Open with Devin

Summary by CodeRabbit

  • New Features

    • Added delete-by-filter API to remove documents matching a filter.
    • CLI: stricter collection-name parsing and safer create/stats arguments.
  • Bug Fixes

    • Improved WAL recovery to fail gracefully on corrupted/oversized WALs.
    • Prevent collection creation when directory already exists.
    • Reinserting a previously deleted document now survives reopen.
  • Tests

    • Added tests for delete-by-filter, storage id-map reopen behavior, and WAL recovery.

@coderabbitai

coderabbitai Bot commented Mar 18, 2026

Copy link
Copy Markdown

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Replaces primitive id/deletion storage with IdMap/DeleteStore and RecordState; moves filter evaluation into garuda-meta; adds Collection::delete_by_filter and WriteMode; updates snapshot I/O, codec/WAL checksum usage, storage layout, CLI typing, and tests to exercise new semantics.

Changes

Cohort / File(s) Summary
Core state & WAL
crates/garuda-engine/src/state.rs, crates/garuda-engine/src/bootstrap.rs, crates/garuda-engine/src/persistence.rs, crates/garuda-segment/src/wal.rs
Replace HashMap/HashSet with IdMap/DeleteStore; introduce WriteMode (Insert/Upsert); treat WalOp::Update not-found as no-op; wire WAL checksum from codec crate.
Collection API & filter flow
crates/garuda-engine/src/lib.rs, crates/garuda-engine/src/query.rs, crates/garuda-engine/src/filter.rs
Add Collection::delete_by_filter, collect_matching_doc_ids, and apply_delete_batch; delegate runtime filter evaluation to garuda-meta; add parse_required_filter.
Metadata & evaluator
crates/garuda-meta/src/lib.rs
Add public IdMap and DeleteStore types and a crate-level evaluate_filter implementation with comparison helpers.
Segment model & codec
crates/garuda-segment/src/lib.rs, crates/garuda-segment/src/codec.rs
Introduce RecordState enum (Live/Deleted) replacing deleted: bool; serialize RecordState as a one-byte tag; expose crate checksum helper.
Storage & snapshot I/O
crates/garuda-storage/src/snapshots.rs, crates/garuda-storage/src/layout.rs
Snapshot writers now accept generic iterators ((String,u64) and u64) instead of concrete maps/sets; ensure_new_collection_dir uses create_dir, syncs parent, and performs cleanup on partial failures.
Validation & schema
crates/garuda-engine/src/validation.rs, crates/garuda-engine/src/schema.rs
Promoted validate_scalar_value to pub(crate) and moved its implementation; schema validation now reuses the crate-visible validator.
CLI & top-level
garuda/src/main.rs, garuda/tests/test_cli_contract.rs
Make Cli.command required; parse CollectionName at CLI level; use NonZeroUsize for dimension; remove Doctor command and related test.
Tests & WAL robustness
crates/garuda-engine/tests/*, crates/garuda-engine/tests/test_storage_id_map.rs
Add tests for delete-by-filter, reinsert-after-delete surviving reopen, collection creation AlreadyExists, enhanced WAL corruption and oversized-payload handling; introduce WAL checksum helper in tests.

Sequence Diagram

sequenceDiagram
    participant Client as Client
    participant API as Collection API
    participant Parser as Query Parser
    participant Eval as garuda-meta::Evaluator
    participant State as CollectionState
    participant Index as DeleteStore/Index

    Client->>API: delete_by_filter(raw_filter)
    API->>Parser: parse_required_filter(raw_filter, schema)
    Parser-->>API: FilterExpr

    API->>State: all_live_records()
    loop per live record
        State->>Eval: evaluate_filter(expr, record.fields)
        Eval-->>State: bool (match?)
        alt match
            State-->>API: collect doc_id
        end
    end

    API->>State: apply_delete_batch(collected_doc_ids)
    loop per doc_id
        State->>State: append WalOp::Delete
        State->>State: mark record.state = Deleted
        State->>Index: delete_store.insert(internal_id)
    end

    State-->>API: WriteResults
    API-->>Client: Ok / Err
Loading

Estimated Code Review Effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly Related PRs

Poem

🐰 I nudge new maps where old sets used to lie,
RecordState hops — live or waved goodbye.
Filters moved to meta, I twitch my nose,
Deletes tracked in stores where tidy data grows.
Tests nibble carrots — the DB hops and sighs.

🚥 Pre-merge checks | ✅ 1 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 18.18% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
Title check ❓ Inconclusive The title 'Feat/storage' is vague and does not clearly convey the specific changes made in this pull request. Provide a more descriptive title that summarizes the main changes, such as 'Refactor storage abstractions to use IdMap and DeleteStore' or 'Replace HashMap/HashSet with typed storage abstractions'.
✅ Passed checks (1 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/storage
📝 Coding Plan
  • Generate coding plan for human review comments

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

devin-ai-integration[bot]

This comment was marked as resolved.

@chatgpt-codex-connector

Copy link
Copy Markdown

💡 Codex Review

let id_map = IdMap::from(read_id_map_snapshot(&path, manifest.id_map_snapshot_id)?);
let delete_store = DeleteStore::from(read_delete_snapshot(&path, manifest.delete_snapshot_id)?);

P1 Badge Fall back to rebuilding indexes when a snapshot sidecar is missing

load_collection_state() now hard-fails if idmap.N or del.N cannot be read, even though those files are derived from the segment files and CollectionState::rebuild_indexes() can reconstruct them. In the previous flow we tolerated missing/corrupt sidecars and rebuilt from segments; with this change, deleting or corrupting either snapshot makes open_collection() fail even when the manifest and all segment data are still intact.


rename_path(&temp_path, path)?;
sync_directory(parent)?;

P1 Badge Don't roll back writes after the WAL rename has already succeeded

write_file_atomically() can still return an error after rename_path() has already replaced the WAL, because sync_directory(parent) runs afterward. Collection::apply_doc_write_batch() and Collection::delete() treat any append_wal_ops() error as an uncommitted write and restore the in-memory snapshot, so on filesystems where directory fsync is unsupported or transiently fails the API reports the write as failed even though the WAL entry is already on disk; after restart, replay_writing_segment_wal() will apply that supposedly failed write anyway.


let temp_path = temp_path(path);
write_temp_file(&temp_path, bytes)?;
rename_path(&temp_path, path)?;
sync_directory(parent)?;

P2 Badge Fsync the collection directory when creating a new segment folder

write_segment() now writes to .../<segment_id>/data.seg, and this helper creates <segment_id>/ on demand. When that directory is new we only fsync the new segment directory itself, never its parent collection directory, so a power loss right after the first flush or a rollover can lose the directory entry while the manifest already points at it. Reopening then fails with a missing segment path even though the checkpoint returned success.

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
crates/garuda-engine/src/bootstrap.rs (1)

46-63: ⚠️ Potential issue | 🟠 Major

Rebuild the derived stores after WAL replay.

id_map and delete_store are restored from snapshots here, but CollectionState::apply_wal_op in crates/garuda-engine/src/state.rs Lines 29-31 can skip redundant WAL entries without calling finish_mutation(). If the loaded writing segment already contains mutations newer than the snapshots, recovery returns stale indexes even though the segment data is present.

🩹 Proposed fix
     let mut state = CollectionState {
         path,
         manifest,
         persisted_segments,
         writing_segment,
         id_map,
         delete_store,
     };
     replay_writing_segment_wal(&mut state)?;
+    state.rebuild_indexes();
 
     Ok(state)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/garuda-engine/src/bootstrap.rs` around lines 46 - 63, The code
restores id_map and delete_store from snapshots before calling
replay_writing_segment_wal, which can leave indexes stale if the writing segment
contains newer mutations; after replay_writing_segment_wal(&mut state) you must
rebuild the derived stores so they reflect replayed WAL entries: invoke the
appropriate rebuild routines (e.g. call IdMap::rebuild_from_segments or a
similar function using state.persisted_segments and state.writing_segment to
recreate id_map, and call DeleteStore::rebuild_from_segments or equivalent to
recreate delete_store) or iterate segments to reapply mutations and call
finish_mutation() where needed so CollectionState.id_map and
CollectionState.delete_store are reconstructed after WAL replay (refer to
CollectionState, apply_wal_op, replay_writing_segment_wal, finish_mutation,
IdMap, and DeleteStore).
🧹 Nitpick comments (4)
crates/garuda-engine/tests/test_collection_recovery_crash_semantics.rs (1)

229-229: Avoid hardcoding WAL segment directory "0" in tests

Using "0" directly makes the tests brittle if the writing segment id changes. Prefer deriving the path from WRITING_SEGMENT_ID to keep intent and behavior aligned with engine constants.

♻️ Proposed refactor
-    let wal_path = collection.path().join("0").join("data.wal");
+    let wal_path = collection
+        .path()
+        .join(WRITING_SEGMENT_ID.to_string())
+        .join("data.wal");

Also applies to: 251-251

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/garuda-engine/tests/test_collection_recovery_crash_semantics.rs` at
line 229, Avoid hardcoding the "0" WAL segment directory: replace the literal
"0" used when building wal_path (and the other occurrence) with the engine
constant WRITING_SEGMENT_ID converted to a string (e.g.
WRITING_SEGMENT_ID.to_string()) and ensure WRITING_SEGMENT_ID is imported into
the test so the path is derived from the canonical constant rather than a
hardcoded value.
crates/garuda-storage/src/snapshots.rs (1)

46-49: Canonicalize delete snapshot payload by removing duplicates

After widening input on Line 46, duplicate IDs can now be encoded. Since delete state is logically set-based, deduping after sort keeps snapshot bytes canonical and avoids unnecessary growth/churn.

Suggested fix
 pub fn write_delete_snapshot(
     root: &Path,
     snapshot_id: SnapshotId,
     deleted_doc_ids: impl IntoIterator<Item = u64>,
 ) -> Result<(), Status> {
     let mut ids = deleted_doc_ids.into_iter().collect::<Vec<_>>();
     ids.sort_unstable();
+    ids.dedup();

     let bytes = encode_delete_snapshot(&ids)?;
     write_file_atomically(&delete_snapshot_path(root, snapshot_id), &bytes)
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/garuda-storage/src/snapshots.rs` around lines 46 - 49, The collected
IDs from deleted_doc_ids can contain duplicates; after collecting into ids and
calling ids.sort_unstable() you should canonicalize by removing duplicates
(e.g., call ids.dedup()) so the delete snapshot is set-like and produces stable
bytes; update the code around the ids variable (the collection, sort_unstable
call and subsequent use) to perform deduplication after sorting.
crates/garuda-storage/src/layout.rs (1)

148-155: Consider adding a defensive check for empty parent path.

path.parent() returns Some("") (not None) for relative single-component paths like "foo". While this is unlikely given that callers always use properly constructed absolute paths, a defensive check would make the function more robust if used in other contexts later.

🛡️ Optional defensive fix
 fn sync_parent_directory(path: &Path) -> Result<(), Status> {
     let Some(parent) = path.parent() else {
         return Ok(());
     };
+    if parent.as_os_str().is_empty() {
+        return Ok(());
+    }

     sync_directory(parent)
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/garuda-storage/src/layout.rs` around lines 148 - 155, The current
sync_parent_directory function treats path.parent() == Some(parent) as always
valid, but parent can be an empty path for single-component inputs; update
sync_parent_directory to defensively check for an empty parent (e.g.
parent.as_os_str().is_empty() or parent.components().next().is_none()) and
return Ok(()) in that case before calling sync_directory(parent); keep
references to sync_parent_directory, sync_directory and the path.parent() call
so the check is added right after obtaining parent.
crates/garuda-engine/src/lib.rs (1)

224-243: Consider documenting the TOCTOU behavior.

Between collecting matching IDs (line 231) and executing deletions (line 234), the read lock is dropped (line 232). This allows concurrent modifications where:

  1. Another thread could delete the same documents → NotFound errors.
  2. Another thread could insert new matching documents → those are missed.

The current behavior (propagating the first error) is acceptable for best-effort deletion. However, if atomic "delete all matching" semantics are required, consider holding a write lock throughout or documenting this as eventual-consistency behavior.

📝 Optional: Add a doc comment clarifying semantics
+    /// Deletes all documents matching the given filter expression.
+    ///
+    /// Note: This operation is not strictly atomic. Documents are matched under a
+    /// read lock, then deleted under a write lock. Concurrent modifications between
+    /// these phases may cause some matched documents to be already deleted (returning
+    /// an error) or new matching documents to be missed.
     pub fn delete_by_filter(&self, raw_filter: &str) -> Result<(), Status> {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/garuda-engine/src/lib.rs` around lines 224 - 243, The delete_by_filter
function has a TOCTOU race: it calls collect_matching_doc_ids after acquiring a
read via read_state then drops the state before calling delete(ids), so
concurrent modifications (via delete or inserts) can cause NotFound errors or
missed docs; either enforce atomicity by holding a write lock for the whole
operation (acquire write state instead of read_state and do not drop until after
delete) or explicitly document the current eventual-consistency behavior in a
doc comment on delete_by_filter mentioning collect_matching_doc_ids, delete,
read_state and the dropped state so callers know deletions are best-effort and
may see NotFound or miss concurrently-inserted matches.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@crates/garuda-engine/src/state.rs`:
- Around line 34-35: The upsert replay path uses insert_doc(…,
WriteMode::Upsert) which applies schema defaults/normalization but the
redundancy check (live_doc_matches) compares against the raw payload, causing
non-idempotent replays when only defaulted fields differ; to fix, normalize the
incoming WalOp::Upsert document (apply the same schema defaults/normalization
used by insert_doc) before calling live_doc_matches so the comparison is between
normalized forms, and ensure the same change is applied for any other places
that perform upsert redundancy checks (e.g., the other live_doc_matches usage in
the replay branch).

In `@crates/garuda-storage/src/snapshots.rs`:
- Around line 19-22: The new iterator-based API can allow duplicate document IDs
to be written; after collecting and sorting ordered_entries (the entries
variable -> ordered_entries Vec), scan the sorted list for adjacent entries with
identical keys and reject the snapshot by returning an
Err(Status::invalid_argument(...)) that includes the duplicate key before any
encoding/writing occurs (i.e., before the code path that uses HashMap::insert).
This preserves the previous uniqueness guarantee and prevents silent collapse on
read; include the key in the error message for debugging.

---

Outside diff comments:
In `@crates/garuda-engine/src/bootstrap.rs`:
- Around line 46-63: The code restores id_map and delete_store from snapshots
before calling replay_writing_segment_wal, which can leave indexes stale if the
writing segment contains newer mutations; after replay_writing_segment_wal(&mut
state) you must rebuild the derived stores so they reflect replayed WAL entries:
invoke the appropriate rebuild routines (e.g. call IdMap::rebuild_from_segments
or a similar function using state.persisted_segments and state.writing_segment
to recreate id_map, and call DeleteStore::rebuild_from_segments or equivalent to
recreate delete_store) or iterate segments to reapply mutations and call
finish_mutation() where needed so CollectionState.id_map and
CollectionState.delete_store are reconstructed after WAL replay (refer to
CollectionState, apply_wal_op, replay_writing_segment_wal, finish_mutation,
IdMap, and DeleteStore).

---

Nitpick comments:
In `@crates/garuda-engine/src/lib.rs`:
- Around line 224-243: The delete_by_filter function has a TOCTOU race: it calls
collect_matching_doc_ids after acquiring a read via read_state then drops the
state before calling delete(ids), so concurrent modifications (via delete or
inserts) can cause NotFound errors or missed docs; either enforce atomicity by
holding a write lock for the whole operation (acquire write state instead of
read_state and do not drop until after delete) or explicitly document the
current eventual-consistency behavior in a doc comment on delete_by_filter
mentioning collect_matching_doc_ids, delete, read_state and the dropped state so
callers know deletions are best-effort and may see NotFound or miss
concurrently-inserted matches.

In `@crates/garuda-engine/tests/test_collection_recovery_crash_semantics.rs`:
- Line 229: Avoid hardcoding the "0" WAL segment directory: replace the literal
"0" used when building wal_path (and the other occurrence) with the engine
constant WRITING_SEGMENT_ID converted to a string (e.g.
WRITING_SEGMENT_ID.to_string()) and ensure WRITING_SEGMENT_ID is imported into
the test so the path is derived from the canonical constant rather than a
hardcoded value.

In `@crates/garuda-storage/src/layout.rs`:
- Around line 148-155: The current sync_parent_directory function treats
path.parent() == Some(parent) as always valid, but parent can be an empty path
for single-component inputs; update sync_parent_directory to defensively check
for an empty parent (e.g. parent.as_os_str().is_empty() or
parent.components().next().is_none()) and return Ok(()) in that case before
calling sync_directory(parent); keep references to sync_parent_directory,
sync_directory and the path.parent() call so the check is added right after
obtaining parent.

In `@crates/garuda-storage/src/snapshots.rs`:
- Around line 46-49: The collected IDs from deleted_doc_ids can contain
duplicates; after collecting into ids and calling ids.sort_unstable() you should
canonicalize by removing duplicates (e.g., call ids.dedup()) so the delete
snapshot is set-like and produces stable bytes; update the code around the ids
variable (the collection, sort_unstable call and subsequent use) to perform
deduplication after sorting.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 53e1872f-ffb7-4174-82fc-6b6656dc3fd6

📥 Commits

Reviewing files that changed from the base of the PR and between f8040e8 and f72bb04.

📒 Files selected for processing (14)
  • crates/garuda-engine/src/bootstrap.rs
  • crates/garuda-engine/src/filter.rs
  • crates/garuda-engine/src/lib.rs
  • crates/garuda-engine/src/persistence.rs
  • crates/garuda-engine/src/state.rs
  • crates/garuda-engine/tests/test_collection_create_and_open_validation.rs
  • crates/garuda-engine/tests/test_collection_filter_behavior.rs
  • crates/garuda-engine/tests/test_collection_recovery_crash_semantics.rs
  • crates/garuda-engine/tests/test_storage_id_map.rs
  • crates/garuda-meta/src/lib.rs
  • crates/garuda-segment/src/codec.rs
  • crates/garuda-segment/src/lib.rs
  • crates/garuda-storage/src/layout.rs
  • crates/garuda-storage/src/snapshots.rs
💤 Files with no reviewable changes (1)
  • crates/garuda-engine/src/filter.rs

Comment on lines +34 to +35
WalOp::Insert(doc) => self.insert_doc(doc.clone(), WriteMode::Insert),
WalOp::Upsert(doc) => self.insert_doc(doc.clone(), WriteMode::Upsert),

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Keep upsert replay idempotency aligned with the normalized write path.

insert_doc() applies schema defaults before it decides what gets stored, but the upsert redundancy check still compares against the raw payload. A replayed upsert that only differs by defaulted fields will delete the current live record and append the same logical document again.

🩹 One way to fix this is to normalize before `live_doc_matches()` compares
 fn live_doc_matches(&self, doc: &Doc) -> bool {
-    let Some(record) = self.find_live_record(&doc.id) else {
+    let mut normalized = doc.clone();
+    apply_schema_defaults(&self.manifest.schema, &mut normalized);
+
+    let Some(record) = self.find_live_record(&normalized.id) else {
         return false;
     };
 
-    record.doc == *doc
+    record.doc == normalized
 }

Also applies to: 62-76

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/garuda-engine/src/state.rs` around lines 34 - 35, The upsert replay
path uses insert_doc(…, WriteMode::Upsert) which applies schema
defaults/normalization but the redundancy check (live_doc_matches) compares
against the raw payload, causing non-idempotent replays when only defaulted
fields differ; to fix, normalize the incoming WalOp::Upsert document (apply the
same schema defaults/normalization used by insert_doc) before calling
live_doc_matches so the comparison is between normalized forms, and ensure the
same change is applied for any other places that perform upsert redundancy
checks (e.g., the other live_doc_matches usage in the replay branch).

Comment on lines +19 to 22
entries: impl IntoIterator<Item = (String, u64)>,
) -> Result<(), Status> {
let mut ordered_entries = entries
.iter()
.map(|(doc_id, internal_doc_id)| (doc_id.as_str().to_string(), *internal_doc_id))
.collect::<Vec<_>>();
let mut ordered_entries = entries.into_iter().collect::<Vec<_>>();
ordered_entries.sort_by(|lhs, rhs| lhs.0.cmp(&rhs.0));

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Validate duplicate doc IDs before encoding id-map snapshots

Line 19 removes the uniqueness guarantee that &HashMap used to provide. With the new iterator input, duplicate keys can be written, then silently collapsed during read at Line 37 (HashMap::insert), which can drop mappings without error.

Suggested fix
 pub fn write_id_map_snapshot(
     root: &Path,
     snapshot_id: SnapshotId,
     entries: impl IntoIterator<Item = (String, u64)>,
 ) -> Result<(), Status> {
     let mut ordered_entries = entries.into_iter().collect::<Vec<_>>();
     ordered_entries.sort_by(|lhs, rhs| lhs.0.cmp(&rhs.0));
+    for pair in ordered_entries.windows(2) {
+        if pair[0].0 == pair[1].0 {
+            return Err(Status::err(
+                StatusCode::Internal,
+                format!("duplicate doc_id in id map snapshot input: {}", pair[0].0),
+            ));
+        }
+    }

     let bytes = encode_id_map(&ordered_entries)?;
     write_file_atomically(&id_map_snapshot_path(root, snapshot_id), &bytes)
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
entries: impl IntoIterator<Item = (String, u64)>,
) -> Result<(), Status> {
let mut ordered_entries = entries
.iter()
.map(|(doc_id, internal_doc_id)| (doc_id.as_str().to_string(), *internal_doc_id))
.collect::<Vec<_>>();
let mut ordered_entries = entries.into_iter().collect::<Vec<_>>();
ordered_entries.sort_by(|lhs, rhs| lhs.0.cmp(&rhs.0));
pub fn write_id_map_snapshot(
root: &Path,
snapshot_id: SnapshotId,
entries: impl IntoIterator<Item = (String, u64)>,
) -> Result<(), Status> {
let mut ordered_entries = entries.into_iter().collect::<Vec<_>>();
ordered_entries.sort_by(|lhs, rhs| lhs.0.cmp(&rhs.0));
for pair in ordered_entries.windows(2) {
if pair[0].0 == pair[1].0 {
return Err(Status::err(
StatusCode::Internal,
format!("duplicate doc_id in id map snapshot input: {}", pair[0].0),
));
}
}
let bytes = encode_id_map(&ordered_entries)?;
write_file_atomically(&id_map_snapshot_path(root, snapshot_id), &bytes)
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/garuda-storage/src/snapshots.rs` around lines 19 - 22, The new
iterator-based API can allow duplicate document IDs to be written; after
collecting and sorting ordered_entries (the entries variable -> ordered_entries
Vec), scan the sorted list for adjacent entries with identical keys and reject
the snapshot by returning an Err(Status::invalid_argument(...)) that includes
the duplicate key before any encoding/writing occurs (i.e., before the code path
that uses HashMap::insert). This preserves the previous uniqueness guarantee and
prevents silent collapse on read; include the key in the error message for
debugging.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: e230ba6b10

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread crates/garuda-engine/src/lib.rs Outdated
Comment on lines +231 to +234
let ids = collect_matching_doc_ids(&state, &filter);
drop(state);

for result in self.delete(ids) {

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Make delete_by_filter atomic with respect to writers

When the collection has concurrent writers, this two-phase implementation can delete the wrong document set: it snapshots matching DocIds under a read lock, drops the lock, and then resolves those ids again inside delete(). If another thread upserts/reinserts the same id in between, delete_by_filter will delete the replacement row even when it no longer matches the filter, and it can also miss rows that started matching after the read phase. Because Collection is cloneable and concurrent access is already exercised elsewhere in this crate, callers can observe incorrect deletes in production.

Useful? React with 👍 / 👎.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (1)
crates/garuda-engine/src/state.rs (1)

242-248: ⚠️ Potential issue | 🟠 Major

Normalize before upsert redundancy comparison.

Line 242 compares the raw incoming doc, while insert_doc normalizes with schema defaults first. This still makes upsert WAL replay non-idempotent when only defaulted fields differ (downstream at Line 54).

Suggested fix
 fn live_doc_matches(&self, doc: &Doc) -> bool {
-    let Some(record) = self.find_live_record(&doc.id) else {
+    let mut normalized = doc.clone();
+    apply_schema_defaults(&self.manifest.schema, &mut normalized);
+
+    let Some(record) = self.find_live_record(&normalized.id) else {
         return false;
     };
 
-    record.doc == *doc
+    record.doc == normalized
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/garuda-engine/src/state.rs` around lines 242 - 248, live_doc_matches
currently compares the raw incoming Doc against the stored record, causing
non-idempotent upsert replay when schema defaults differ; change
live_doc_matches to normalize the incoming doc the same way insert_doc does
(apply schema defaults/normalization) before comparing to record.doc by calling
the normalization routine used in insert_doc (or reuse the same helper used
during upsert/insert) so comparisons are made on the canonical, defaulted form;
ensure you reference live_doc_matches, insert_doc, and find_live_record when
locating where to apply the normalization.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@crates/garuda-engine/src/state.rs`:
- Around line 242-248: live_doc_matches currently compares the raw incoming Doc
against the stored record, causing non-idempotent upsert replay when schema
defaults differ; change live_doc_matches to normalize the incoming doc the same
way insert_doc does (apply schema defaults/normalization) before comparing to
record.doc by calling the normalization routine used in insert_doc (or reuse the
same helper used during upsert/insert) so comparisons are made on the canonical,
defaulted form; ensure you reference live_doc_matches, insert_doc, and
find_live_record when locating where to apply the normalization.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 4e78aa6d-46b3-4236-93f8-29c704dbc743

📥 Commits

Reviewing files that changed from the base of the PR and between f72bb04 and b874554.

📒 Files selected for processing (15)
  • crates/garuda-engine/src/bootstrap.rs
  • crates/garuda-engine/src/filter.rs
  • crates/garuda-engine/src/lib.rs
  • crates/garuda-engine/src/persistence.rs
  • crates/garuda-engine/src/state.rs
  • crates/garuda-engine/tests/test_collection_create_and_open_validation.rs
  • crates/garuda-engine/tests/test_collection_filter_behavior.rs
  • crates/garuda-engine/tests/test_collection_recovery_crash_semantics.rs
  • crates/garuda-engine/tests/test_storage_id_map.rs
  • crates/garuda-meta/src/lib.rs
  • crates/garuda-segment/src/codec.rs
  • crates/garuda-segment/src/lib.rs
  • crates/garuda-storage/src/layout.rs
  • crates/garuda-storage/src/snapshots.rs
  • garuda/src/main.rs
💤 Files with no reviewable changes (1)
  • crates/garuda-engine/src/filter.rs
✅ Files skipped from review due to trivial changes (1)
  • crates/garuda-engine/tests/test_collection_filter_behavior.rs
🚧 Files skipped from review as they are similar to previous changes (8)
  • crates/garuda-engine/tests/test_collection_create_and_open_validation.rs
  • crates/garuda-engine/src/persistence.rs
  • crates/garuda-storage/src/layout.rs
  • crates/garuda-segment/src/lib.rs
  • crates/garuda-segment/src/codec.rs
  • crates/garuda-engine/tests/test_storage_id_map.rs
  • crates/garuda-engine/src/lib.rs
  • crates/garuda-storage/src/snapshots.rs

devin-ai-integration[bot]

This comment was marked as resolved.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: b874554a34

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread garuda/src/main.rs
Comment on lines +45 to 47
Commands::Init => {
println!("{}", root.display());
Ok(())

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Restore root creation/validation in init

Because Database::open(&root) is now only called inside create and stats, the init branch no longer creates or validates the workspace root at all; it just prints the path and exits successfully. On a nonexistent or unwritable --root, garuda init now reports success even though the workspace was never initialized, so any automation using init as a provisioning/readiness step will get a false positive and fail later on the first real command.

Useful? React with 👍 / 👎.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (2)
garuda/src/main.rs (2)

44-45: Consider centralizing DB open + contextual error text.

Database::open(&root).map_err(|status| status.message)? is duplicated and loses operation context. A small helper would reduce repetition and make failures clearer.

♻️ Proposed refactor
 fn main() -> Result<(), String> {
     let Cli { root, command } = Cli::parse();

     match command {
@@
         Commands::Create { name, dimension } => {
-            let db = Database::open(&root).map_err(|status| status.message)?;
+            let db = open_db(&root)?;
@@
         Commands::Stats { name } => {
-            let db = Database::open(&root).map_err(|status| status.message)?;
+            let db = open_db(&root)?;
             let collection = db.open_collection(&name).map_err(|status| status.message)?;
@@
 }
+
+fn open_db(root: &PathBuf) -> Result<Database, String> {
+    Database::open(root)
+        .map_err(|status| format!("failed to open database at {}: {}", root.display(), status.message))
+}

Also applies to: 67-68

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@garuda/src/main.rs` around lines 44 - 45, Centralize the repeated
Database::open(&root).map_err(|status| status.message)? pattern by adding a
small helper (e.g., open_database_with_context or Database::open_with_context)
that calls Database::open(&root), maps the error to a more descriptive message
(including operation context like "opening DB at root" or the caller name) and
returns the Database or the mapped error; then replace the direct calls in the
places that currently call Database::open(&root).map_err(|status|
status.message)? (references: Database::open and the CollectionSchema
initialization sites) with calls to this new helper to avoid duplication and
provide contextual error text.

81-83: Consider returning Result for defensive programming.

field_name currently panics on invalid input, which is safe for the constant values it receives (PRIMARY_KEY_FIELD and VECTOR_FIELD). However, since the function is not exported and could be reused with runtime-provided input in the future, returning Result<FieldName, String> would prevent panics. This is optional but recommended if the function scope expands.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@garuda/src/main.rs` around lines 81 - 83, Change field_name to return a
Result<FieldName, String> instead of panicking: replace the current expect-based
implementation so it calls FieldName::parse(value) and maps the parse error into
a descriptive String (e.g. with map_err or to_string). Update all internal call
sites (e.g. places that pass PRIMARY_KEY_FIELD and VECTOR_FIELD into field_name)
to handle the Result (unwrap or propagate with ? where appropriate) so existing
behavior remains unchanged while preventing a hard panic if field_name is ever
called with runtime input.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@garuda/src/main.rs`:
- Around line 44-45: Centralize the repeated
Database::open(&root).map_err(|status| status.message)? pattern by adding a
small helper (e.g., open_database_with_context or Database::open_with_context)
that calls Database::open(&root), maps the error to a more descriptive message
(including operation context like "opening DB at root" or the caller name) and
returns the Database or the mapped error; then replace the direct calls in the
places that currently call Database::open(&root).map_err(|status|
status.message)? (references: Database::open and the CollectionSchema
initialization sites) with calls to this new helper to avoid duplication and
provide contextual error text.
- Around line 81-83: Change field_name to return a Result<FieldName, String>
instead of panicking: replace the current expect-based implementation so it
calls FieldName::parse(value) and maps the parse error into a descriptive String
(e.g. with map_err or to_string). Update all internal call sites (e.g. places
that pass PRIMARY_KEY_FIELD and VECTOR_FIELD into field_name) to handle the
Result (unwrap or propagate with ? where appropriate) so existing behavior
remains unchanged while preventing a hard panic if field_name is ever called
with runtime input.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: e09b6dd7-e14e-4e3d-80e5-ce73870bb888

📥 Commits

Reviewing files that changed from the base of the PR and between b874554 and 005d669.

📒 Files selected for processing (2)
  • garuda/src/main.rs
  • garuda/tests/test_cli_contract.rs
💤 Files with no reviewable changes (1)
  • garuda/tests/test_cli_contract.rs

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 005d669559

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread garuda/src/main.rs
root: PathBuf,
#[command(subcommand)]
command: Option<Commands>,
command: Commands,

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Restore the doctor/no-arg health-check command

Making the subcommand mandatory here, together with removing Doctor from the enum below, turns both garuda-cli and garuda-cli doctor from successful health checks into clap usage errors. The deleted doctor_should_exit_successfully_for_a_healthy_workspace test in garuda/tests/test_cli_contract.rs shows this was part of the existing CLI contract, so any scripts or readiness probes using the current health-check entrypoint will start failing even when the workspace is fine.

Useful? React with 👍 / 👎.

@chatgpt-codex-connector

Copy link
Copy Markdown

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.
To continue using code reviews, you can upgrade your account or add credits to your account and enable them for code reviews in your settings.

1 similar comment
@chatgpt-codex-connector

Copy link
Copy Markdown

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.
To continue using code reviews, you can upgrade your account or add credits to your account and enable them for code reviews in your settings.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
crates/garuda-engine/src/lib.rs (1)

359-374: Consider using iterator combinators for conciseness.

The logic is correct. For improved readability, you could simplify with iterator combinators:

♻️ Optional: Use iterator combinators
 fn collect_matching_doc_ids(
     state: &CollectionState,
     filter: &garuda_types::FilterExpr,
 ) -> Vec<DocId> {
-    let mut ids = Vec::new();
-
-    for record in state.all_live_records() {
-        if !evaluate_filter(filter, &record.doc.fields) {
-            continue;
-        }
-
-        ids.push(record.doc.id);
-    }
-
-    ids
+    state
+        .all_live_records()
+        .filter(|record| evaluate_filter(filter, &record.doc.fields))
+        .map(|record| record.doc.id.clone())
+        .collect()
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/garuda-engine/src/lib.rs` around lines 359 - 374, The function
collect_matching_doc_ids is using an explicit for loop to filter and collect doc
IDs; replace the loop with iterator combinators for conciseness by calling
state.all_live_records().into_iter().filter(|record| evaluate_filter(filter,
&record.doc.fields)).map(|record| record.doc.id).collect::<Vec<DocId>>() (or
equivalent using filter_map) so the logic stays the same but is expressed
declaratively; keep references to evaluate_filter, state.all_live_records(), and
record.doc.id when making the change.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@crates/garuda-engine/src/lib.rs`:
- Around line 359-374: The function collect_matching_doc_ids is using an
explicit for loop to filter and collect doc IDs; replace the loop with iterator
combinators for conciseness by calling
state.all_live_records().into_iter().filter(|record| evaluate_filter(filter,
&record.doc.fields)).map(|record| record.doc.id).collect::<Vec<DocId>>() (or
equivalent using filter_map) so the logic stays the same but is expressed
declaratively; keep references to evaluate_filter, state.all_live_records(), and
record.doc.id when making the change.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: d1f2174e-5281-4f4b-b1c0-39d644dbad10

📥 Commits

Reviewing files that changed from the base of the PR and between 822b7b8 and 66239de.

⛔ Files ignored due to path filters (1)
  • scorecard.png is excluded by !**/*.png
📒 Files selected for processing (9)
  • crates/garuda-engine/src/lib.rs
  • crates/garuda-engine/src/query.rs
  • crates/garuda-engine/src/schema.rs
  • crates/garuda-engine/src/validation.rs
  • crates/garuda-segment/src/codec.rs
  • crates/garuda-segment/src/wal.rs
  • crates/garuda-storage/src/layout.rs
  • garuda/src/main.rs
  • garuda/tests/test_cli_contract.rs
✅ Files skipped from review due to trivial changes (1)
  • garuda/tests/test_cli_contract.rs
🚧 Files skipped from review as they are similar to previous changes (3)
  • crates/garuda-storage/src/layout.rs
  • crates/garuda-segment/src/codec.rs
  • garuda/src/main.rs

@chatgpt-codex-connector

Copy link
Copy Markdown

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.
To continue using code reviews, you can upgrade your account or add credits to your account and enable them for code reviews in your settings.

@PhantomInTheWire PhantomInTheWire merged commit 81bcf83 into master Mar 19, 2026
2 checks passed
@PhantomInTheWire PhantomInTheWire deleted the feat/storage branch March 19, 2026 12:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant