Skip to content

feat(multi-hop): implement 3-stage query decomposition and parallel retrieval#3

Merged
nehraa merged 11 commits into
mainfrom
feat/gemini-onnx-embedding-providers
Apr 22, 2026
Merged

feat(multi-hop): implement 3-stage query decomposition and parallel retrieval#3
nehraa merged 11 commits into
mainfrom
feat/gemini-onnx-embedding-providers

Conversation

@nehraa

@nehraa nehraa commented Apr 8, 2026

Copy link
Copy Markdown
Owner

Multi-Hop Retrieval with Query Decomposition

Problem

Current retrieval is single-node greedy. A broad query like "explain the mixnet key exchange and header-only forwarding" retrieves one closest node (usually a struct definition), and the LLM never sees related functions in other files.

Solution: 3-Stage Retrieval

Stage 1 — Decompose (LLM)

  • Heuristic classifier (shouldDecompose) scores questions on conjunctions, question marks, length, and keywords
  • LLM breaks complex questions into 2-5 focused sub-questions
  • Graceful fallback if LLM unavailable

Stage 2 — Parallel Retrieve (Vector + Graph)

  • Each sub-question retrieves independently via Promise.all
  • Vector search + lexical search + graph traversal per sub-question
  • Deduplication by nodeId (same function in multiple sub-questions → one result)

Stage 3 — Synthesize (LLM)

  • Context assembly with per-sub-question metadata and section boundaries
  • Synthesis prompt instructs LLM to address each sub-question then unify

New Files

File Purpose
src/retrieval/decompose.ts Heuristic classifier + LLM decomposition
src/retrieval/multi-hop.ts Parallel retrieve, deduplication, graph expansion
src/llm/multi-hop-context-builder.ts Multi-hop context assembly with budgeting
src/test/decompose.test.ts 9 tests for heuristic classifier
src/test/multi-hop.test.ts 3 tests for deduplication and metadata

Config

{
  "multiHop": {
    "enabled": true,
    "minQuestionLength": 25,
    "maxSubQuestions": 5,
    "expansionDepth": 1
  }
}

Usage

# CLI
coderag query "How does the mixnet handle key exchange and header-only forwarding?" --multi-hop

# HTTP  \nPOST /v1/query { "question": "...", "multiHop": true }

# MCP
query(question="...", multiHop=true)

Tests

  • 214 tests pass across 27 test files
  • TypeScript type-check clean

Summary by CodeRabbit

  • New Features

    • Multi-language repository analysis (Go, Python, C, C++, Rust, JavaScript/TypeScript)
    • HTTP REST API server with query, lookup, indexing, and health endpoints
    • Interactive setup wizard for initial configuration
    • Multi-hop retrieval to decompose complex questions into sub-questions
    • ONNX-based local embeddings (alternative to Gemini)
    • Git post-commit hook integration for automatic indexing
    • Configuration file support with environment variable overrides
  • Bug Fixes

    • Improved authentication security with timing-safe token comparison
    • Enhanced git hook command safety
  • Documentation

    • Updated with multi-language support and HTTP server details

nehraa and others added 10 commits April 6, 2026 18:43
Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
- Replace analyzeTypeScriptRepo with analyzeRepo (codeflow-core 1.0.2)
- Remove ts-morph dependency; all AST analysis now via tree-sitter
- Extract sourceSpans and callSites directly from analyzeRepo result
- Expand exclusion list: .git, build, target, __pycache__, vendor, .venv
- Add multi-language keywords to package.json
- Update README to reflect supported languages
- Rewrite tests to verify tree-sitter source spans and line numbers
- Fix git-hook test assertion for shell-quoted config paths

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
- Add dist/bin/coderag.js entry point with proper process.argv handling
- Fix dynamic import resolution for global npm installations
- Update package.json bin to point to dist/bin/coderag.js
- Bump version to 1.0.1

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
…ive LanceDB delete/upsert

- Remove unnecessary ?? 0 null coalescing in meanPool and embedBatch hot loops
  (replaced with explicit ! assertions for typed Float32Array access)
- Reduce ONNX batch size from 32 to 8 to avoid memory pressure during inference
- Add progress logging throughout embedding pipeline
- Parallelize batch embedding with Promise.all instead of sequential await
- Use native LanceDB table.delete() instead of full table reload on deletes
- Use delete+add pattern for upsert instead of loading all rows into memory

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
…ops, replaced with explicit !

       assertions (same speed, less overhead)
     - Batch size reduced: 32 → 8 to avoid memory pressure during inference
     - Progress logging: Shows embedding progress throughout indexing
     - Parallel batches: Uses Promise.all instead of sequential await
     - Native LanceDB: Uses table.delete() instead of loading all rows and rewriting the entire table
- Add .qwen/, .serena/, and *.tgz to .gitignore
- Remove all previously tracked .qwen reasoning quality-gate reports
- Remove .serena configuration files from git tracking
- Remove published .tgz package from repository

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
…rtable MCP discovery

- Chunked embedding processing: processes nodes in chunks to avoid
  holding all documents in memory at once
- Embedding text truncation: caps at 2048 chars (models cap at ~512
  tokens anyway, extra chars waste memory)
- Sequential ONNX inference: batch providers now process sequentially
  instead of Promise.all to avoid OOM from parallel WASM inference
- ONNX batch size: 8 → 1 to minimize memory pressure
- ONNX model: switched to Xenova/all-MiniLM-L6-v2 (better quality)
- ONNX WASM threads: limited to 1 to reduce memory pressure
- ONNX remote: allowRemoteModels always enabled (downloads if missing)
- MCP discovery script: resolves CLI from global npm package, PATH,
  or git repo fallback — no hardcoded absolute paths
- package.json: version bump 1.0.1 → 1.0.3

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
…etrieval

Stage 1 (Decompose):
- decompose.ts: shouldDecompose() heuristic classifier scores questions
  on conjunctions, question marks, length, and keywords (score >= 2)
- decomposeQuestion() asks LLM to break complex questions into 2-5
  sub-questions with JSON response parsing and markdown fence stripping
- decomposeQuestionWithFallback() full pipeline with config gating

Stage 2 (Parallel Retrieve):
- multi-hop.ts: parallelRetrieve() runs vector+lexical search + graph
  traversal for each sub-question concurrently via Promise.all
- deduplicateAndMerge() merges results by nodeId, first occurrence wins
- multiHopRetrieve() end-to-end pipeline returning MultiHopRetrievalResult

Stage 3 (Synthesize):
- multi-hop-context-builder.ts: builds ContextPackage from multi-hop
  results with context budgeting, promotes first node to primary
- prompt.ts: buildMultiHopMessages() synthesis prompt with per-sub-
  question sections, instructs LLM to address each then unify

Integration:
- CodeRag.query(): gating via options.multiHop + config.multiHop.enabled
- CLI: --multi-hop flag on query command
- HTTP: multiHop field in POST /v1/query request body
- MCP: multiHop input on query tool
- Config: multiHop section with enabled, minQuestionLength,
  maxSubQuestions, expansionDepth + env var overrides

Types:
- MultiHopConfig schema, RetrievalMode type, DecompositionResult,
  MultiHopRetrievalResult, extended ContextPackage/QueryResult/
  RetrievedNodeContext with subQuestions, subQuestionResults,
  subQuestionIndex

Tests:
- decompose.test.ts: 9 tests for heuristic classifier
- multi-hop.test.ts: 3 tests for deduplication and metadata

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
Copilot AI review requested due to automatic review settings April 8, 2026 03:47
@coderabbitai

coderabbitai Bot commented Apr 8, 2026

Copy link
Copy Markdown
📝 Walkthrough

Walkthrough

This PR introduces a multi-language codebase analysis service layer with HTTP API, ONNX-based embeddings, multi-hop query decomposition, and interactive setup wizard. It replaces TypeScript-specific analysis with tree-sitter-backed multi-language support, adds configurable LLM integration, and provides both MCP and HTTP server interfaces.

Changes

Cohort / File(s) Summary
Core Service Layer
src/service/coderag.ts, src/service/config.ts, src/service/http.ts, src/cli/setup-wizard.ts
Introduces CodeRag service managing repository indexing/querying with state hydration and concurrent job coalescing; adds environment-aware configuration loading with .env parsing and runtime provider resolution; creates HTTP API server with JSON validation, auth, and error mapping; implements interactive setup wizard for configuration and git hook installation.
Multi-Hop Retrieval
src/retrieval/decompose.ts, src/retrieval/multi-hop.ts, src/llm/multi-hop-context-builder.ts, src/llm/prompt.ts
Adds question decomposition heuristic and LLM-based sub-question generation; implements parallel multi-hop retrieval with node deduplication and dependency expansion; provides multi-hop context assembly with context budgeting; builds decomposition-aware LLM prompts.
Embedding Providers
src/indexer/onnx-embedder.ts, src/indexer/embedder.ts, src/indexer/gemini-embedder.ts
Introduces ONNX-based embeddings via Xenova transformers with lazy initialization and local model caching; adds model property and embedBatch to local-hash provider; updates Gemini defaults to models/gemini-embedding-001, adds resolveGeminiApiKey for dual alias support (CODERAG_GEMINI_API_KEY/CODERAG_GEMINI_AI_KEY).
Graph & Document Indexing
src/adapters/codeflow-core.ts, src/indexer/documents.ts, src/indexer/indexer.ts, src/indexer/git-hook.ts
Replaces ts-morph AST utilities with direct codeflow-core multi-language analysis; adds chunked document embedding with GC and progress logging; updates manifest schema to include embedding dimensions and version; adds safe shell quoting and git hook installation check; tracks embedding fingerprints for mismatch detection.
Storage & Persistence
src/store/vector-store.ts, src/store/manifest-store.ts
Optimizes vector-store delete/upsert to use server-side filtering and avoids overfetching; adds Zod validation for loaded manifest/snapshot/documents; throws on metadata parse errors instead of swallowing them.
CLI & Entrypoints
src/bin/coderag.ts, src/cli.ts, scripts/coderag-mcp-discover.js
Adds new executable entrypoint with dynamic CLI loading; integrates setup command and validates --depth argument; adds --multi-hop flag forwarding; implements MCP discovery script for config resolution and index initialization.
Type System & Configuration
src/types.ts, coderag.config.json, package.json, .env.example
Extends schema with multi-hop config, ONNX/Gemini options, retrieval modes, and embedding dimensions; adds Zod schemas for persistence validation; updates package version to 1.0.3, adds multi-language keywords, replaces ts-morph with @xenova/transformers; documents new embedding providers and LLM configuration.
Testing
src/test/*.test.ts, vitest.config.ts
Comprehensive test suite covering CLI, config loading, embedders (Gemini/ONNX), multi-hop decomposition/retrieval, document indexing, git hooks, HTTP endpoints, vector/manifest stores, and search ranking; adds 20s test timeout.
Documentation & Project
README.md, AGENTS.md, prd/changes/*.md
Updates README with multi-language support, new CLI command, environment variables (embedding/ONNX/Gemini/LLM/multi-hop), HTTP server details; adds rule requiring future-ready features to be gated by config flags; auto-generated PRD sync records.

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant HTTPServer as HTTP Server
    participant CodeRag
    participant Indexer
    participant LLM as LLM Transport
    participant VectorStore

    Client->>HTTPServer: POST /v1/query { question, multiHop? }
    HTTPServer->>CodeRag: query(question, { multiHop })
    
    alt multiHop enabled
        CodeRag->>CodeRag: decomposeQuestionWithFallback(question)
        CodeRag->>VectorStore: parallelRetrieve(subQuestions...)
        VectorStore-->>CodeRag: SubQuestionRetrievalResult[]
        CodeRag->>CodeRag: deduplicateAndMerge(results)
        CodeRag->>CodeRag: buildMultiHopContextPackage(...)
        CodeRag->>CodeRag: buildMultiHopMessages(context)
    else single retrieval
        CodeRag->>VectorStore: searchDocuments + rerankResults
        VectorStore-->>CodeRag: SearchResult[]
        CodeRag->>CodeRag: traverseDependencies(primaryNode)
        CodeRag->>CodeRag: buildContextPackage(...)
        CodeRag->>CodeRag: buildMessages(context)
    end

    CodeRag->>LLM: generate(messages)
    LLM-->>CodeRag: answer
    CodeRag->>HTTPServer: QueryResult { answer, context, retrievalMode }
    HTTPServer->>Client: 200 JSON response
Loading
sequenceDiagram
    participant User
    participant CLI as CLI Setup Wizard
    participant Readline
    participant FileSystem
    participant GitHook as Git Hook

    User->>CLI: coderag setup
    CLI->>Readline: Interactive prompts
    Readline->>User: Choose embedding provider (gemini/onnx/local-hash)
    User-->>Readline: Selection
    Readline->>User: Enable LLM? (ask for model/apiKey)
    User-->>Readline: Response
    Readline->>User: Repo path / storage root?
    User-->>Readline: Paths
    CLI->>FileSystem: Write coderag.config.json
    CLI->>FileSystem: Write .env (if API keys provided)
    CLI->>GitHook: installPostCommitHook(repoPath, configPath)
    GitHook->>FileSystem: Create .git/hooks/post-commit with CODERAG_MARKER
    GitHook-->>CLI: Installed
    CLI->>User: Setup complete
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Poem

🐰 Whiskers twitching with delight
Multi-hop questions now take flight,
ONNX embeddings, local and free,
An HTTP server serves with glee,
Setup wizard hops, git hooks installed right! 🥕✨

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/gemini-onnx-embedding-providers

@qodo-code-review

Copy link
Copy Markdown

Review Summary by Qodo

Multi-hop retrieval with ONNX/Gemini embeddings, setup wizard, and enhanced configuration system

✨ Enhancement 🧪 Tests

Grey Divider

Walkthroughs

Description
• **Multi-hop retrieval pipeline**: Implements 3-stage query decomposition and parallel retrieval
  with LLM-based question decomposition, concurrent vector/graph search, and synthesis
• **ONNX embedding provider**: Adds local neural embeddings using Xenova transformers with
  lazy-loading, batch support, and memory-efficient chunked processing
• **Gemini embedding enhancements**: Updates model to gemini-embedding-001, adds API key alias
  support, improves batch request building with outputDimensionality
• **Interactive setup wizard**: New CLI wizard for configuration with embedding provider selection
  (local-hash, ONNX, Gemini) and LLM endpoint auto-detection (OpenRouter, OpenAI, Anthropic)
• **Configuration system**: .env file parsing with environment variable overrides, LLM provider
  auto-detection, and multi-hop configuration options
• **Vector store optimization**: SQL query optimization for node lookups, native delete operations,
  and improved error handling with IndexingError
• **Embedding fingerprint validation**: Full fingerprint comparison (provider, model, dimensions)
  with manifest storage and mismatch detection
• **Git hook improvements**: Shell escaping for POSIX safety, installation detection, and
  auto-installation after successful indexing
• **HTTP service enhancements**: Timing-safe authentication comparison, readiness probe with index
  validation, and multiHop query parameter support
• **MCP server auto-indexing**: Auto-reindex on startup if needed and model mismatch detection with
  auto-remediation
• **Type system expansion**: Comprehensive Zod schemas for persisted types, multi-hop retrieval
  types, and extended EmbeddingProvider interface with batch support
• **Comprehensive test coverage**: 202+ tests across 25+ test files covering decomposition,
  multi-hop retrieval, embedding providers, configuration, HTTP service, and vector store operations
Diagram
flowchart LR
  Query["Query with<br/>multiHop flag"]
  Decompose["Stage 1:<br/>Decompose"]
  SubQuestions["2-5 Sub-questions"]
  ParallelRetrieve["Stage 2:<br/>Parallel Retrieve"]
  VectorSearch["Vector +<br/>Lexical Search"]
  GraphExpand["Graph<br/>Expansion"]
  Deduplicate["Deduplication<br/>& Merge"]
  Synthesize["Stage 3:<br/>Synthesize"]
  ContextBuilder["Multi-hop<br/>Context Builder"]
  LLMSynthesis["LLM<br/>Synthesis"]
  Result["Unified<br/>Answer"]
  
  EmbeddingProviders["Embedding Providers"]
  LocalHash["local-hash"]
  ONNX["ONNX<br/>Xenova"]
  Gemini["Gemini<br/>API"]
  
  SetupWizard["Setup Wizard"]
  ConfigFile["coderag.config.json"]
  EnvFile[".env"]
  
  Query --> Decompose
  Decompose --> SubQuestions
  SubQuestions --> ParallelRetrieve
  ParallelRetrieve --> VectorSearch
  ParallelRetrieve --> GraphExpand
  VectorSearch --> Deduplicate
  GraphExpand --> Deduplicate
  Deduplicate --> ContextBuilder
  ContextBuilder --> Synthesize
  Synthesize --> LLMSynthesis
  LLMSynthesis --> Result
  
  EmbeddingProviders --> LocalHash
  EmbeddingProviders --> ONNX
  EmbeddingProviders --> Gemini
  
  SetupWizard --> ConfigFile
  SetupWizard --> EnvFile
Loading

Grey Divider

File Changes

1. src/test/config.test.ts 🧪 Tests +258/-1

Configuration loading with .env support and LLM auto-detection

• Added support for .env file parsing with validation for environment variable overrides
• Extended envKeys array to include CODERAG_GEMINI_AI_KEY, OPENROUTER_API_KEY,
 OPENAI_API_KEY, and ANTHROPIC_API_KEY
• Added 9 new test cases covering .env loading, API key aliases, and LLM provider auto-detection
 (OpenRouter, OpenAI, Anthropic)
• Tests verify explicit baseUrl takes precedence over auto-detection and that transport is not
 created when LLM is disabled

src/test/config.test.ts


2. src/adapters/codeflow-core.ts ✨ Enhancement +56/-235

Refactor to use codeflow-core multi-language analyzer

• Simplified adapter to use analyzeRepo instead of analyzeTypeScriptRepo for multi-language
 support
• Removed 190+ lines of TypeScript-specific AST parsing logic (buildProject, resolveNodeAst,
 getDeclarationKey, buildCallSiteMap)
• Delegated source span and call site extraction directly to codeflow-core's analyzeRepo result
• Added fallback for non-codeflow providers to return empty sourceSpans and callSites

src/adapters/codeflow-core.ts


3. src/indexer/documents.ts ✨ Enhancement +202/-32

Memory-efficient chunked embedding with batch support

• Added MAX_EMBEDDING_CHARS constant (2048) to truncate embedding text and reduce memory usage
• Implemented embedPreparedDocuments function with batch and sequential embedding support
• Added chunked processing with garbage collection every 100 chunks to prevent OOM with ONNX/WASM
• Refactored buildIndexedDocuments to process nodes in chunks, supporting both batch and
 sequential embedding providers
• Exported INDEX_SCHEMA_VERSION (bumped to 2) and added resolveEmbeddingMetadata helper
• Updated buildIndexManifest to include embeddingDimensions in manifest

src/indexer/documents.ts


View more (132)
4. src/cli/setup-wizard.ts ✨ Enhancement +251/-0

Interactive setup wizard for configuration and API keys

• New interactive CLI wizard for CodeRag configuration with readline-based prompts
• Supports selection of embedding providers (local-hash, onnx, gemini) and LLM endpoints
 (OpenRouter, OpenAI, Anthropic, custom)
• Auto-detects existing config files and environment variables for API keys
• Generates coderag.config.json and .env files with secure permissions (0o600)
• Auto-installs post-commit hook after configuration

src/cli/setup-wizard.ts


5. src/types.ts ✨ Enhancement +192/-6

Type definitions for multi-hop retrieval and ONNX support

• Added onnx to embeddingProviderKindSchema enum
• Added multiHopConfigSchema with fields for decomposition control (enabled, minQuestionLength,
 maxSubQuestions, expansionDepth)
• Added onnxModelDir to embeddingConfigSchema
• Added comprehensive Zod schemas for persisted types: sourceSpanSchema, callSiteSchema,
 indexedNodeDocumentSchema, graphSnapshotSchema, indexManifestSchema,
 vectorStoreMetadataSchema
• Extended EmbeddingProvider interface with model, maxBatchSize, and embedBatch method
• Added RetrievalMode type and extended QueryOptions, ContextPackage, QueryResult with
 multi-hop fields
• Added DecompositionResult and MultiHopRetrievalResult interfaces

src/types.ts


6. src/test/gemini-embedder.test.ts 🧪 Tests +228/-0

Comprehensive tests for Gemini embedding provider

• New test suite with 16 tests for GeminiEmbeddingProvider covering single and batch embedding
• Tests verify API key resolution (canonical and alias), model configuration, timeout handling, and
 error cases
• Validates batch size limits (100 max), response validation, and mismatched batch responses
• Tests both single embed() and batch embedBatch() methods with mocked fetch

src/test/gemini-embedder.test.ts


7. src/service/config.ts ✨ Enhancement +120/-12

Configuration resolution with .env parsing and LLM auto-detection

• Implemented .env file parsing with parseDotEnv and loadDotEnv functions supporting quoted
 values and escape sequences
• Added resolveGeminiApiKey import for API key alias support
• Added OnnxEmbeddingProvider instantiation when provider is onnx
• Implemented LLM provider auto-detection: checks OPENROUTER_API_KEY, OPENAI_API_KEY,
 ANTHROPIC_API_KEY to set baseUrl and transport
• Added validation that multiHop.expansionDepth does not exceed traversal.maxDepth
• Extended resolveRuntimeConfig to return llm config and configPath

src/service/config.ts


8. src/service/coderag.ts ✨ Enhancement +107/-22

Multi-hop retrieval integration with decomposition and synthesis

• Added multi-hop retrieval path: queryMultiHop method implementing 3-stage pipeline (decompose,
 parallel retrieve, synthesize)
• Integrated decomposeQuestionWithFallback, multiHopRetrieve, and buildMultiHopContextPackage
• Modified query to decide between single and multi-hop based on options.multiHop, config
 enablement, and LLM availability
• Updated status to use manifest for embedding metadata instead of vector store
• Refactored runIndex to runIndexJob for flexibility
• Updated reindex to accept options object with full and docsPath fields

src/service/coderag.ts


9. src/test/codeflow-core.test.ts 🧪 Tests +39/-105

Tests updated for multi-language graph provider

• Updated test expectations to reflect multi-language tree-sitter support
• Removed tests for TypeScript-specific AST functions (getDeclarationKey, resolveNodeAst)
• Simplified test to verify source spans and call sites are populated for multi-language repos
• Updated custom provider test to expect empty sourceSpans and callSites (only codeflow-core
 populates these)
• Added test verifying source span line numbers are valid

src/test/codeflow-core.test.ts


10. src/indexer/indexer.ts ✨ Enhancement +65/-23

Embedding fingerprint validation and auto-hook installation

• Added INDEX_SCHEMA_VERSION import and formatEmbeddingFingerprint helper
• Updated checkEmbeddingModelMismatch to compare full embedding fingerprint (provider, model,
 dimensions) from manifest
• Refactored reindex to accept options object and support incremental vs. full reindex modes
• Added ensurePostCommitHook method to auto-install git hook after successful indexing
• Updated manifest metadata storage to include embeddingDimensions
• Added logger parameter to buildIndexedDocuments call

src/indexer/indexer.ts


11. src/llm/multi-hop-context-builder.ts ✨ Enhancement +163/-0

Multi-hop context package builder with budgeting

• New module implementing multi-hop context assembly from retrieval results
• Builds ContextPackage with per-sub-question metadata and section boundaries
• Implements context budgeting to fit all nodes within maxContextChars
• Promotes first retrieved node to "primary" for display; remaining nodes are "related"
• Generates graph summary with sub-question breakdown and file references

src/llm/multi-hop-context-builder.ts


12. src/indexer/onnx-embedder.ts ✨ Enhancement +146/-0

ONNX embedding provider with local neural models

• New OnnxEmbeddingProvider class for local neural embeddings using Xenova transformers
• Implements lazy-loading of ONNX pipeline with model caching and single-threaded WASM execution
• Supports both single embed() and batch embedBatch() methods with mean pooling
• Validates required model files and auto-downloads from Hugging Face if needed
• Prevents model directory changes after initialization

src/indexer/onnx-embedder.ts


13. src/retrieval/multi-hop.ts ✨ Enhancement +159/-0

Parallel multi-hop retrieval with deduplication

• New module implementing parallel multi-hop retrieval pipeline
• retrieveForSubQuestion runs vector + lexical search and graph expansion per sub-question
• parallelRetrieve executes all sub-questions concurrently via Promise.alldeduplicateAndMerge deduplicates nodes by ID and tracks retrieval metadata per sub-question
• multiHopRetrieve orchestrates full pipeline returning deduplicated nodes and metadata

src/retrieval/multi-hop.ts


14. src/retrieval/decompose.ts ✨ Enhancement +130/-0

Query decomposition with heuristic and LLM fallback

• New module for query decomposition into sub-questions
• shouldDecompose heuristic classifier scores questions on conjunctions, question marks, length,
 and keywords
• decomposeQuestion calls LLM to break complex questions into 2-5 focused sub-questions
• decomposeQuestionWithFallback orchestrates heuristic check and LLM decomposition with graceful
 fallback
• Validates JSON response and enforces sub-question count bounds

src/retrieval/decompose.ts


15. src/llm/context-builder.ts ✨ Enhancement +1/-0

Add retrieval mode to context package

• Added retrievalMode: "single" field to ContextPackage returned by buildContextPackage

src/llm/context-builder.ts


16. src/test/multi-hop.test.ts 🧪 Tests +127/-0

Multi-hop retrieval deduplication and metadata tests

• Adds 3 new test cases for multi-hop retrieval deduplication and merging logic
• Tests verify that nodes appearing in multiple sub-questions are deduplicated correctly
• Tests confirm that retrieval metadata preserves sub-question context for each node
• Includes helper functions to construct test nodes and documents

src/test/multi-hop.test.ts


17. src/indexer/gemini-embedder.ts ✨ Enhancement +37/-20

Gemini embedder model update and request refactoring

• Updates default model from gemini-embedding-2-preview to gemini-embedding-001
• Adds constants for API key environment variables and batch size limits
• Introduces resolveGeminiApiKey() helper function supporting alias environment variable
• Refactors request building into buildEmbedRequest() method with outputDimensionality parameter
• Improves batch embedding to include model and outputDimensionality in each request
• Fixes batch size validation to use MAX_BATCH_SIZE constant instead of hardcoded 250

src/indexer/gemini-embedder.ts


18. src/cli.ts ✨ Enhancement +36/-7

CLI setup wizard and multi-hop query flag support

• Adds setup command to CLI with new runSetupWizard() import
• Adds --multi-hop boolean flag support for query command
• Implements parseDepthFlag() validation to ensure positive integers only
• Refactors config loading to be optional for setup command
• Passes logger to MCP server initialization
• Adds null-safe cleanup for coderag instance on exit

src/cli.ts


19. src/test/indexer.test.ts 🧪 Tests +86/-1

Reindex routing and embedding fingerprint validation tests

• Adds 5 new test cases for reindex routing, embedding fingerprint checking, and mismatch warnings
• Tests verify incremental vs full reindex mode routing
• Tests confirm embedding model mismatch detection and logging behavior
• Tests validate that incremental reindex throws when fingerprint mismatches

src/test/indexer.test.ts


20. src/test/documents.test.ts 🧪 Tests +74/-1

Document indexing batched embedding and external docs tests

• Adds BatchTestEmbeddingProvider mock class with maxBatchSize and embedBatch() support
• Adds test for external docs fallback when available, with generated content as fallback
• Adds test verifying batched embedding is used when provider supports it
• Adds test for local-hash defaults when no embedding metadata supplied
• Updates manifest test to verify embedding provider, model, and dimensions fields

src/test/documents.test.ts


21. src/store/vector-store.ts ✨ Enhancement +28/-16

Vector store SQL optimization and error handling improvements

• Adds IndexingError import for better error handling
• Refactors deleteByNodeIds() to use native SQL delete with createNodeIdFilter() instead of
 read-modify-write
• Refactors upsert() to delete existing rows then add new records instead of full table merge
• Optimizes getMany() to use SQL WHERE clause filter instead of prefix scan and in-memory
 filtering
• Adds helper functions toSqlStringLiteral() and createNodeIdFilter() for safe SQL generation
• Improves metadata loading error handling to throw IndexingError with context

src/store/vector-store.ts


22. src/test/http.test.ts 🧪 Tests +55/-1

HTTP service readiness probe and reindex flag tests

• Updates mock status to include indexedNodeCount and modelMismatch fields
• Adds test verifying readiness probe returns 503 when index is empty or mismatched
• Adds test verifying full flag is passed through to reindex routes
• Adds assertion for readiness response status code

src/test/http.test.ts


23. src/service/http.ts ✨ Enhancement +25/-7

HTTP service timing-safe auth and readiness probe enhancements

• Imports timingSafeEqual from node:crypto for constant-time auth comparison
• Adds multiHop boolean field to query request schema
• Refactors isAuthorized() to use timing-safe comparison with length check
• Adds isReadyStatus() helper to validate index readiness (indexed, non-empty, no mismatch)
• Updates /ready endpoint to return 503 when index is not ready
• Passes multiHop parameter through to coderag.query()
• Simplifies index handler to always use reindex() with full flag

src/service/http.ts


24. src/test/decompose.test.ts 🧪 Tests +66/-0

Query decomposition heuristic classifier tests

• Adds 9 test cases for shouldDecompose() heuristic classifier
• Tests verify decomposition scoring based on question length, conjunctions, keywords, and question
 marks
• Tests confirm graceful handling of edge cases and keyword variations

src/test/decompose.test.ts


25. src/llm/prompt.ts ✨ Enhancement +75/-0

Multi-hop LLM synthesis prompt builder

• Adds MULTI_HOP_SYSTEM_PROMPT constant for multi-hop synthesis instructions
• Adds formatSubQuestionSection() helper to format retrieved nodes per sub-question
• Adds buildMultiHopMessages() function to construct LLM messages for multi-hop synthesis
• Organizes context by sub-question index and formats with metadata boundaries

src/llm/prompt.ts


26. src/mcp/server.ts ✨ Enhancement +30/-5

MCP server auto-indexing and multi-hop query support

• Adds ensureIndexIsCurrent() function to auto-index on startup if needed
• Adds multiHop parameter to query tool schema and handler
• Updates MCP server version from 0.1.0 to 0.2.1
• Passes logger option to serveStdioMcpServer() for startup logging
• Implements auto-reindex on model mismatch detection

src/mcp/server.ts


27. src/test/vector-store.test.ts 🧪 Tests +56/-0

Vector store SQL query and metadata error handling tests

• Adds imports for fs and IndexingError
• Adds test verifying direct SQL query for requested node IDs instead of prefix scan
• Adds test verifying metadata loading throws IndexingError on invalid JSON
• Adds test verifying clear() removes both rows and metadata
• Adds test for empty store operations (get, getMany, getMetadata)

src/test/vector-store.test.ts


28. src/store/manifest-store.ts ✨ Enhancement +17/-6

Manifest store Zod schema validation for persisted state

• Adds Zod schema imports for validation (graphSnapshotSchema, indexedNodeDocumentSchema,
 indexManifestSchema)
• Refactors loadOptionalJson() to accept and apply Zod schema validation
• Adds documentMapSchema for validating document record maps
• Updates all load methods to validate loaded JSON against schemas

src/store/manifest-store.ts


29. src/test/git-hook.test.ts 🧪 Tests +32/-2

Git hook installation detection and shell escaping tests

• Adds isPostCommitHookInstalled() import and 3 new test cases
• Tests verify hook detection when not installed, after installation, and for non-git directories
• Updates existing test to verify single-quote shell escaping in generated hook

src/test/git-hook.test.ts


30. src/indexer/git-hook.ts ✨ Enhancement +23/-3

Git hook shell escaping and installation detection

• Adds shellQuote() helper function for POSIX shell single-quote escaping
• Adds isPostCommitHookInstalled() function to check hook presence
• Refactors hook script generation to use shellQuote() for configPath and backupHookPath
• Improves shell script safety by escaping all interpolated paths

src/indexer/git-hook.ts


31. src/test/onnx-embedder.test.ts 🧪 Tests +41/-0

ONNX embedding provider configuration tests

• Adds 2 new test cases for OnnxEmbeddingProvider configuration
• Tests verify logger option exposure and model directory configuration
• Tests confirm expected model name, dimensions, and batch size defaults

src/test/onnx-embedder.test.ts


32. src/bin/coderag.ts ✨ Enhancement +37/-0

CLI executable entry point with path resolution

• Adds new executable entry point for CLI with shebang
• Resolves CLI path from installed package or git repo fallback
• Constructs fake argv array for runCli() compatibility
• Provides usage help and error handling for missing commands

src/bin/coderag.ts


33. src/test/coderag.test.ts 🧪 Tests +7/-0

CodeRag status embedding metadata assertions

• Updates status test to verify embedding provider, model, and dimensions fields
• Adds assertions for embedding metadata in status responses
• Tests both configured and unconfigured embedding provider scenarios

src/test/coderag.test.ts


34. src/test/manifest-store.test.ts 🧪 Tests +8/-0

Manifest store embedding metadata test updates

• Updates manifest save/load test to include embedding provider, model, and dimensions fields
• Adds schemaVersion field to test manifest structure

src/test/manifest-store.test.ts


35. src/test/cli.test.ts 🧪 Tests +13/-0

CLI depth flag validation tests

• Adds test verifying --depth flag validation rejects non-positive integers
• Tests confirm error is thrown before query execution for invalid depth values

src/test/cli.test.ts


36. src/index.ts ✨ Enhancement +3/-0

Public API exports for new modules

• Exports OnnxEmbeddingProvider from indexer module
• Exports isPostCommitHookInstalled and installPostCommitHook from git-hook module
• Exports runSetupWizard from CLI setup-wizard module

src/index.ts


37. src/indexer/embedder.ts ✨ Enhancement +5/-0

Local hash embedder batch support and model property

• Adds model property to LocalHashEmbeddingProvider class
• Adds embedBatch() method for batch embedding support

src/indexer/embedder.ts


38. src/test/search.test.ts 🧪 Tests +1/-0

Test embedding provider model property

• Adds model property to TestEmbeddingProvider mock class

src/test/search.test.ts


39. vitest.config.ts ⚙️ Configuration changes +1/-0

Vitest timeout configuration increase

• Adds testTimeout: 20000 configuration to increase test timeout to 20 seconds

vitest.config.ts


40. scripts/coderag-mcp-discover.js ✨ Enhancement +107/-0

MCP server auto-discovery and initialization script

• Adds new discovery script for auto-configuring and launching MCP server
• Implements config file search up directory tree with fallback to default creation
• Auto-indexes repository if .coderag storage directory is missing
• Resolves CLI path from global npm package, which, or git repo fallback
• Spawns MCP server with discovered or created configuration

scripts/coderag-mcp-discover.js


41. .qwen/reasoning/quality-gates/post-commit-20260407-154349/stage-04-run-tests.md 📝 Documentation +263/-0

Test execution results documentation

• Documents test execution results showing 202 tests passing across 25 test files
• Confirms TypeScript type-check clean
• Shows successful test run with 4.32s total duration

.qwen/reasoning/quality-gates/post-commit-20260407-154349/stage-04-run-tests.md


42. .qwen/reasoning/quality-gates/post-commit-20260406-184329/stage-02-security.md 📝 Documentation +176/-0

Security analysis findings and remediation guidance

• Documents 6 security findings (3 medium, 3 low severity)
• Identifies command injection risks in git-hook.ts shell script generation
• Identifies timing-safe auth comparison issue in HTTP service
• Identifies world-readable .env file permissions and API key exposure risks
• Identifies HSTS header not applied for unencrypted sockets

.qwen/reasoning/quality-gates/post-commit-20260406-184329/stage-02-security.md


43. .qwen/reasoning/quality-gates/post-commit-20260406-184853/stage-02-security.md 📝 Documentation +127/-0

Test embedder config security and compilation analysis

• Documents security analysis for test embedder config file
• Identifies missing type imports causing compilation failure
• Identifies missing API key field in config interface
• Identifies broad string types reducing type safety

.qwen/reasoning/quality-gates/post-commit-20260406-184853/stage-02-security.md


44. .qwen/reasoning/quality-gates/post-commit-20260406-184853/changed-files-context.txt 📝 Documentation +34/-0

Test embedder config file context documentation

• Provides context dump of test embedder config file changes
• Shows embedder factory function and config interface definitions

.qwen/reasoning/quality-gates/post-commit-20260406-184853/changed-files-context.txt


45. claude-code/reasoning/quality-gates/post-commit-20260406-152438/stage-03-fix-security.md 📝 Documentation +11/-0

Security fix stage completion documentation

• Documents security fix stage results showing no issues required fixing
• Confirms hardcoded secrets, SQL injection, and eval/exec checks passed

claude-code/reasoning/quality-gates/post-commit-20260406-152438/stage-03-fix-security.md


46. .qwen/reasoning/quality-gates/post-commit-20260406-184329/changed-files-context.txt 🧪 Tests +2096/-0

Comprehensive test suite for CodeRag core functionality

• Comprehensive test suite additions covering CLI, CodeRag service, document indexing, git hooks,
 HTTP service, indexer, manifest store, search, and vector store functionality
• Tests validate core features including indexing, querying, reindexing, status reporting, and error
 handling
• Includes mock implementations and integration tests for embedding providers, vector stores, and
 HTTP endpoints
• Tests cover edge cases like empty indices, model mismatches, concurrent indexing, and malformed
 inputs

.qwen/reasoning/quality-gates/post-commit-20260406-184329/changed-files-context.txt


47. .qwen/reasoning/quality-gates/post-commit-20260406-183933/changed-files-context.txt ✨ Enhancement +930/-0

Core CodeRag service and HTTP API implementation

CodeRag service class implementing high-level API for indexing, querying, and graph traversal
 operations
• loadCodeRagConfig and resolveRuntimeConfig functions for loading serializable config and
 resolving runtime dependencies
• HTTP server implementation with routes for query, lookup, explain, impact, index, reindex, status,
 health, and metrics endpoints
• Configuration loading from disk and environment variables with support for multiple embedding
 providers (Gemini, ONNX, local-hash)

.qwen/reasoning/quality-gates/post-commit-20260406-183933/changed-files-context.txt


48. .qwen/reasoning/quality-gates/pre-commit-20260406-183932/COMMIT_PLAN.md ⚙️ Configuration changes +103/-0

Atomic commit staging plan for multi-layer changes

• Atomic commit plan staging 30 files across 5 logical commits
• Commits organized by layer: business logic, tests, configuration, documentation, and other changes
• Follows Conventional Commits format with clear commit messages
• Includes execution order and principles for atomic, reversible, and testable commits

.qwen/reasoning/quality-gates/pre-commit-20260406-183932/COMMIT_PLAN.md


49. prd/changes/20260407-234654.md 📝 Documentation +39/-0

PRD change tracking and implementation context

• PRD change entry documenting 5 modified files across 22 implementation commits
• Tracks changes to CLI setup wizard, LLM context builder, service layer, config, and types
• Confirms no deviations from PRD specification
• Includes recent implementation context with commit history

prd/changes/20260407-234654.md


50. .DS_Store Additional files +0/-0

...

.DS_Store


51. .env.example Additional files +14/-4

...

.env.example


52. .qwen/reasoning/quality-gates/post-commit-20260406-154419/POST_COMMIT_REPORT.md Additional files +58/-0

...

.qwen/reasoning/quality-gates/post-commit-20260406-154419/POST_COMMIT_REPORT.md


53. .qwen/reasoning/quality-gates/post-commit-20260406-154419/context-summary.md Additional files +23/-0

...

.qwen/reasoning/quality-gates/post-commit-20260406-154419/context-summary.md


54. .qwen/reasoning/quality-gates/post-commit-20260406-154419/stage-01-linting.md Additional files +6/-0

...

.qwen/reasoning/quality-gates/post-commit-20260406-154419/stage-01-linting.md


55. .qwen/reasoning/quality-gates/post-commit-20260406-154419/stage-02-security.md Additional files +18/-0

...

.qwen/reasoning/quality-gates/post-commit-20260406-154419/stage-02-security.md


56. .qwen/reasoning/quality-gates/post-commit-20260406-154419/stage-03-fix-security.md Additional files +11/-0

...

.qwen/reasoning/quality-gates/post-commit-20260406-154419/stage-03-fix-security.md


57. .qwen/reasoning/quality-gates/post-commit-20260406-154419/stage-04-run-tests.md Additional files +191/-0

...

.qwen/reasoning/quality-gates/post-commit-20260406-154419/stage-04-run-tests.md


58. .qwen/reasoning/quality-gates/post-commit-20260406-154419/stage-05-add-tests.md Additional files +12/-0

...

.qwen/reasoning/quality-gates/post-commit-20260406-154419/stage-05-add-tests.md


59. .qwen/reasoning/quality-gates/post-commit-20260406-154419/stage-06-documentation.md Additional files +11/-0

...

.qwen/reasoning/quality-gates/post-commit-20260406-154419/stage-06-documentation.md


60. .qwen/reasoning/quality-gates/post-commit-20260406-182414/stage-01-linting.md Additional files +6/-0

...

.qwen/reasoning/quality-gates/post-commit-20260406-182414/stage-01-linting.md


61. .qwen/reasoning/quality-gates/post-commit-20260406-182414/stage-02-security.md Additional files +5/-0

...

.qwen/reasoning/quality-gates/post-commit-20260406-182414/stage-02-security.md


62. .qwen/reasoning/quality-gates/post-commit-20260406-183232/POST_COMMIT_REPORT.md Additional files +58/-0

...

.qwen/reasoning/quality-gates/post-commit-20260406-183232/POST_COMMIT_REPORT.md


63. .qwen/reasoning/quality-gates/post-commit-20260406-183232/context-summary.md Additional files +23/-0

...

.qwen/reasoning/quality-gates/post-commit-20260406-183232/context-summary.md


64. .qwen/reasoning/quality-gates/post-commit-20260406-183232/stage-01-linting.md Additional files +6/-0

...

.qwen/reasoning/quality-gates/post-commit-20260406-183232/stage-01-linting.md


65. .qwen/reasoning/quality-gates/post-commit-20260406-183232/stage-02-security.md Additional files +5/-0

...

.qwen/reasoning/quality-gates/post-commit-20260406-183232/stage-02-security.md


66. .qwen/reasoning/quality-gates/post-commit-20260406-183232/stage-03-fix-security.md Additional files +5/-0

...

.qwen/reasoning/quality-gates/post-commit-20260406-183232/stage-03-fix-security.md


67. .qwen/reasoning/quality-gates/post-commit-20260406-183232/stage-04-run-tests.md Additional files +191/-0

...

.qwen/reasoning/quality-gates/post-commit-20260406-183232/stage-04-run-tests.md


68. .qwen/reasoning/quality-gates/post-commit-20260406-183232/stage-05-add-tests.md Additional files +5/-0

...

.qwen/reasoning/quality-gates/post-commit-20260406-183232/stage-05-add-tests.md


69. .qwen/reasoning/quality-gates/post-commit-20260406-183232/stage-06-documentation.md Additional files +5/-0

...

.qwen/reasoning/quality-gates/post-commit-20260406-183232/stage-06-documentation.md


70. .qwen/reasoning/quality-gates/post-commit-20260406-183826/POST_COMMIT_REPORT.md Additional files +58/-0

...

.qwen/reasoning/quality-gates/post-commit-20260406-183826/POST_COMMIT_REPORT.md


71. .qwen/reasoning/quality-gates/post-commit-20260406-183826/context-summary.md Additional files +23/-0

...

.qwen/reasoning/quality-gates/post-commit-20260406-183826/context-summary.md


72. .qwen/reasoning/quality-gates/post-commit-20260406-183826/stage-01-linting.md Additional files +6/-0

...

.qwen/reasoning/quality-gates/post-commit-20260406-183826/stage-01-linting.md


73. .qwen/reasoning/quality-gates/post-commit-20260406-183826/stage-02-security.md Additional files +5/-0

...

.qwen/reasoning/quality-gates/post-commit-20260406-183826/stage-02-security.md


74. .qwen/reasoning/quality-gates/post-commit-20260406-183826/stage-03-fix-security.md Additional files +5/-0

...

.qwen/reasoning/quality-gates/post-commit-20260406-183826/stage-03-fix-security.md


75. .qwen/reasoning/quality-gates/post-commit-20260406-183826/stage-04-run-tests.md Additional files +194/-0

...

.qwen/reasoning/quality-gates/post-commit-20260406-183826/stage-04-run-tests.md


76. .qwen/reasoning/quality-gates/post-commit-20260406-183826/stage-05-add-tests.md Additional files +5/-0

...

.qwen/reasoning/quality-gates/post-commit-20260406-183826/stage-05-add-tests.md


77. .qwen/reasoning/quality-gates/post-commit-20260406-183826/stage-06-documentation.md Additional files +5/-0

...

.qwen/reasoning/quality-gates/post-commit-20260406-183826/stage-06-documentation.md


78. .qwen/reasoning/quality-gates/post-commit-20260406-183933/docs-context.txt Additional files +930/-0

...

.qwen/reasoning/quality-gates/post-commit-20260406-183933/docs-context.txt


79. .qwen/reasoning/quality-gates/post-commit-20260406-183933/stage-01-linting.md Additional files +11/-0

...

.qwen/reasoning/quality-gates/post-commit-20260406-183933/stage-01-linting.md


80. .qwen/reasoning/quality-gates/post-commit-20260406-183933/stage-03-fix-security.md Additional files +5/-0

...

.qwen/reasoning/quality-gates/post-commit-20260406-183933/stage-03-fix-security.md


81. .qwen/reasoning/quality-gates/post-commit-20260406-183933/stage-04-run-tests.md Additional files +194/-0

...

.qwen/reasoning/quality-gates/post-commit-20260406-183933/stage-04-run-tests.md


82. .qwen/reasoning/quality-gates/post-commit-20260406-183933/test-context.txt Additional files +930/-0

...

.qwen/reasoning/quality-gates/post-commit-20260406-183933/test-context.txt


83. .qwen/reasoning/quality-gates/post-commit-20260406-184329/stage-01-linting.md Additional files +12/-0

...

.qwen/reasoning/quality-gates/post-commit-20260406-184329/stage-01-linting.md


84. .qwen/reasoning/quality-gates/post-commit-20260406-184853/stage-01-linting.md Additional files +12/-0

...

.qwen/reasoning/quality-gates/post-commit-20260406-184853/stage-01-linting.md


85. .qwen/reasoning/quality-gates/post-commit-20260407-110303/POST_COMMIT_REPORT.md Additional files +58/-0

...

.qwen/reasoning/quality-gates/post-commit-20260407-110303/POST_COMMIT_REPORT.md


86. .qwen/reasoning/quality-gates/post-commit-20260407-110303/context-summary.md Additional files +23/-0

...

.qwen/reasoning/quality-gates/post-commit-20260407-110303/context-summary.md


87. .qwen/reasoning/quality-gates/post-commit-20260407-110303/stage-01-linting.md Additional files +6/-0

...

.qwen/reasoning/quality-gates/post-commit-20260407-110303/stage-01-linting.md


88. .qwen/reasoning/quality-gates/post-commit-20260407-110303/stage-02-security.md Additional files +5/-0

...

.qwen/reasoning/quality-gates/post-commit-20260407-110303/stage-02-security.md


89. .qwen/reasoning/quality-gates/post-commit-20260407-110303/stage-03-fix-security.md Additional files +5/-0

...

.qwen/reasoning/quality-gates/post-commit-20260407-110303/stage-03-fix-security.md


90. .qwen/reasoning/quality-gates/post-commit-20260407-110303/stage-04-run-tests.md Additional files +179/-0

...

.qwen/reasoning/quality-gates/post-commit-20260407-110303/stage-04-run-tests.md


91. .qwen/reasoning/quality-gates/post-commit-20260407-110303/stage-05-add-tests.md Additional files +5/-0

...

.qwen/reasoning/quality-gates/post-commit-20260407-110303/stage-05-add-tests.md


92. .qwen/reasoning/quality-gates/post-commit-20260407-110303/stage-06-documentation.md Additional files +5/-0

...

.qwen/reasoning/quality-gates/post-commit-20260407-110303/stage-06-documentation.md


93. .qwen/reasoning/quality-gates/post-commit-20260407-123108/POST_COMMIT_REPORT.md Additional files +58/-0

...

.qwen/reasoning/quality-gates/post-commit-20260407-123108/POST_COMMIT_REPORT.md


94. .qwen/reasoning/quality-gates/post-commit-20260407-123108/context-summary.md Additional files +23/-0

...

.qwen/reasoning/quality-gates/post-commit-20260407-123108/context-summary.md


95. .qwen/reasoning/quality-gates/post-commit-20260407-123108/stage-01-linting.md Additional files +6/-0

...

.qwen/reasoning/quality-gates/post-commit-20260407-123108/stage-01-linting.md


96. .qwen/reasoning/quality-gates/post-commit-20260407-123108/stage-02-security.md Additional files +5/-0

...

.qwen/reasoning/quality-gates/post-commit-20260407-123108/stage-02-security.md


97. .qwen/reasoning/quality-gates/post-commit-20260407-123108/stage-03-fix-security.md Additional files +5/-0

...

.qwen/reasoning/quality-gates/post-commit-20260407-123108/stage-03-fix-security.md


98. .qwen/reasoning/quality-gates/post-commit-20260407-123108/stage-04-run-tests.md Additional files +179/-0

...

.qwen/reasoning/quality-gates/post-commit-20260407-123108/stage-04-run-tests.md


99. .qwen/reasoning/quality-gates/post-commit-20260407-123108/stage-05-add-tests.md Additional files +5/-0

...

.qwen/reasoning/quality-gates/post-commit-20260407-123108/stage-05-add-tests.md


100. .qwen/reasoning/quality-gates/post-commit-20260407-123108/stage-06-documentation.md Additional files +5/-0

...

.qwen/reasoning/quality-gates/post-commit-20260407-123108/stage-06-documentation.md


101. .qwen/reasoning/quality-gates/post-commit-20260407-154349/POST_COMMIT_REPORT.md Additional files +58/-0

...

.qwen/reasoning/quality-gates/post-commit-20260407-154349/POST_COMMIT_REPORT.md


102. .qwen/reasoning/quality-gates/post-commit-20260407-154349/context-summary.md Additional files +23/-0

...

.qwen/reasoning/quality-gates/post-commit-20260407-154349/context-summary.md


103. .qwen/reasoning/quality-gates/post-commit-20260407-154349/stage-01-linting.md Additional files +6/-0

...

.qwen/reasoning/quality-gates/post-commit-20260407-154349/stage-01-linting.md


104. .qwen/reasoning/quality-gates/post-commit-20260407-154349/stage-02-security.md Additional files +5/-0

...

.qwen/reasoning/quality-gates/post-commit-20260407-154349/stage-02-security.md


105. .qwen/reasoning/quality-gates/post-commit-20260407-154349/stage-03-fix-security.md Additional files +5/-0

...

<a href='https://github.com/nehraa/CodeRag/pull/3/files#diff-55e83ba3954ce6fc5e3090222e6fad7a...

@qodo-code-review

qodo-code-review Bot commented Apr 8, 2026

Copy link
Copy Markdown

Code Review by Qodo

🐞 Bugs (3)   📘 Rule violations (2)   📎 Requirement gaps (0)   🎨 UX Issues (0)
🐞\ ≡ Correctness (2) ☼ Reliability (1)
📘\ ⚙ Maintainability (1) ◔ Observability (1)

Grey Divider


Action required

1. No retrieval-only CLI indicator 📘
Description
The result schema does not include explicit degradation metadata, and the CLI prints retrieval-only
(answerMode: context-only) output without a clear human-readable notice. This can mislead users
and downstream consumers about whether an answer was generated or the system degraded to
context-only retrieval.
Code

src/types.ts[R357-363]

export interface QueryResult {
  question: string;
  answerMode: AnswerMode;
+  retrievalMode: RetrievalMode;
  answer: string;
  context: ContextPackage;
}
Evidence
The checklist requires setting a structured degradation indicator and rendering an explicit CLI
notice whenever the system switches to retrieval-only mode (PR Compliance ID 269566). QueryResult
contains answerMode/retrievalMode but no canonical degradation metadata field, and the CLI
branch for answerMode === context-only prints the output without any explicit retrieval-only
notice.

Rule 269566: Indicate retrieval-only degradation in metadata and CLI output
src/types.ts[357-363]
src/cli.ts[181-186]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
Retrieval-only degradation is not explicitly represented in a dedicated metadata field, and the CLI does not print a clear notice when operating in retrieval-only mode.

## Issue Context
Compliance requires both: (1) a structured degradation indicator in the response object, and (2) a human-readable CLI message when degraded.

## Fix Focus Areas
- src/types.ts[357-363]
- src/service/coderag.ts[231-311]
- src/cli.ts[163-188]
- src/test/cli.test.ts[1-220]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


2. Multi-hop prompt drops code 🐞
Description
buildMultiHopMessages only includes node headers and graph summary (no docs/file excerpts), and
OpenAiCompatibleTransport sends only messages to the model, so multi-hop answers will be generated
without the retrieved code context.
Code

src/llm/prompt.ts[R129-193]

+const formatSubQuestionSection = (
+  index: number,
+  question: string,
+  nodes: RetrievedNodeContext[]
+): string => {
+  if (nodes.length === 0) {
+    return `Sub-question ${index + 1}: "${question}"\nNo matching code found.`;
+  }
+
+  const nodeEntries = nodes
+    .map((node, ni) => `${ni + 1}. ${formatNodeHeader(node)}`)
+    .join("\n");
+
+  return `Sub-question ${index + 1}: "${question}"\nRetrieved ${nodes.length} node(s):\n${nodeEntries}`;
+};
+
+/**
+ * Builds LLM messages for multi-hop synthesis.
+ * Instructs the model to address each sub-question then unify the answer.
+ */
+export const buildMultiHopMessages = (
+  question: string,
+  context: ContextPackage
+): LlmRequest["messages"] => {
+  const subQuestions = context.subQuestions ?? [];
+  const nodeByIndex = new Map<number, RetrievedNodeContext[]>();
+
+  for (const node of context.relatedNodes) {
+    const idx = node.subQuestionIndex ?? 0;
+    const list = nodeByIndex.get(idx) ?? [];
+    list.push(node);
+    nodeByIndex.set(idx, list);
+  }
+
+  const subQuestionSections = subQuestions
+    .map((sq, i) => {
+      const nodes = nodeByIndex.get(i) ?? [];
+      return formatSubQuestionSection(i, sq, nodes);
+    })
+    .join("\n\n");
+
+  const userContent = [
+    `Question:\n${question}`,
+    ``,
+    `This question was decomposed into ${subQuestions.length} sub-questions.`,
+    `Context was retrieved independently for each, then deduplicated and merged.`,
+    ``,
+    subQuestionSections,
+    ``,
+    `Graph summary:`,
+    context.graphSummary,
+    ``,
+    formatWarnings(context.warnings),
+    ``,
+    `Answer the question comprehensively. Address each sub-question specifically,`,
+    `then synthesize into a unified answer. If some sub-questions couldn't be`,
+    `answered from the context, explicitly state what's missing.`
+  ]
+    .filter(Boolean)
+    .join("\n");
+
+  return [
+    { role: "system", content: MULTI_HOP_SYSTEM_PROMPT },
+    { role: "user", content: userContent }
+  ];
Evidence
In the single-hop path, buildMessages uses summarizeContext(), which includes doc sections and file
excerpts for primary/related nodes. In the multi-hop path, buildMultiHopMessages formats
per-sub-question sections using formatNodeHeader() only (metadata) and never includes doc/file
content; for OpenAI-compatible transport, generate() posts only the chat messages payload (no
context field), so the LLM cannot see the retrieved code content at all.

src/llm/prompt.ts[38-119]
src/llm/prompt.ts[121-193]
src/llm/transports.ts[281-311]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
Multi-hop answers are generated without retrieved code content for OpenAI-compatible transports because `buildMultiHopMessages()` does not include any doc/file excerpts, and `OpenAiCompatibleTransport` only sends `messages`.

### Issue Context
Single-hop uses `summarizeContext()` which includes truncated docs and file excerpts. Multi-hop should include comparable context content (either reuse `summarizeContext()` or add per-sub-question sections that include docs/file excerpts for the nodes).

### Fix Focus Areas
- src/llm/prompt.ts[121-193]
- src/llm/transports.ts[281-311]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


3. Subquestion mapping always zero 🐞
Description
buildMultiHopContextPackage never assigns subQuestionIndex to retrieved node contexts, but
buildMultiHopMessages groups nodes by subQuestionIndex ?? 0, so nearly all nodes are attributed to
sub-question 1 and later sub-questions misleadingly show "No matching code found."
Code

src/llm/multi-hop-context-builder.ts[R43-107]

+const buildRelatedNodeContexts = async (
+  nodes: BlueprintNode[],
+  repoPath: string,
+  fileCache: FileCache,
+  snapshot: GraphSnapshot,
+  documents: Record<string, IndexedNodeDocument>,
+  subQuestionIndex?: number
+): Promise<RetrievedNodeContext[]> => {
+  const contexts: RetrievedNodeContext[] = [];
+
+  for (const node of nodes) {
+    const doc = documents[node.id];
+    if (!doc) {
+      continue;
+    }
+
+    const ctx = await createRetrievedNodeContext(
+      repoPath,
+      fileCache,
+      snapshot,
+      doc,
+      "multi-hop"
+    );
+    if (subQuestionIndex !== undefined) {
+      ctx.subQuestionIndex = subQuestionIndex;
+    }
+    contexts.push(ctx);
+  }
+
+  return contexts;
+};
+
+/**
+ * Builds a ContextPackage from multi-hop retrieval results.
+ * Unlike the single-node path, there is no single primary node.
+ * The first retrieved node is promoted to "primary" for display purposes,
+ * and all remaining nodes are listed as related.
+ */
+export const buildMultiHopContextPackage = async (
+  question: string,
+  subQuestions: string[],
+  retrievalResult: MultiHopRetrievalResult,
+  repoPath: string,
+  snapshot: GraphSnapshot,
+  documents: Record<string, IndexedNodeDocument>,
+  retrieval: RetrievalConfig,
+  fileCache: FileCache
+): Promise<ContextPackage> => {
+  const allNodes = retrievalResult.deduplicatedNodes;
+
+  // Build RetrievedNodeContext for all deduplicated nodes
+  const allContexts = await buildRelatedNodeContexts(
+    allNodes,
+    repoPath,
+    fileCache,
+    snapshot,
+    documents
+  );
+
+  // Promote the first node to "primary" for display
+  const firstCtx = allContexts[0];
+  const primaryContext: RetrievedNodeContext | null = firstCtx
+    ? Object.assign({}, firstCtx, { relationship: "primary" as const, subQuestionIndex: undefined })
+    : null;
+  const relatedContexts: RetrievedNodeContext[] = allContexts.length > 1 ? allContexts.slice(1) : [];
Evidence
The context builder has a subQuestionIndex parameter and only sets ctx.subQuestionIndex when
that argument is provided, but the multi-hop package builder calls it without the index. The prompt
builder then defaults subQuestionIndex to 0 when organizing nodes per sub-question.

src/llm/multi-hop-context-builder.ts[43-107]
src/llm/prompt.ts[149-168]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
Multi-hop prompt organization is incorrect because retrieved nodes are not tagged with the sub-question that retrieved them.

### Issue Context
`buildRelatedNodeContexts()` already supports assigning `subQuestionIndex`, but `buildMultiHopContextPackage()` currently builds a single deduplicated list without mapping nodes back to sub-questions.

### Fix Focus Areas
- src/llm/multi-hop-context-builder.ts[43-107]
- src/llm/multi-hop-context-builder.ts[81-163]
- src/llm/prompt.ts[149-168]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools



Remediation recommended

4. CodeRag constructs RepoIndexer 📘
Description
CodeRag directly instantiates the concrete RepoIndexer inside the service constructor, coupling
the service layer to a specific indexing implementation. This reduces testability and violates the
requirement to depend on interfaces/adapters with concrete implementations wired only in
composition/DI layers.
Code

src/service/coderag.ts[58]

+    this.indexer = new RepoIndexer(config, config.configPath);
Evidence
The checklist requires repo analysis/embedding/LLM/persistence dependencies to be injected behind
interfaces instead of being constructed inside application/service logic (PR Compliance ID 269565).
The modified constructor line constructs new RepoIndexer(...) directly in CodeRag.

Rule 269565: Decouple repo analysis, embeddings, LLM transport, and persistence behind interfaces/adapters
src/service/coderag.ts[57-60]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`CodeRag` constructs `RepoIndexer` directly, which couples the service layer to a concrete implementation.

## Issue Context
Per compliance, concrete implementations should be created in a composition/DI layer and injected into consumers so tests can substitute fakes/mocks without requiring real infrastructure.

## Fix Focus Areas
- src/service/coderag.ts[50-60]
- src/index.ts[1-200]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


5. Promise.all aborts multi-hop 🐞
Description
parallelRetrieve uses Promise.all, so if any single sub-question retrieval throws (e.g.,
embeddingProvider.embed failure), the entire multi-hop query fails instead of returning partial
results.
Code

src/retrieval/multi-hop.ts[R65-79]

+export const parallelRetrieve = async (
+  subQuestions: string[],
+  documents: Record<string, IndexedNodeDocument>,
+  embeddingProvider: EmbeddingProvider,
+  retrieval: RetrievalConfig,
+  snapshot: GraphSnapshot,
+  vectorStore: VectorStore | undefined,
+  expansionDepth: number
+): Promise<SubQuestionRetrievalResult[]> => {
+  const results = await Promise.all(
+    subQuestions.map((sq) =>
+      retrieveForSubQuestion(sq, documents, embeddingProvider, retrieval, snapshot, vectorStore, expansionDepth)
+    )
+  );
+  return results;
Evidence
parallelRetrieve is implemented with Promise.all, which rejects on the first rejected promise.
retrieveForSubQuestion calls searchDocuments, which awaits embeddingProvider.embed(expandedQuestion)
without a try/catch, so a single embedding failure will reject the whole Promise.all batch.

src/retrieval/multi-hop.ts[65-79]
src/retrieval/search.ts[193-216]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
Multi-hop retrieval is not fault-tolerant: one failing sub-question aborts the entire retrieval.

### Issue Context
Multi-hop retrieval can still be useful with partial results; failures should be isolated per sub-question and surfaced as warnings.

### Fix Focus Areas
- src/retrieval/multi-hop.ts[62-80]
- src/retrieval/multi-hop.ts[131-165]
- src/service/coderag.ts[317-379]
- src/llm/multi-hop-context-builder.ts[109-163]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


Grey Divider

ⓘ The new review experience is currently in Beta. Learn more

Grey Divider

Qodo Logo

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request significantly expands CodeRag's capabilities, introducing multi-language support for Go, Python, Rust, C, and C++ via tree-sitter, a local ONNX embedding provider, and a multi-hop retrieval pipeline for handling complex queries. It also adds an interactive setup wizard, optimizes indexing with memory-efficient chunking, and improves security with timing-safe authorization. Review feedback identifies a hardcoded API key in the sample configuration, a runtime error in the discovery script due to ESM constraints, and the need for better documentation or logging regarding the manual garbage collection trigger.

Comment thread coderag.config.json
"transport": "openai-compatible",
"baseUrl": "https://openrouter.ai/api/v1",
"model": "stepfun/step-3.5-flash",
"apiKey": "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",

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

security-critical critical

A hardcoded API key is present in this configuration file. This is a major security risk as it exposes credentials in the source code. API keys should be loaded from environment variables (e.g., OPENROUTER_API_KEY, which is already supported by the application) or a secure secret management system, not committed to the repository. Please remove this line from the configuration.

function resolveCliPath() {
// Try 1: globally installed npm package via require.resolve
try {
const pkgPath = require.resolve("@abhinav2203/coderag/package.json");

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The use of require.resolve will cause a runtime error. Since the project is configured as an ES module ("type": "module" in package.json), require is not defined in the global scope.

To fix this, you can use createRequire from the module package to construct a require function that works within an ES module. Add the following at the top of the script:

import { createRequire } from 'node:module';
const require = createRequire(import.meta.url);

This will allow require.resolve to function correctly without changing the module type.

Comment thread src/indexer/documents.ts
Comment on lines +130 to +132
if (completedChunks % 100 === 0 && globalThis.gc) {
globalThis.gc();
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Explicitly invoking the garbage collector with globalThis.gc() can lead to unpredictable performance and is only available when Node.js is run with the --expose-gc flag. While the existing comment explains the intent is to reclaim memory from the ONNX runtime, this dependency on a runtime flag should be clearly documented for anyone running this service. Consider adding a warning log if globalThis.gc is not available to make debugging memory issues easier in environments where the flag is not set.

Comment thread src/types.ts
Comment on lines 357 to 363
export interface QueryResult {
question: string;
answerMode: AnswerMode;
retrievalMode: RetrievalMode;
answer: string;
context: ContextPackage;
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Action required

1. No retrieval-only cli indicator 📘 Rule violation ◔ Observability

The result schema does not include explicit degradation metadata, and the CLI prints retrieval-only
(answerMode: context-only) output without a clear human-readable notice. This can mislead users
and downstream consumers about whether an answer was generated or the system degraded to
context-only retrieval.
Agent Prompt
## Issue description
Retrieval-only degradation is not explicitly represented in a dedicated metadata field, and the CLI does not print a clear notice when operating in retrieval-only mode.

## Issue Context
Compliance requires both: (1) a structured degradation indicator in the response object, and (2) a human-readable CLI message when degraded.

## Fix Focus Areas
- src/types.ts[357-363]
- src/service/coderag.ts[231-311]
- src/cli.ts[163-188]
- src/test/cli.test.ts[1-220]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools

Comment thread src/llm/prompt.ts
Comment on lines +129 to +193
const formatSubQuestionSection = (
index: number,
question: string,
nodes: RetrievedNodeContext[]
): string => {
if (nodes.length === 0) {
return `Sub-question ${index + 1}: "${question}"\nNo matching code found.`;
}

const nodeEntries = nodes
.map((node, ni) => `${ni + 1}. ${formatNodeHeader(node)}`)
.join("\n");

return `Sub-question ${index + 1}: "${question}"\nRetrieved ${nodes.length} node(s):\n${nodeEntries}`;
};

/**
* Builds LLM messages for multi-hop synthesis.
* Instructs the model to address each sub-question then unify the answer.
*/
export const buildMultiHopMessages = (
question: string,
context: ContextPackage
): LlmRequest["messages"] => {
const subQuestions = context.subQuestions ?? [];
const nodeByIndex = new Map<number, RetrievedNodeContext[]>();

for (const node of context.relatedNodes) {
const idx = node.subQuestionIndex ?? 0;
const list = nodeByIndex.get(idx) ?? [];
list.push(node);
nodeByIndex.set(idx, list);
}

const subQuestionSections = subQuestions
.map((sq, i) => {
const nodes = nodeByIndex.get(i) ?? [];
return formatSubQuestionSection(i, sq, nodes);
})
.join("\n\n");

const userContent = [
`Question:\n${question}`,
``,
`This question was decomposed into ${subQuestions.length} sub-questions.`,
`Context was retrieved independently for each, then deduplicated and merged.`,
``,
subQuestionSections,
``,
`Graph summary:`,
context.graphSummary,
``,
formatWarnings(context.warnings),
``,
`Answer the question comprehensively. Address each sub-question specifically,`,
`then synthesize into a unified answer. If some sub-questions couldn't be`,
`answered from the context, explicitly state what's missing.`
]
.filter(Boolean)
.join("\n");

return [
{ role: "system", content: MULTI_HOP_SYSTEM_PROMPT },
{ role: "user", content: userContent }
];

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Action required

2. Multi-hop prompt drops code 🐞 Bug ≡ Correctness

buildMultiHopMessages only includes node headers and graph summary (no docs/file excerpts), and
OpenAiCompatibleTransport sends only messages to the model, so multi-hop answers will be generated
without the retrieved code context.
Agent Prompt
### Issue description
Multi-hop answers are generated without retrieved code content for OpenAI-compatible transports because `buildMultiHopMessages()` does not include any doc/file excerpts, and `OpenAiCompatibleTransport` only sends `messages`.

### Issue Context
Single-hop uses `summarizeContext()` which includes truncated docs and file excerpts. Multi-hop should include comparable context content (either reuse `summarizeContext()` or add per-sub-question sections that include docs/file excerpts for the nodes).

### Fix Focus Areas
- src/llm/prompt.ts[121-193]
- src/llm/transports.ts[281-311]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools

Comment on lines +43 to +107
const buildRelatedNodeContexts = async (
nodes: BlueprintNode[],
repoPath: string,
fileCache: FileCache,
snapshot: GraphSnapshot,
documents: Record<string, IndexedNodeDocument>,
subQuestionIndex?: number
): Promise<RetrievedNodeContext[]> => {
const contexts: RetrievedNodeContext[] = [];

for (const node of nodes) {
const doc = documents[node.id];
if (!doc) {
continue;
}

const ctx = await createRetrievedNodeContext(
repoPath,
fileCache,
snapshot,
doc,
"multi-hop"
);
if (subQuestionIndex !== undefined) {
ctx.subQuestionIndex = subQuestionIndex;
}
contexts.push(ctx);
}

return contexts;
};

/**
* Builds a ContextPackage from multi-hop retrieval results.
* Unlike the single-node path, there is no single primary node.
* The first retrieved node is promoted to "primary" for display purposes,
* and all remaining nodes are listed as related.
*/
export const buildMultiHopContextPackage = async (
question: string,
subQuestions: string[],
retrievalResult: MultiHopRetrievalResult,
repoPath: string,
snapshot: GraphSnapshot,
documents: Record<string, IndexedNodeDocument>,
retrieval: RetrievalConfig,
fileCache: FileCache
): Promise<ContextPackage> => {
const allNodes = retrievalResult.deduplicatedNodes;

// Build RetrievedNodeContext for all deduplicated nodes
const allContexts = await buildRelatedNodeContexts(
allNodes,
repoPath,
fileCache,
snapshot,
documents
);

// Promote the first node to "primary" for display
const firstCtx = allContexts[0];
const primaryContext: RetrievedNodeContext | null = firstCtx
? Object.assign({}, firstCtx, { relationship: "primary" as const, subQuestionIndex: undefined })
: null;
const relatedContexts: RetrievedNodeContext[] = allContexts.length > 1 ? allContexts.slice(1) : [];

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Action required

3. Subquestion mapping always zero 🐞 Bug ≡ Correctness

buildMultiHopContextPackage never assigns subQuestionIndex to retrieved node contexts, but
buildMultiHopMessages groups nodes by subQuestionIndex ?? 0, so nearly all nodes are attributed to
sub-question 1 and later sub-questions misleadingly show "No matching code found."
Agent Prompt
### Issue description
Multi-hop prompt organization is incorrect because retrieved nodes are not tagged with the sub-question that retrieved them.

### Issue Context
`buildRelatedNodeContexts()` already supports assigning `subQuestionIndex`, but `buildMultiHopContextPackage()` currently builds a single deduplicated list without mapping nodes back to sub-questions.

### Fix Focus Areas
- src/llm/multi-hop-context-builder.ts[43-107]
- src/llm/multi-hop-context-builder.ts[81-163]
- src/llm/prompt.ts[149-168]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR introduces multi-hop retrieval (LLM query decomposition → parallel sub-question retrieval → LLM synthesis) to improve recall on complex questions, while also expanding indexing/embedding capabilities (Gemini/ONNX, embedding fingerprints, batching), and adding setup/ops improvements (dotenv loading, CLI/setup wizard, readiness probe changes, MCP auto-indexing, hook management).

Changes:

  • Added multi-hop retrieval pipeline (decomposition heuristics + LLM decomposition, parallel retrieval, dedup/merge, multi-hop context + prompt).
  • Improved indexing + embedding infrastructure (provider fingerprints: provider/model/dimensions, batched embeddings, ONNX provider, stricter persisted-state schemas, LanceDB optimizations/metadata handling).
  • Enhanced operational tooling (dotenv loading, setup wizard, CLI bin wrapper, readiness probe logic, MCP auto-indexing, post-commit hook install/checks).

Reviewed changes

Copilot reviewed 133 out of 137 changed files in this pull request and generated 12 comments.

Show a summary per file
File Description
vitest.config.ts Increases test timeout for longer-running tests.
src/types.ts Adds multi-hop config/types and embedding provider fingerprint fields.
src/store/vector-store.ts Optimizes LanceDB delete/upsert + adds metadata error handling.
src/store/manifest-store.ts Adds Zod validation when loading persisted manifest/snapshot/documents.
src/service/config.ts Adds .env loading, ONNX support, multi-hop env overrides, LLM auto-detection, configPath propagation.
src/service/http.ts Adds multiHop query option, timing-safe auth compare, readiness probe gating, reindex flag pass-through.
src/service/coderag.ts Wires multi-hop query path + retrievalMode into QueryResult/ContextPackage.
src/retrieval/decompose.ts Implements heuristic + LLM-backed query decomposition with fallback.
src/retrieval/multi-hop.ts Implements parallel retrieval per sub-question and dedup/merge result assembly.
src/llm/context-builder.ts Adds retrievalMode to single-hop context packages.
src/llm/multi-hop-context-builder.ts Builds multi-hop context packages with budgeting and per-sub-question metadata.
src/llm/prompt.ts Adds multi-hop synthesis prompt builder.
src/indexer/documents.ts Adds chunked/batched embedding pipeline + embedding manifest schema v2 with fingerprints.
src/indexer/embedder.ts Adds model + embedBatch for local-hash embedding provider.
src/indexer/gemini-embedder.ts Adds batching, API key aliasing, explicit dimensionality, model default update.
src/indexer/onnx-embedder.ts Adds ONNX embedding provider via @xenova/transformers.
src/indexer/git-hook.ts Adds hook installation checks + safer shell quoting for paths.
src/indexer/indexer.ts Adds embedding fingerprint mismatch logic and hook auto-install after index.
src/adapters/codeflow-core.ts Switches to tree-sitter analyzeRepo path + uses returned spans/callsites.
src/mcp/server.ts Adds multiHop to MCP tool + auto-index/reindex on startup.
src/cli.ts Adds setup command, --multi-hop flag, improved depth parsing, MCP logger wiring.
src/cli/setup-wizard.ts New interactive setup wizard for config + optional .env + hook install.
src/bin/coderag.ts New ESM-safe bin entrypoint wrapper for global CLI execution.
src/index.ts Re-exports ONNX provider, git hook helpers, setup wizard.
scripts/coderag-mcp-discover.js Adds MCP discovery/auto-init helper script.
README.md Updates docs for multi-language support, .env loading, provider behavior, ops notes.
package.json Bumps version, updates bin entrypoint, updates dependencies/keywords.
.env.example Updates env var docs for embedding providers and .env autoloading.
.gitignore Ignores additional tool output dirs and tarballs.
src/test/*.test.ts Adds/updates tests for new multi-hop, embedding providers, manifests, HTTP readiness, CLI flags, etc.
prd/changes/*.md PRD change log entries (generated).
claude-code/, .qwen/ Generated quality-gate/report artifacts (non-runtime).
coderag.config.json Adds a sample config (but currently includes a secret; see PR comments).

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread src/cli/setup-wizard.ts
Comment on lines +58 to +62
const embeddingProvider = await select(rl, "Select embedding provider:", [
"local-hash (free, offline, fast but low quality)",
"onnx (local neural embeddings via Xenova/gte-small, requires download)",
"gemini (cloud, best quality, requires API key)"
]);

Copilot AI Apr 8, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The setup wizard advertises the ONNX model as Xenova/gte-small, but the actual default in OnnxEmbeddingProvider is Xenova/all-MiniLM-L6-v2 (and tests assert that). Please align the wizard text/options with the real default model (or update the provider to match what the wizard promises) to avoid broken setup instructions.

Copilot uses AI. Check for mistakes.
Comment thread coderag.config.json
Comment on lines +21 to +23
"model": "stepfun/step-3.5-flash",
"apiKey": "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",
"timeoutMs": 45000

Copilot AI Apr 8, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This config file is committed with an llm.apiKey value. Config files in the repo should not contain API keys or other secrets. Please remove the key from version control (use .env/environment variables instead) and rotate the affected credential.

Copilot uses AI. Check for mistakes.
Comment thread README.md
Comment on lines +104 to +108
When `embedding.provider` is `gemini`, CodeRag defaults to `models/gemini-embedding-001` and requests 768-dimensional vectors explicitly so the stored embedding fingerprint matches the vectors written to LanceDB. It accepts either `CODERAG_GEMINI_API_KEY` or the compatibility alias `CODERAG_GEMINI_AI_KEY`.

When `embedding.provider` is `onnx`, CodeRag uses `Xenova/gte-small` (384-dim, ~33MB) running locally via `@xenova/transformers`. No API key or external server needed. The model must be downloaded to `<onnxModelDir>/Xenova/gte-small/` (default `.coderag-models/models/Xenova/gte-small/`).

```bash

Copilot AI Apr 8, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

README says the ONNX embedding provider uses Xenova/gte-small, but the implementation defaults to Xenova/all-MiniLM-L6-v2 (and tests assert that). Please align the README model name + download snippet with the actual default (or change the provider default) so users download the correct model.

Copilot uses AI. Check for mistakes.
Comment on lines +49 to +53
it("returns true regardless of config.enabled (gating is done by caller)", () => {
const disabledConfig = { ...defaultConfig, enabled: false };
const question = "How does the mixnet handle key exchange and header-only forwarding?";
// shouldDecompose is a pure heuristic — the caller (decomposeQuestionWithFallback)
// checks config.enabled before calling this

Copilot AI Apr 8, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test constructs disabledConfig but then calls shouldDecompose(question, defaultConfig) rather than using disabledConfig. That makes the assertion not match the test name/comment and may hide regressions if shouldDecompose later starts using config.enabled.

Copilot uses AI. Check for mistakes.
Comment thread src/indexer/documents.ts
Comment on lines +95 to +99
const embedPreparedDocuments = async (
preparedDocuments: PreparedIndexedDocument[],
embeddingProvider: EmbeddingProvider,
logger?: { info: (msg: string, ctx?: Record<string, unknown>) => void }
): Promise<IndexedNodeDocument[]> => {

Copilot AI Apr 8, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

embedPreparedDocuments is declared but never used in this module (embedding logic is duplicated later in buildIndexedDocuments). Dead code increases maintenance cost; either remove this helper or refactor to reuse it.

Copilot uses AI. Check for mistakes.
Comment on lines +91 to +95
const allNodes = retrievalResult.deduplicatedNodes;

// Build RetrievedNodeContext for all deduplicated nodes
const allContexts = await buildRelatedNodeContexts(
allNodes,

Copilot AI Apr 8, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Multi-hop prompt grouping relies on RetrievedNodeContext.subQuestionIndex, but buildMultiHopContextPackage currently builds contexts from retrievalResult.deduplicatedNodes without preserving which sub-question retrieved each node. Consider building a nodeId→subQuestionIndex map from retrievalResult.retrievalMetadata and assigning it when creating contexts.

Copilot uses AI. Check for mistakes.
Comment thread src/llm/prompt.ts
Comment on lines +153 to +156
const subQuestions = context.subQuestions ?? [];
const nodeByIndex = new Map<number, RetrievedNodeContext[]>();

for (const node of context.relatedNodes) {

Copilot AI Apr 8, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

buildMultiHopMessages only iterates over context.relatedNodes, so context.primaryNode is never included in any sub-question section. This can omit the most relevant node from the synthesis prompt; include primaryNode in the node-to-section mapping.

Copilot uses AI. Check for mistakes.
Comment thread src/cli/setup-wizard.ts
Comment on lines +58 to +62
const embeddingProvider = await select(rl, "Select embedding provider:", [
"local-hash (free, offline, fast but low quality)",
"onnx (local neural embeddings via Xenova/gte-small, requires download)",
"gemini (cloud, best quality, requires API key)"
]);

Copilot AI Apr 8, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The setup wizard advertises ONNX as using Xenova/gte-small, but OnnxEmbeddingProvider defaults to Xenova/all-MiniLM-L6-v2. Please align the wizard text/options with the actual default model (or update the provider default) so users download/configure the correct model.

Copilot uses AI. Check for mistakes.
Comment thread package.json
Comment on lines +3 to +10
"version": "1.0.3",
"description": "Standalone code retrieval and MCP server for multi-language repositories built on @abhinav2203/codeflow-core.",
"license": "Apache-2.0",
"type": "module",
"main": "dist/index.js",
"types": "dist/index.d.ts",
"bin": {
"coderag": "dist/cli.js"
"coderag": "dist/bin/coderag.js"

Copilot AI Apr 8, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR title/description focus on multi-hop retrieval, but this PR also changes packaging/release surface (version bump and CLI bin path changes). Consider splitting these release/packaging changes into a separate PR (or updating the PR description to reflect the broader scope) to reduce review and release risk.

Copilot uses AI. Check for mistakes.
Comment thread src/indexer/indexer.ts
Comment on lines +223 to 227
// Auto-install post-commit hook if not already present
await this.ensurePostCommitHook();

return buildIndexSummary(snapshot, manifest, documents);
});

Copilot AI Apr 8, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indexing now auto-installs a git post-commit hook as a side effect. Modifying a repo’s hooks implicitly can be surprising in CI/production or for users with stricter policies. Consider making this opt-in (config flag / explicit coderag init / setup wizard) or at least skipping auto-install in non-interactive contexts.

Copilot uses AI. Check for mistakes.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

Note

Due to the large number of review comments, Critical severity comments were prioritized as inline comments.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
src/indexer/gemini-embedder.ts (1)

137-145: ⚠️ Potential issue | 🟠 Major

Reject partial batch responses instead of returning [].

Normalizing a missing values field to an empty vector turns a malformed provider response into an invalid fixed-dimension embedding and can corrupt the next index/write. This should fail fast the same way embed() does. As per coding guidelines "Config files, env vars, MCP tool input, HTTP responses, and persisted snapshots must be schema-validated before use".

🩹 Suggested fix
-      return data.embeddings.map((emb) => emb.values ?? []);
+      return data.embeddings.map((emb) => {
+        if (!emb.values) {
+          throw new Error("Invalid response from Gemini API: missing embedding values");
+        }
+        return emb.values;
+      });
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/indexer/gemini-embedder.ts` around lines 137 - 145, The current mapping
normalizes missing embedding `values` to an empty array which can corrupt the
index; instead, validate the provider response and fail fast: after confirming
`data.embeddings` length matches `texts.length`, iterate `data.embeddings` and
for each entry (from `data.embeddings` / the `emb` items used in the final
`map`) verify `values` exists and is an array of numbers (and optionally has
expected dimensionality); if any `values` is missing or invalid, throw an Error
describing the malformed Gemini API response rather than returning `[]`,
otherwise return the array of `values`.
src/service/config.ts (1)

270-290: ⚠️ Potential issue | 🟠 Major

Preserve the auto-discovered config path.

This function already loaded an auto-discovered config earlier, but here configPath is only reconstructed from the explicit argument. Auto-discovered configs therefore come back with configPath: undefined even though a concrete file path was found and used.

🩹 Suggested fix
 export const loadCodeRagConfig = async (cwd: string, configPath?: string): Promise<CodeRagConfig> => {
-  const serializableConfig = await loadSerializableConfig(cwd, configPath);
-  const runtimeConfig = resolveRuntimeConfig(serializableConfig, cwd);
-  const resolvedConfigPath = configPath ? path.resolve(cwd, configPath) : undefined;
+  const resolvedConfigPath = await resolveConfigPath(cwd, configPath);
+  const serializableConfig = await loadSerializableConfig(cwd, configPath);
+  const runtimeConfig = resolveRuntimeConfig(serializableConfig, cwd);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/service/config.ts` around lines 270 - 290, The function loadCodeRagConfig
loses the auto-discovered config file path because resolvedConfigPath is only
built from the explicit configPath argument; use the path returned by
loadSerializableConfig when available. Update loadCodeRagConfig to set
resolvedConfigPath to configPath ? path.resolve(cwd, configPath) :
(serializableConfig.configPath || serializableConfig.resolvedPath || undefined)
so that if loadSerializableConfig returned a concrete file path (inspect
serializableConfig for its actual path property) that path is preserved in the
returned CodeRagConfig; keep existing validation and return structure unchanged.
README.md (1)

63-103: ⚠️ Potential issue | 🟡 Minor

Document how to enable multi-hop end-to-end.

The README adds the new embedding and service knobs, but the multi-hop surface introduced in this PR still isn't discoverable here: there's no multiHop config example, no CODERAG_MULTI_HOP_* env list, and no mention of the query/HTTP/MCP toggles. Please add the enablement path in the config and CLI/API sections so operators can actually turn the feature on.

Also applies to: 201-210

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@README.md` around lines 63 - 103, Add documentation for the new multi-hop
feature: describe the multiHop config block (e.g., multiHop.enable,
multiHop.mode/transports or multiHop.query/http/mcp toggles) and provide a
concrete example config and CLI/API enablement path, list the corresponding
environment variables (prefixed CODERAG_MULTI_HOP_*, e.g.,
CODERAG_MULTI_HOP_ENABLE, CODERAG_MULTI_HOP_TRANSPORTS,
CODERAG_MULTI_HOP_TIMEOUT) alongside the existing env list, and mention how to
enable/disable multi-hop via the CLI flags or service API toggles so operators
can turn the feature on/off and select transports (query/HTTP/MCP).
🟠 Major comments (21)
.qwen/reasoning/quality-gates/post-impl-20260407-154204/IMPLEMENTATION_REPORT.md-18-18 (1)

18-18: ⚠️ Potential issue | 🟠 Major

Replace absolute local path with a repo-relative path.

This line leaks local machine/user info and is not portable across environments.

🔧 Proposed fix
-**Change Entry:** /Users/abhinavnehra/git/CodeRag/prd/changes/20260407-154204.md
+**Change Entry:** prd/changes/20260407-154204.md
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
@.qwen/reasoning/quality-gates/post-impl-20260407-154204/IMPLEMENTATION_REPORT.md
at line 18, The "Change Entry" line contains an absolute local path that leaks
user/machine info; update the string
"/Users/abhinavnehra/git/CodeRag/prd/changes/20260407-154204.md" in the
IMPLEMENTATION_REPORT.md entry to a repo-relative reference (e.g.,
"changes/20260407-154204.md" or "./changes/20260407-154204.md") so the file is
portable and does not expose local file system details.
.qwen/reasoning/quality-gates/post-commit-20260406-182414/stage-01-linting.md-1-6 (1)

1-6: ⚠️ Potential issue | 🟠 Major

Remove this file — it's already excluded by .gitignore and should not be committed to version control.

The .qwen/ directory is explicitly listed in .gitignore (line 6), yet multiple auto-generated quality gate logs from this directory are present in the repository. This creates a policy violation: these timestamped artifacts (post-commit-20260406-154419, post-commit-20260406-182414, post-commit-20260406-183232, etc.) should be excluded from version control.

Delete the entire .qwen/reasoning/quality-gates/ directory from this commit. Quality gate results are better served by CI/CD systems, build artifacts, or external reporting—not version-controlled snapshots that become stale and accumulate clutter.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
@.qwen/reasoning/quality-gates/post-commit-20260406-182414/stage-01-linting.md
around lines 1 - 6, Remove the autogenerated quality-gate artifacts under
.qwen/reasoning/quality-gates by deleting the entire directory (including files
like post-commit-20260406-182414/stage-01-linting.md) from the commit and
repository; ensure you run git rm -r --cached .qwen/reasoning/quality-gates (or
git rm -r .qwen/reasoning/quality-gates if already tracked) and commit the
removal so the directory is no longer version-controlled, and verify .gitignore
already contains .qwen/ so future artifacts are not re-added.
.qwen/reasoning/quality-gates/post-impl-20260406-183922/IMPLEMENTATION_REPORT.md-18-18 (1)

18-18: ⚠️ Potential issue | 🟠 Major

Replace absolute local path with a repo-relative path.

Line 18 leaks machine-specific path details and is not portable across environments.

Proposed fix
-**Change Entry:** /Users/abhinavnehra/git/CodeRag/prd/changes/20260406-183922.md
+**Change Entry:** prd/changes/20260406-183922.md
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
@.qwen/reasoning/quality-gates/post-impl-20260406-183922/IMPLEMENTATION_REPORT.md
at line 18, The "Change Entry" line currently embeds an absolute
machine-specific path; replace that absolute path with a repo-relative path (for
example changes/20260406-183922.md or ./changes/20260406-183922.md) so the file
reference is portable. Edit the line that contains the Change Entry string in
IMPLEMENTATION_REPORT.md to remove the /Users/... absolute path and use the
repo-relative path instead, and verify the link/reference format remains valid
in the document.
claude-code/reasoning/quality-gates/post-commit-20260406-152438/stage-04-run-tests.md-3-3 (1)

3-3: ⚠️ Potential issue | 🟠 Major

Stage status is inconsistent with the captured test result.

The file says FAIL, but the output shows all test files/tests passed. This should be corrected to avoid false gate reporting.

Proposed fix
-**Status:** FAIL
+**Status:** PASS

Also applies to: 117-120

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@claude-code/reasoning/quality-gates/post-commit-20260406-152438/stage-04-run-tests.md`
at line 3, The document contains an incorrect test status label: update the
"Status: FAIL" token in the header to "Status: PASS" and also correct the
repeated occurrences around lines 117-120 so the file's stage status matches the
captured test results; ensure any other "FAIL" entries in this file that
correspond to the same stage are changed to "PASS" to avoid false gate
reporting.
.qwen/reasoning/quality-gates/post-commit-20260406-183232/stage-04-run-tests.md-12-190 (1)

12-190: ⚠️ Potential issue | 🟠 Major

Redact machine-specific paths and identifiers in committed test artifacts.

This log includes absolute local paths and unique request IDs, which are unnecessary in-repo and can leak environment identifiers. Please sanitize/redact these values before committing generated reports.

Suggested edit pattern
- RUN  v4.1.0 /Users/abhinavnehra/git/CodeRag
+ RUN  v4.1.0 <repo-root>

- {"level":"error","message":"CodeRag HTTP request failed.","requestId":"77e07521-e8e3-45e0-a588-821beecc4f35",...}
+ {"level":"error","message":"CodeRag HTTP request failed.","requestId":"<redacted>",...}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
@.qwen/reasoning/quality-gates/post-commit-20260406-183232/stage-04-run-tests.md
around lines 12 - 190, The committed test artifact contains machine-specific
values (absolute paths under "repoPath", UUID-like "requestId" values, and other
env-specific strings like indexedNode counts or local-hash tokens) that must be
redacted; update the test reporter/generator that produces these logs (look for
the code that emits "Indexed repository", "Auto-installing post-commit hook for
incremental indexing.", and the structured error logs with keys "requestId" and
"repoPath") to sanitize outputs by replacing absolute paths with a stable
placeholder (e.g. "<REPO_PATH>"), normalizing UUIDs/requestIds to "<REQUEST_ID>"
and hashing or removing other unique tokens (like local-hash values) via
regex-based redaction before writing the artifact, and add unit tests to assert
no raw absolute paths or UUID patterns remain in the saved report.
src/indexer/onnx-embedder.ts-22-24 (1)

22-24: ⚠️ Potential issue | 🟠 Major

Avoid process-global ONNX pipeline state.

pipelineInstance and initializedModelDir are shared across every OnnxEmbeddingProvider in the process, and they are only populated after the async load completes. Two cold-start embeds can therefore initialize the model twice, and one host can permanently lock later hosts to its modelDir.

💡 Suggested fix
 let pipelineInstance: ((input: string | string[]) => Promise<TensorLike>) | undefined = undefined;
 let initializedModelDir: string | undefined = undefined;
+let pipelinePromise: Promise<((input: string | string[]) => Promise<TensorLike>)> | undefined = undefined;

 const getPipeline = async (modelDir: string, logger?: Logger) => {
-  if (pipelineInstance) {
+  if (pipelinePromise) {
     if (initializedModelDir !== modelDir) {
       throw new ConfigurationError(
         "ONNX embedding provider model directory cannot be changed after initialization."
       );
     }
-    return pipelineInstance;
+    return pipelinePromise;
   }

-  const mod = await import("@xenova/transformers");
-
-  const modelPath = path.join(modelDir, DEFAULT_MODEL);
-  const hasLocalModel = await modelFilesExist(modelPath);
-
-  if (!hasLocalModel) {
-    logger?.info("ONNX embedding model not found locally, downloading to", { modelPath });
-  }
-
-  mod.env.allowRemoteModels = true;
-  mod.env.localModelPath = modelDir;
-  mod.env.backends.onnx.wasm.numThreads = 1;
-
-  const extractor = await mod.pipeline("feature-extraction", DEFAULT_MODEL, {
-    quantized: true
-  }) as (input: string | string[]) => Promise<TensorLike>;
-
-  pipelineInstance = extractor;
   initializedModelDir = modelDir;
-  return extractor;
+  pipelinePromise = (async () => {
+    const mod = await import("@xenova/transformers");
+    const modelPath = path.join(modelDir, DEFAULT_MODEL);
+    const hasLocalModel = await modelFilesExist(modelPath);
+
+    if (!hasLocalModel) {
+      logger?.info("ONNX embedding model not found locally, downloading to", { modelPath });
+    }
+
+    mod.env.allowRemoteModels = true;
+    mod.env.localModelPath = modelDir;
+    mod.env.backends.onnx.wasm.numThreads = 1;
+
+    const extractor = await mod.pipeline("feature-extraction", DEFAULT_MODEL, {
+      quantized: true
+    }) as (input: string | string[]) => Promise<TensorLike>;
+
+    pipelineInstance = extractor;
+    return extractor;
+  })().catch((error) => {
+    pipelinePromise = undefined;
+    pipelineInstance = undefined;
+    initializedModelDir = undefined;
+    throw error;
+  });
+
+  return pipelinePromise;
 };
Based on learnings: Repo analysis, embeddings, LLM transport, and persistence should stay behind focused interfaces so other hosts can plug CodeRag in later.

Also applies to: 33-66

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/indexer/onnx-embedder.ts` around lines 22 - 24, The code uses
process-global variables pipelineInstance and initializedModelDir which causes
cross-instance conflicts; change these to instance-scoped properties on the
OnnxEmbeddingProvider class (e.g., this.pipelineInstance,
this.initializedModelDir) and ensure the async model load/initialization is
performed per-instance (or guarded with an instance-level mutex/Promise) inside
the class constructor or an init method (e.g., loadModel or ensureInitialized)
so each OnnxEmbeddingProvider manages its own pipeline lifecycle and does not
overwrite or block other providers.
.qwen/reasoning/quality-gates/post-commit-20260406-183933/docs-context.txt-229-296 (1)

229-296: ⚠️ Potential issue | 🟠 Major

The embedded multi-hop path is still unreachable.

Per the PR objective, callers should be able to opt into the 3-stage pipeline, but this CodeRag.query() implementation never inspects options.multiHop, and the /v1/query schema/handler do not accept or forward that flag either. Requests will silently fall back to the old single-hop flow.

Also applies to: 604-608, 774-779

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.qwen/reasoning/quality-gates/post-commit-20260406-183933/docs-context.txt
around lines 229 - 296, The query method in CodeRag (query) never reads or
forwards the options.multiHop flag so the 3-stage multi-hop pipeline is
unreachable; update query to inspect options.multiHop and, when true and the
config supports multi-hop, enable the multi-hop path by passing the flag into
buildContextPackage (so it can assemble multi-hop context) and into
this.config.llmTransport.generate (so the transport can run the 3-stage flow),
and ensure the /v1/query request/schema and handler accept and forward multiHop
to the CodeRag.query call (also apply the same change sites referenced around
the other occurrences at the noted blocks).
src/indexer/onnx-embedder.ts-48-54 (1)

48-54: 🛠️ Refactor suggestion | 🟠 Major

Gate remote model downloads behind config.

When the local assets are missing, this path silently turns initialization into a networked fallback by always enabling remote models. That gives offline or pinned-model deployments no way to enforce local-only behavior.

As per coding guidelines: If a feature is correctly implemented but blocked by external platform constraints, gate it behind an optional config flag rather than removing it, and document the flag and its support status in README.md.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/indexer/onnx-embedder.ts` around lines 48 - 54, The current code
unconditionally enables remote model downloads by setting
mod.env.allowRemoteModels when hasLocalModel is false; change this to respect a
configuration flag (e.g., enableRemoteModelDownloads or
process.env.ENABLE_REMOTE_MODELS) so remote fallback is only enabled when the
flag is true and local assets are missing; only set mod.env.allowRemoteModels =
true if !hasLocalModel AND the config flag is enabled, keep
mod.env.localModelPath = modelDir as is, and add a README.md note documenting
the new flag and its supported behavior (offline/pinned deployments must set the
flag false).
.qwen/reasoning/quality-gates/post-commit-20260406-183933/docs-context.txt-379-391 (1)

379-391: ⚠️ Potential issue | 🟠 Major

Do not write repo-local .env entries into global process state.

The embedded src/service/config.ts helper mutates process.env, so the first repo loaded in a long-lived host can leak API keys and defaults into later CodeRag instances. Keep dotenv values local to this config resolution path instead of persisting them process-wide.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.qwen/reasoning/quality-gates/post-commit-20260406-183933/docs-context.txt
around lines 379 - 391, The loadDotEnv helper currently mutates global
process.env (see loadDotEnv, DOTENV_FILE, parseDotEnv, fileExists,
readTextFile); change it to not write to process.env but instead return a local
Record<string,string> (or undefined) of dotenv entries so callers can merge them
into a local config scope. Update loadDotEnv signature to
Promise<Record<string,string> | undefined>, populate and return the parsed
entries (do not call process.env[key] = value), and update any callers to accept
and use the returned map when resolving config rather than relying on
process.env side-effects.
claude-code/reasoning/quality-gates/post-commit-20260406-152438/stage-02-security.md-3-18 (1)

3-18: ⚠️ Potential issue | 🟠 Major

Status inconsistency: FAIL status contradicts the content.

The stage is marked FAIL (line 3), but the content shows:

  • npm audit found 0 vulnerabilities (line 9)
  • All 6 security checks are marked ✅ complete (lines 13-18)

The only issue mentioned is a deprecation warning about using --omit=dev instead of --production, which is not a security failure. If all checks passed and no vulnerabilities were found, the status should be PASS.

📋 Recommended fix
-**Status:** FAIL
+**Status:** PASS
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@claude-code/reasoning/quality-gates/post-commit-20260406-152438/stage-02-security.md`
around lines 3 - 18, Change the top-level status string "Status: FAIL" to
"Status: PASS" and ensure the narrative matches (e.g., keep the npm deprecation
warning as an informational note, not a failure); update any summary sentence or
metadata that asserts failure so the file's header and body consistently reflect
that all security checks passed and zero vulnerabilities were found (look for
the literal "Status: FAIL" and the npm audit block to make the change).
.qwen/reasoning/quality-gates/post-commit-20260406-154419/stage-04-run-tests.md-3-191 (1)

3-191: ⚠️ Potential issue | 🟠 Major

Status inconsistency: FAIL status contradicts passing test results.

The stage is marked FAIL (line 3), but the test execution output clearly shows:

  • Test Files 25 passed (25) (line 187)
  • Tests 203 passed (203) (line 188)
  • All test suites display ✓ passing marks

No test failures are present in the captured output. If all tests passed, the status should be PASS.

📋 Recommended fix
-**Status:** FAIL
+**Status:** PASS
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
@.qwen/reasoning/quality-gates/post-commit-20260406-154419/stage-04-run-tests.md
around lines 3 - 191, The status line ("Status: FAIL") is incorrect despite all
tests passing; update the document so the "Status:" header reflects the test
summary (change to "Status: PASS") and, if this file is generated, fix the
generation logic that sets the "Status:" token (look for the code that writes
"Status:" or the template that outputs stage-04-run-tests.md and the string
"Status:") to derive PASS/FAIL from the test summary counts (e.g., check "Test
Files .* passed" / "Tests .* passed" instead of hardcoding FAIL). Ensure the
"Status:" value is programmatically set to PASS when the parsed test summary
shows all tests passed.
.qwen/reasoning/quality-gates/post-impl-20260407-110256/IMPLEMENTATION_REPORT.md-18-18 (1)

18-18: ⚠️ Potential issue | 🟠 Major

Remove absolute local path from committed report metadata.

This leaks local environment/user info and is not portable. Store a repo-relative path instead.

Suggested fix
-**Change Entry:** /Users/abhinavnehra/git/CodeRag/prd/changes/20260407-110256.md
+**Change Entry:** prd/changes/20260407-110256.md
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
@.qwen/reasoning/quality-gates/post-impl-20260407-110256/IMPLEMENTATION_REPORT.md
at line 18, The "Change Entry:" metadata currently contains an absolute local
path ("/Users/abhinavnehra/.../changes/20260407-110256.md") in
.qwen/reasoning/quality-gates/post-impl-20260407-110256/IMPLEMENTATION_REPORT.md;
replace that absolute path with a repo-relative path (for example
"changes/20260407-110256.md" or "./changes/20260407-110256.md") so the file no
longer exposes local user/environment information and remains portable across
clones, and verify any code that reads this metadata (if any) expects and
resolves repo-relative paths instead of absolute ones.
.qwen/reasoning/quality-gates/post-commit-20260407-154349/stage-04-run-tests.md-11-263 (1)

11-263: ⚠️ Potential issue | 🟠 Major

Redact absolute local paths before committing generated logs.

Line 11 and subsequent lines expose local machine paths (for example /Users/...), which can leak user/workstation identifiers in repository history.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
@.qwen/reasoning/quality-gates/post-commit-20260407-154349/stage-04-run-tests.md
around lines 11 - 263, The generated test log contains absolute local paths
(e.g. user home and temp dirs shown in the run output) that must be redacted
before committing; update the log-producing step that writes the
stage-04-run-tests.md artifact to run a sanitizer that replaces absolute path
patterns (e.g. /Users/<username>/ and /var/folders/...) with a stable
placeholder like <REDACTED_PATH>, and add this sanitizer to the CI/job that
creates the file and/or a pre-commit hook so any new generated logs (including
the stage-04-run-tests.md output) are automatically scrubbed before commit.
src/bin/coderag.ts-2-3 (1)

2-3: ⚠️ Potential issue | 🟠 Major

Use file-URL conversion for dynamic import to avoid Windows breakage.

Line 18 imports a raw filesystem path constructed by path.join(). On Windows, this produces an absolute path like C:\...\cli.js, which Node's ESM loader rejects with ERR_UNSUPPORTED_ESM_URL_SCHEME (it only accepts file:, data:, and node: schemes). Convert cliPath with pathToFileURL(...).href before import() for cross-platform correctness.

Additionally, the usage text (lines 22–30) omits the --multi-hop flag added by this PR.

Proposed fix
-import { fileURLToPath } from "node:url";
+import { fileURLToPath, pathToFileURL } from "node:url";
 import path from "node:path";
@@
-const { runCli } = await import(cliPath);
+const { runCli } = await import(pathToFileURL(cliPath).href);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/bin/coderag.ts` around lines 2 - 3, The dynamic import is using a raw
filesystem path built with path.join (cliPath) which breaks on Windows; change
the runtime to import the file URL by calling pathToFileURL(cliPath).href
(import pathToFileURL from "node:url") and pass that into import(), and update
the CLI usage string (the usage/help text variable) to include the new
--multi-hop flag so the displayed options match the PR.
src/test/onnx-embedder.test.ts-1-4 (1)

1-4: ⚠️ Potential issue | 🟠 Major

Replace static imports with dynamic imports in test cases to ensure vi.resetModules() isolation is effective.

The static import at line 3 binds OnnxEmbeddingProvider before vi.resetModules() runs, so module-level state (pipelineInstance and initializedModelDir) cannot be reset between tests. If these tests or future tests call embed() or embedBatch(), singleton state will leak across cases. Dynamic imports after resetModules() ensure each test gets a fresh module instance.

🧪 Suggested fix
-import { OnnxEmbeddingProvider } from "../indexer/onnx-embedder.js";
 import { cleanupPaths, createTempDir } from "./helpers.js";

 describe("OnnxEmbeddingProvider auto-download", () => {
   beforeEach(() => {
     // Reset the module-level singleton by clearing the module cache
     vi.resetModules();
   });

   afterEach(async () => {
     // Clear any model cache that might have been created
     vi.resetModules();
   });

-  it("exposes the logger option in the config", () => {
+  it("exposes the logger option in the config", async () => {
+    const { OnnxEmbeddingProvider } = await import("../indexer/onnx-embedder.js");
     const logger = { debug: vi.fn(), info: vi.fn(), warn: vi.fn(), error: vi.fn() };
     const provider = new OnnxEmbeddingProvider({ logger });

   it("checks for model files before enabling remote download", async () => {
+    const { OnnxEmbeddingProvider } = await import("../indexer/onnx-embedder.js");
     const tmpDir = await createTempDir("coderag-onnx-test-");
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/test/onnx-embedder.test.ts` around lines 1 - 4, The tests statically
import OnnxEmbeddingProvider which locks module-level singletons
(pipelineInstance, initializedModelDir) before vi.resetModules() can run; change
the tests to use dynamic import(...) for "../indexer/onnx-embedder.js" after
calling vi.resetModules() in each test setup so each test obtains a fresh module
instance (then access OnnxEmbeddingProvider from the imported module) and verify
embed() / embedBatch() calls use the freshly imported provider to avoid
singleton state leakage.
src/store/vector-store.ts-117-127 (1)

117-127: 🛠️ Refactor suggestion | 🟠 Major

Redundant branching: both paths perform the same operation.

The if (nodeIds.length >= 100) branch and the subsequent code both execute the same logic: create a filter and call table.delete(filter). This appears to be leftover from a refactor.

♻️ Proposed fix: Remove redundant branching
   async deleteByNodeIds(nodeIds: string[]): Promise<void> {
     if (nodeIds.length === 0) {
       return;
     }

     const table = await this.getTable();
     if (!table) {
       return;
     }

-    if (nodeIds.length >= 100) {
-      // For large deletes, use native delete — much faster than reading all rows
-      const filter = createNodeIdFilter(nodeIds);
-      await table.delete(filter);
-      return;
-    }
-
-    // For small deletes, still use native delete
     const filter = createNodeIdFilter(nodeIds);
     await table.delete(filter);
   }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/store/vector-store.ts` around lines 117 - 127, The branch checking
nodeIds.length >= 100 is redundant because both branches call
createNodeIdFilter(nodeIds) and await table.delete(filter); remove the if/else
and simplify the method by unifying the logic: always call
createNodeIdFilter(nodeIds) and then await table.delete(filter) (references:
nodeIds, createNodeIdFilter, table.delete) so there’s a single path for deletes.
scripts/coderag-mcp-discover.js-14-19 (1)

14-19: ⚠️ Potential issue | 🟠 Major

Use import.meta.resolve() instead of require.resolve() in pure ESM context.

This file is pure ESM ("type": "module" in package.json), so require is not defined. The require.resolve() call at line 15 will throw ReferenceError: require is not defined. While the try-catch masks this, the code should use import.meta.resolve() (the ESM equivalent) rather than relying on silent error suppression. If CommonJS support is intentional, use createRequire from "module".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/coderag-mcp-discover.js` around lines 14 - 19, The current
synchronous call to require.resolve("@abhinav2203/coderag/package.json") will
fail in this ESM module; replace it with the ESM equivalent by using
import.meta.resolve("@abhinav2203/coderag/package.json") (and await it or use
top-level await), then use path.dirname on the resolved URL to build cli and
keep the fs.existsSync check; alternatively, if you need CommonJS semantics,
create a require via createRequire from "module"
(createRequire(import.meta.url).resolve(...)) and use that instead of
require.resolve—update the surrounding function to be async if you choose
import.meta.resolve.
src/cli/setup-wizard.ts-195-203 (1)

195-203: ⚠️ Potential issue | 🟠 Major

Don't persist secrets into coderag.config.json or overwrite the whole .env.

llm.apiKey is serialized into the JSON config before the .env step runs, and custom-endpoint keys never make it into .env at all. On top of that, the .env writer replaces the file with only the newly collected keys, which can delete unrelated entries already present in the repo. Keep secrets out of the JSON payload and merge/update targeted .env keys instead.

Also applies to: 213-240

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/cli/setup-wizard.ts` around lines 195 - 203, The llm.apiKey (and any
secrets like custom endpoint keys) are being included in the JSON config and
overwritten into the .env by replacing the whole file; instead omit secrets from
the JSON llm payload (remove apiKey/baseUrl-secret fields from the object
constructed for serialization) and persist them only into environment variables:
during the `.env` step read the existing .env, merge/update only the targeted
keys (e.g., LLM_API_KEY, LLM_BASE_URL, any provider-specific keys) rather than
replacing the file, and write back the merged content; apply this same change to
the other LLM-related construction/serialization spots (the llm object creation
and the subsequent environment-write logic referenced around lines 213-240) so
secrets never land in coderag.config.json and unrelated .env entries are
preserved.
src/cli/setup-wizard.ts-55-56 (1)

55-56: ⚠️ Potential issue | 🟠 Major

Rerunning setup currently resets an existing config to hard-coded defaults.

detectExistingConfig() is called, but none of its values feed the prompt defaults or guard the write path. Running coderag setup in an already-configured repo will silently drop custom retrieval, multi-hop, service, and header settings unless the user re-enters everything by hand.

Also applies to: 161-209

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/cli/setup-wizard.ts` around lines 55 - 56, detectExistingConfig() is
called but its values are not used, so re-running setup overwrites custom fields
with hard-coded defaults; update the prompt/default logic in setup-wizard.ts to
seed prompts with values from detectExistingConfig() (use the returned object to
populate default answers for retrieval, multi-hop, service, and headers) and
change the write path (the code that saves the new config) to merge the existing
config with only the fields the user actually changed or to prompt/require
confirmation before overwriting; reference detectExistingConfig(), the
prompt/default variables and the save/write function in this file to implement
merging or conditional overwrite so existing custom settings are preserved
unless explicitly changed by the user.
src/service/config.ts-72-84 (1)

72-84: ⚠️ Potential issue | 🟠 Major

Avoid mutating process.env during config load.

loadDotEnv() writes repo-local entries into the process-global environment and only does it once per key, so loading two repos in the same host can make the second config depend on whichever .env was loaded first. Keep .env values in a local overlay and merge them into this load instead. Based on learnings "Repo analysis, embeddings, LLM transport, and persistence should stay behind focused interfaces so other hosts can plug CodeRag in later".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/service/config.ts` around lines 72 - 84, loadDotEnv currently mutates the
global process.env which can leak repo-local .env values across loads; change
loadDotEnv to return a local overlay object instead of writing to process.env:
read DOTENV_FILE via readTextFile and parseDotEnv, build a plain record of
entries (using fileExists and parseDotEnv as now) where entries override only
undefined keys within that returned overlay, and update callers to merge this
overlay into their own repo-scoped config or into a provided target map rather
than relying on process.env; keep function name loadDotEnv and reuse
parseDotEnv/fileExists/readTextFile/DOTENV_FILE to locate and produce the
overlay.
src/indexer/gemini-embedder.ts-42-49 (1)

42-49: ⚠️ Potential issue | 🟠 Major

Guard outputDimensionality behind model support.

model is user-configurable via constructor config or environment variable (CODERAG_GEMINI_MODEL), defaulting to models/gemini-embedding-001. However, both buildEmbedRequest() (lines 42-49) and the batch request builder (lines 117-123) unconditionally include outputDimensionality. According to Gemini API documentation, outputDimensionality is not supported when using models/embedding-001; the field cannot be set for that model. Users who override the model to an older variant will trigger API errors. Either validate the selected model supports dimension reduction at initialization, or conditionally include the field only for compatible models.

Additionally, the batch response handler (line 145) returns emb.values ?? [] for missing embedding values, silently padding results with empty arrays. This violates the HTTP response validation requirement and can corrupt downstream indexing. Validate that all returned embeddings contain values before returning.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/indexer/gemini-embedder.ts` around lines 42 - 49, Guard inclusion of
outputDimensionality based on the configured model and validate batch responses:
update buildEmbedRequest (and the batch request builder) to only include
outputDimensionality when the selected model supports dimensionality (e.g.,
check this.model or config for a Gemini embedding model that permits
outputDimensionality and omit the field for models like models/embedding-001),
and change the batch response handling (the code that currently uses emb.values
?? []) to explicitly verify each response has a non-empty values array and throw
or return a clear error if any embedding is missing or malformed instead of
silently returning an empty array.
🧹 Nitpick comments (12)
.qwen/reasoning/quality-gates/pre-commit-20260406-183229/COMMIT_PLAN.md (1)

1-81: Consider excluding generated quality-gate session artifacts from version control.

This looks like a run-specific generated file (pre-commit-<timestamp>). If it isn’t intended as permanent documentation, add this path pattern to .gitignore (or write it to a temp/report location) to reduce repo noise.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.qwen/reasoning/quality-gates/pre-commit-20260406-183229/COMMIT_PLAN.md
around lines 1 - 81, The generated per-run quality-gate artifact COMMIT_PLAN.md
(created by the "pre-commit atomic stager hook" and using names like
pre-commit-<timestamp>) should be excluded from version control; add a gitignore
rule such as pre-commit-* or .qwen/reasoning/quality-gates/** to .gitignore (or
change the generator to write these reports to a temp/report directory) so these
transient, run-specific files are not committed; update the generator
configuration or the CI/pre-commit hook that writes COMMIT_PLAN.md to point at
the new temp location if applicable.
.qwen/reasoning/quality-gates/post-impl-20260407-154204/IMPLEMENTATION_REPORT.md (1)

1-28: Avoid committing volatile generated quality-gate artifacts.

This report is run-specific (timestamp + status) and will quickly go stale/noisy in git history. Prefer generating it in CI artifacts or excluding this folder from version control.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
@.qwen/reasoning/quality-gates/post-impl-20260407-154204/IMPLEMENTATION_REPORT.md
around lines 1 - 28, The generated IMPLEMENTATION_REPORT.md (produced by the
post-implementation hook) is a volatile, run-specific artifact and should not be
committed; remove the committed IMPLEMENTATION_REPORT.md from the repo, add an
appropriate ignore pattern to .gitignore to exclude the generator output, and
update CI to produce/upload the report as a build artifact instead of committing
it; also ensure the post-implementation hook still runs but writes to the CI
workspace/artifacts, not the repository (refer to the "IMPLEMENTATION_REPORT.md"
file and the "post-implementation hook" mention to locate the generator).
.qwen/reasoning/quality-gates/pre-commit-20260407-122928/COMMIT_PLAN.md (1)

1-81: Consider excluding session-generated quality-gate artifacts from PRs.

This appears to be an ephemeral workflow artifact, not product code/docs. Keeping these files out of commits (or ignoring this path) will reduce review noise.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.qwen/reasoning/quality-gates/pre-commit-20260407-122928/COMMIT_PLAN.md
around lines 1 - 81, The COMMIT_PLAN.md under .qwen/reasoning/... is an
ephemeral pre-commit atomic stager output and should be excluded from PRs;
update the repository ignore rules or the pre-commit config to prevent
committing these artifacts by adding the path pattern ".qwen/reasoning/**" (or
specifically ".qwen/reasoning/quality-gates/**") to .gitignore or to the
pre-commit hook configuration invoked by the "pre-commit atomic stager hook" so
future runs do not stage COMMIT_PLAN.md and similar session-generated files.
.qwen/reasoning/quality-gates/pre-commit-20260407-154209/COMMIT_PLAN.md (1)

1-81: Consider excluding generated quality-gate artifacts from version control.

This file appears to be an auto-generated template artifact from a pre-commit hook run (timestamped directory: pre-commit-20260407-154209). It contains an empty commit plan with 0 files and 0 proposed commits.

Committing timestamped generated artifacts can lead to repository bloat and potential merge conflicts. Unless these artifacts serve a specific audit or compliance purpose, consider adding the .qwen/reasoning/quality-gates/ directory (or at least the timestamped subdirectories) to .gitignore.

🗑️ Suggested .gitignore entry
+# Auto-generated quality-gate artifacts
+.qwen/reasoning/quality-gates/*/
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.qwen/reasoning/quality-gates/pre-commit-20260407-154209/COMMIT_PLAN.md
around lines 1 - 81, The generated COMMIT_PLAN.md artifact should not be
committed; remove COMMIT_PLAN.md from the staged changes and add a .gitignore
rule to exclude generated pre-commit quality-gate artifacts (e.g., a pattern
matching the timestamped pre-commit directories such as pre-commit-*/ or the
quality-gate folder), then commit the .gitignore update; if you need to keep
specific reports for audit, move those outputs to a dedicated tracked audit
directory or document the retention policy instead.
.qwen/reasoning/quality-gates/pre-commit-20260406-184327/COMMIT_PLAN.md (1)

84-98: Consider excluding transient session state from committed commit plans.

Current Status and session Commit History are quickly stale. Keeping only reusable planning content will make this doc more durable.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.qwen/reasoning/quality-gates/pre-commit-20260406-184327/COMMIT_PLAN.md
around lines 84 - 98, The COMMIT_PLAN.md currently includes transient
session-specific sections ("Current Status", "Commit History", and the
"Remaining Groups to Commit: GROUP_TESTS") that quickly become stale; remove or
move these ephemeral entries out of the committed document and retain only
durable planning content (e.g., the reusable steps, group definitions, and
commit rationale). Update the document to strip or template the "Current Status"
and "Commit History" blocks (or replace them with a generated/log-only section
excluded from commits) and ensure any references to GROUP_TESTS are preserved in
the durable plan rather than as a transient status line.
.qwen/reasoning/quality-gates/post-commit-20260406-183933/stage-04-run-tests.md (1)

5-194: Prefer storing a summarized/sanitized test report instead of full raw logs.

Committing full runner output here creates high-churn diffs and noisy history. Consider persisting only stable totals (files/tests/duration/status) and linking raw logs from CI artifacts.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
@.qwen/reasoning/quality-gates/post-commit-20260406-183933/stage-04-run-tests.md
around lines 5 - 194, The committed file contains full raw test runner logs;
replace that with a concise summary object (Test Files total, Tests total, Start
time, Duration, Status) and a short human-readable sentence linking to the CI
artifact for full logs. Update the
.qwen/reasoning/quality-gates/post-commit-20260406-183933/stage-04-run-tests.md
content to remove the large verbatim block and instead emit the stable totals
(e.g. "Test Files", "Tests", "Duration", "Start at") and a URL or artifact
reference; ensure any tooling that generates this file (or commit step) writes
only the summarized fields and a CI artifact link so the raw logs remain in CI
artifacts rather than git history.
src/store/manifest-store.ts (1)

3-4: Tie loadOptionalJson()'s schema type to its return type.

ZodTypeAny plus as Value throws away the compile-time guarantee this helper is supposed to centralize. A future caller can pair the wrong schema with Value and still compile.

♻️ Suggested refactor
-import { z, type ZodTypeAny } from "zod";
+import { z, type ZodSchema } from "zod";
@@
-const loadOptionalJson = async <Value>(filePath: string, schema: ZodTypeAny): Promise<Value | null> => {
+const loadOptionalJson = async <Value>(filePath: string, schema: ZodSchema<Value>): Promise<Value | null> => {
@@
-    return schema.parse(await readJson<unknown>(filePath)) as Value;
+    return schema.parse(await readJson<unknown>(filePath));

Also applies to: 22-29

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/store/manifest-store.ts` around lines 3 - 4, The helper loadOptionalJson
currently accepts a ZodTypeAny and casts its result to Value, losing
compile-time guarantees; change its signature to be generic like <S extends
ZodTypeAny>(schema: S, ...) and make its return type Promise<z.infer<S> |
undefined>, then use schema.parse/schema.safeParse (or schema.parseAsync) and
return the correctly typed parsed value instead of casting to Value; update all
callers to let the compiler infer z.infer<S> from the passed schema and remove
the "as Value" cast in loadOptionalJson and related functions (e.g., any wrapper
that previously returned Value).
src/llm/prompt.ts (1)

163-168: Nodes with out-of-bounds subQuestionIndex will be silently dropped.

If a node has subQuestionIndex greater than or equal to subQuestions.length, its data won't appear in any sub-question section. This could happen if the upstream context builder has a bug or if subQuestions is unexpectedly truncated.

Consider logging a warning or including orphaned nodes in a separate section to ensure no retrieved context is silently lost.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/llm/prompt.ts` around lines 163 - 168, subQuestionSections currently maps
nodeByIndex only for indices < subQuestions.length, silently dropping nodes
whose subQuestionIndex is out-of-bounds; update the logic around
subQuestionSections/nodeByIndex/formatSubQuestionSection to detect any entries
where the key >= subQuestions.length, log a warning (e.g., processLogger.warn or
console.warn) naming the offending subQuestionIndex and node ids, and append
those orphaned nodes into the output (either by creating an "Orphaned context"
section via formatSubQuestionSection with a sentinel label or by concatenating a
separate section generated from the orphaned node arrays) so no retrieved
context is lost.
src/mcp/server.ts (1)

12-31: Consider surfacing indexing failures gracefully.

If coderag.index() or coderag.reindex() throws, serveStdioMcpServer will reject and the MCP server won't start. This may be intentional (fail-fast), but if the goal is to start the server even when indexing fails, consider wrapping these calls in a try/catch and logging the error while allowing the server to proceed with potentially stale/missing data.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/mcp/server.ts` around lines 12 - 31, ensureIndexIsCurrent currently calls
coderag.index() and coderag.reindex() directly which will let errors bubble up
and prevent serveStdioMcpServer from starting; wrap the calls to coderag.index()
and coderag.reindex({ full: true }) in try/catch blocks inside
ensureIndexIsCurrent, log any caught errors with logger?.error (including the
error details and context), and allow the function to return without rethrowing
so serveStdioMcpServer can continue startup with potentially stale/missing
index.
src/retrieval/multi-hop.ts (2)

74-80: Unbounded parallelism may overwhelm resources for many sub-questions.

Promise.all executes all sub-question retrievals concurrently without limiting concurrency. If maxSubQuestions is set high (up to 10 per config), this could strain embedding providers or vector stores with rate limits.

Consider using a concurrency limiter (e.g., p-limit) or processing in smaller batches.

♻️ Example with bounded concurrency
+import pLimit from "p-limit";
+
+const DEFAULT_CONCURRENCY = 3;
+
 export const parallelRetrieve = async (
   subQuestions: string[],
   documents: Record<string, IndexedNodeDocument>,
   embeddingProvider: EmbeddingProvider,
   retrieval: RetrievalConfig,
   snapshot: GraphSnapshot,
   vectorStore: VectorStore | undefined,
-  expansionDepth: number
+  expansionDepth: number,
+  concurrency = DEFAULT_CONCURRENCY
 ): Promise<SubQuestionRetrievalResult[]> => {
+  const limit = pLimit(concurrency);
   const results = await Promise.all(
-    subQuestions.map((sq) =>
-      retrieveForSubQuestion(sq, documents, embeddingProvider, retrieval, snapshot, vectorStore, expansionDepth)
+    subQuestions.map((sq) =>
+      limit(() => retrieveForSubQuestion(sq, documents, embeddingProvider, retrieval, snapshot, vectorStore, expansionDepth))
     )
   );
   return results;
 };
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/retrieval/multi-hop.ts` around lines 74 - 80, The current code uses
Promise.all over subQuestions which can create unbounded parallelism and
overwhelm external services; replace the direct Promise.all call with a
bounded-concurrency approach (e.g., use p-limit or process subQuestions in
configurable batches) so that calls to retrieveForSubQuestion are run with a max
concurrency (introduce or use an existing maxConcurrency or
maxSubQuestionConcurrency config); wrap each call to retrieveForSubQuestion(sq,
...) with the limiter (or batch loop) to preserve return ordering and then await
all limited promises before returning the results.

100-126: deduplicatedNodes and expandedNodes appear redundant.

Both arrays receive identical nodes via the same logic (lines 104-114). If they serve distinct semantic purposes, consider clarifying with comments. Otherwise, one could be derived from the other or removed.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/retrieval/multi-hop.ts` around lines 100 - 126, The code is populating
both deduplicatedNodes and expandedNodes with the exact same nodes in the
for-loop (see results processing in the function handling retrieval/multi-hop),
so either remove one or make their semantics explicit: either (A) collapse to a
single array (e.g., keep deduplicatedNodes) and replace all uses of
expandedNodes with that single array, or (B) keep both but document/implement
distinct behavior (e.g., deduplicatedNodes = unique set using seen, while
expandedNodes retains original insertion order or includes duplicates for
expansion steps) and update the loop to only push into deduplicatedNodes when
unseen and into expandedNodes according to the clarified rule; also update any
other references to expandedNodes or deduplicatedNodes elsewhere to match the
chosen change.
.qwen/reasoning/quality-gates/post-commit-20260406-183933/test-context.txt (1)

529-545: LLM auto-detection priority may be unclear when multiple API keys are present.

When LLM is enabled but no baseUrl is set, the code checks OPENROUTER_API_KEY, then OPENAI_API_KEY, then ANTHROPIC_API_KEY in that order. If a user has multiple API keys set, only the first match is used. Consider documenting this precedence or logging which provider was auto-selected.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.qwen/reasoning/quality-gates/post-commit-20260406-183933/test-context.txt
around lines 529 - 545, The auto-detection block that mutates llmConfig
(starting from const llmConfig = { ...config.llm }; and the checks for
OPENROUTER_API_KEY / OPENAI_API_KEY / ANTHROPIC_API_KEY) silently picks the
first matching key; add a clear trace/log message when you set
llmConfig.baseUrl/apiKey/transport to record which provider was auto-selected
(use the project logger if available, otherwise console.warn) and include the
selected provider name and the precedence behavior; also update documentation or
a README to explicitly state the detection order (OPENROUTER -> OPENAI ->
ANTHROPIC) so users know which key wins when multiple are present.

Comment on lines +798 to +801
const createIndexHandler = (coderag: CodeRag): HttpRouteHandler => async (request, response, requestId) => {
const body = await readJsonBody(request, reindexBodySchema);
const result = await coderag.reindex({ full: body.full ?? false });
writeJson(request, response, 200, requestId, { data: result, requestId });

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

POST /v1/index is wired to the wrong service method.

The embedded src/service/http.ts handler calls coderag.reindex(...), so /v1/index and /v1/reindex currently do the same thing. A first-time index request should go through coderag.index() instead of reindex semantics.

💡 Suggested fix
 const createIndexHandler = (coderag: CodeRag): HttpRouteHandler => async (request, response, requestId) => {
   const body = await readJsonBody(request, reindexBodySchema);
-  const result = await coderag.reindex({ full: body.full ?? false });
+  const result = await coderag.index();
   writeJson(request, response, 200, requestId, { data: result, requestId });
 };
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
const createIndexHandler = (coderag: CodeRag): HttpRouteHandler => async (request, response, requestId) => {
const body = await readJsonBody(request, reindexBodySchema);
const result = await coderag.reindex({ full: body.full ?? false });
writeJson(request, response, 200, requestId, { data: result, requestId });
const createIndexHandler = (coderag: CodeRag): HttpRouteHandler => async (request, response, requestId) => {
const body = await readJsonBody(request, reindexBodySchema);
const result = await coderag.index();
writeJson(request, response, 200, requestId, { data: result, requestId });
};
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.qwen/reasoning/quality-gates/post-commit-20260406-183933/docs-context.txt
around lines 798 - 801, The POST /v1/index handler (createIndexHandler)
incorrectly calls coderag.reindex(...) causing /v1/index to behave like reindex;
change the call to use coderag.index(...) with the appropriate parameters (use
body.full as needed or remove if index() doesn't accept it), keeping the same
request parsing (readJsonBody with reindexBodySchema) and response path
(writeJson) so the handler invokes the correct service method (coderag.index)
instead of coderag.reindex.

Comment on lines +1 to +5
# Stage 1: Linting & Code Quality

**Status:** FAIL
**Tools Run:** 1

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify current TypeScript compilation status
npx tsc --noEmit --pretty

Repository: nehraa/CodeRag

Length of output: 893


Critical: Type-check fails, contradicting PR description claim of "clean" status.

The type-check verification shows exit code 2 with 2 errors, confirming the FAIL status. However, the current errors differ from those documented in the markdown report:

Current TypeScript Errors:

  • TS2688: Cannot find type definition file for 'node'
  • TS2688: Cannot find type definition file for 'vitest/globals'

Both are missing from the type library configuration in tsconfig.json (line 18). The markdown report's references to Embedder type errors in test-embedder-config.ts appear to be from an earlier check run and are now outdated.

Resolve the missing type definitions to unblock the build. The PR objectives claim is premature.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
@.qwen/reasoning/quality-gates/post-commit-20260406-184329/stage-01-linting.md
around lines 1 - 5, The build fails due to missing type definitions: update the
tsconfig.json "types" configuration to include "node" and "vitest/globals" (the
TS error TS2688 mentions these), and ensure the corresponding type packages are
available (e.g., add `@types/node` and vitest types as devDependencies or confirm
they are installed) so the TypeScript compiler can resolve those libs; modify
the "types" array entry and/or package.json devDependencies and re-run the
type-check to confirm the exit code becomes 0.

Comment thread coderag.config.json
Comment on lines +17 to +23
"llm": {
"enabled": true,
"transport": "openai-compatible",
"baseUrl": "https://openrouter.ai/api/v1",
"model": "stepfun/step-3.5-flash",
"apiKey": "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",
"timeoutMs": 45000

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Remove the committed OpenRouter credential before merge.

Line 22 contains a real-looking API key in a tracked file. That exposes a paid secret in repo history and every clone. Revoke or rotate it, then load the key from env or an untracked local config instead.

🔐 Suggested change
   "llm": {
     "enabled": true,
     "transport": "openai-compatible",
     "baseUrl": "https://openrouter.ai/api/v1",
     "model": "stepfun/step-3.5-flash",
-    "apiKey": "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",
     "timeoutMs": 45000
   }
🧰 Tools
🪛 Betterleaks (1.1.1)

[high] 22-22: Detected an OpenRouter API Key, which may expose access to multiple AI models through the OpenRouter gateway.

(openrouter-api-key)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@coderag.config.json` around lines 17 - 23, The committed file contains a real
API key in the llm.apiKey field; remove the secret from coderag.config.json,
revoke/rotate the exposed key, and change the configuration to load the key from
an environment variable or an untracked local config instead (update any code
that reads the llm config to read process.env.OPENROUTER_API_KEY or equivalent),
ensuring the llm object still has a placeholder reference but no hard-coded
secret.

76 reasoning/quality-gate files were accidentally tracked
despite .gitignore rule. Removed with --cached to keep
local copies but untrack them.

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
@nehraa nehraa merged commit 833db87 into main Apr 22, 2026
1 check passed
@nehraa nehraa deleted the feat/gemini-onnx-embedding-providers branch April 22, 2026 10:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants