Remind - Development Guide for AI Agents

This guide is for AI agents developing Remind itself. For using Remind as a memory layer, see docs/AGENTS.md.

Project Overview

Remind is a generalization-capable memory layer for LLMs. It differs from simple RAG by extracting and maintaining generalized concepts from episodic experiences, mimicking how human memory consolidates specific events into abstract knowledge.

Core insight: Episodes → Consolidation (LLM-powered "sleep") → Concepts with relations

Architecture

src/remind/
├── models.py          # Data models (Concept, Episode, Entity, Relation, Topic)
├── store.py           # SQLite persistence layer
├── interface.py       # MemoryInterface - main public API
├── config.py          # Configuration loading (config file, env vars, defaults)
├── consolidation.py   # LLM-powered episode → concept transformation
├── extraction.py      # Entity/type extraction from episodes
├── retrieval.py       # Spreading activation retrieval
├── reranker.py        # Optional cross-encoder reranking (requires [rerank] extra)
├── triage.py          # Auto-ingest: buffered intake + density scoring
├── cli.py             # Command-line interface (project-aware)
├── mcp_server.py      # MCP (Model Context Protocol) server
├── background.py      # Background consolidation spawning
├── background_worker.py # Subprocess entry point for background consolidation
├── api/               # REST API for web UI
│   ├── __init__.py    # Exports api_routes
│   └── routes.py      # Starlette route handlers
├── static/            # Web UI assets (compiled)
│   ├── index.html     # Entry point
│   └── assets/        # CSS/JS bundles
└── providers/         # LLM and embedding provider implementations
    ├── base.py        # Abstract base classes
    ├── anthropic.py   # Claude
    ├── openai.py      # OpenAI
    ├── azure_openai.py # Azure OpenAI
    └── ollama.py      # Local models via Ollama

Key Abstractions

Data Models (models.py)

| Model | Purpose |
|-------|---------|
| Episode | Raw experience/interaction. Temporary, gets consolidated. |
| Concept | Generalized knowledge with confidence, relations, conditions. Has concept_type: pattern, fact_cluster, or legacy. |
| ConceptType | Enum: LEGACY, PATTERN, FACT_CLUSTER. Determines how concepts are created and displayed. |
| Entity | External referent (file, person, concept, tool). Format: type:name |
| Relation | Typed edge between concepts (implies, contradicts, specializes, etc.) |
| Topic | Named knowledge area grouping episodes/concepts. Has id (slug), name, description. |

Episode types: observation, decision, question, meta, preference, outcome, fact

Episode lifecycle: Created via remember() or ingest() → Entity extraction → Consolidation → Marked consolidated

Fact episodes: Specific factual assertions (config values, names, dates, concrete technical details). Consolidation preserves fact details verbatim in concept summaries rather than generalizing them away.

Outcome episode metadata: strategy (approach used), result (success/failure/partial), prediction_error (low/medium/high)

Entity ID format: type:name (e.g., file:src/auth.ts, person:alice, concept:caching)

Providers (providers/base.py)

Two abstract base classes:

from abc import ABC, abstractmethod

class LLMProvider(ABC):
    @abstractmethod
    async def complete(self, prompt, system, temperature, max_tokens) -> str: ...
    @abstractmethod
    async def complete_json(self, prompt, system, temperature, max_tokens) -> dict: ...
    @property
    def name(self) -> str: ...

class EmbeddingProvider(ABC):
    @abstractmethod
    async def embed(self, text: str) -> list[float]: ...
    @abstractmethod
    async def embed_batch(self, texts: list[str]) -> list[list[float]]: ...
    @property
    def dimensions(self) -> int: ...
    @property
    def name(self) -> str: ...

Adding a new provider: Implement these interfaces. See ollama.py for a complete example with error handling.
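
A minimal sketch of what a new provider might look like against the LLMProvider interface above; the class name, endpoint, and response shape are illustrative, not part of the codebase:

import json
import httpx

from remind.providers.base import LLMProvider  # module path assumed from the layout above

class ExampleProvider(LLMProvider):
    """Hypothetical provider wrapping a generic chat-completions style HTTP API."""

    def __init__(self, api_key: str, model: str = "example-model"):
        self._api_key = api_key
        self._model = model

    @property
    def name(self) -> str:
        return f"example/{self._model}"

    async def complete(self, prompt, system=None, temperature=0.7, max_tokens=1024) -> str:
        # Real providers (see ollama.py) also handle retries and rate limiting here.
        async with httpx.AsyncClient() as client:
            resp = await client.post(
                "https://api.example.invalid/v1/chat",
                headers={"Authorization": f"Bearer {self._api_key}"},
                json={"model": self._model, "system": system, "prompt": prompt,
                      "temperature": temperature, "max_tokens": max_tokens},
            )
            resp.raise_for_status()
            return resp.json()["text"]

    async def complete_json(self, prompt, system=None, temperature=0.7, max_tokens=1024) -> dict:
        return json.loads(await self.complete(prompt, system, temperature, max_tokens))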

Memory Interface (interface.py)

The main entry point. Key design decisions:

  1. remember() is synchronous and fast - no LLM calls, just stores episode
  2. ingest() is async with LLM triage - buffers, extracts episodes, immediately consolidates
  3. consolidate() does all LLM work - extraction + generalization
  4. Two consolidation modes: Automatic (threshold-based) or manual (hook-based)
# Consolidation happens in two phases:
# 1. Extract entities/types from unextracted episodes
# 2. Generalize episodes into concepts (the "sleep" process)
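
A hedged usage sketch of this flow; create_memory() is the factory mentioned later in this guide, but the exact keyword arguments and method signatures shown here are assumptions:

import asyncio
from remind.interface import create_memory  # factory referenced under "Adding a New Provider"

async def main():
    memory = create_memory()  # assumed to pick up config from ~/.remind/remind.config.json

    # Fast path: synchronous, no LLM call, just stores the episode.
    memory.remember("We chose SQLite for local-first storage")

    # Async path: buffer + LLM triage + immediate consolidation.
    await memory.ingest("Long transcript of a debugging session ...")

    # The "sleep" process: entity extraction + generalization into concepts.
    await memory.consolidate()

asyncio.run(main())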

Consolidation (consolidation.py)

The "brain" of the system. Uses dual-track processing:

Fact track (no LLM generalization):

  • Fact episodes are clustered by shared entity
  • Existing fact_clusters are looked up via entity recall
  • Facts are preserved verbatim in specifics field
  • Conflicts are detected and flagged (not auto-resolved)

Pattern track (LLM generalization):

  • Classify episode types (observation, decision, question, meta, preference, outcome)
  • Extract entity mentions from natural language
  • Identify patterns across episodes
  • Create generalized concepts with relations
  • Detect contradictions
  • Identify causal patterns in outcome episodes (strategy-outcome relations)
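
A rough, illustrative sketch of the two-track split; the helper names and episode attributes here (episode_type, entity_ids) are assumptions, not the actual consolidation.py internals:

from collections import defaultdict

def split_tracks(episodes):
    """Route fact episodes to clustering; send everything else to the LLM pattern track."""
    facts = [e for e in episodes if e.episode_type == "fact"]
    patterns = [e for e in episodes if e.episode_type != "fact"]
    return facts, patterns

def cluster_facts_by_entity(fact_episodes):
    """Group fact episodes by shared entity so details stay verbatim in the concept's specifics."""
    clusters = defaultdict(list)
    for episode in fact_episodes:
        for entity_id in episode.entity_ids:
            clusters[entity_id].append(episode)
    return clusters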

Auto-Ingest Triage (triage.py)

The "input selection" subsystem. Two classes:

  • IngestionBuffer — character-threshold accumulator. Buffers raw text until threshold (~4000 chars).
  • IngestionTriager — LLM-based episode extraction. The LLM decides directly what's worth remembering, extracts distilled episodes, and detects action-result pairs as outcome episodes. A density score (0.0-1.0) is produced for diagnostics but doesn't gate extraction.

Topics: Topics are explicit and optional. When ingest() is called with an explicit topic, all extracted episodes are stamped with that topic. When topic is omitted, episodes get topic_id=None. Sub-chunks are triaged concurrently, bounded by llm_concurrency.

Instructions: The instructions parameter (optional) lets callers steer what the triage LLM extracts. When provided, instructions are appended to the triage system prompt as a prioritized directive. This enables focused ingestion (e.g. "extract only architectural decisions", "capture all config values"). Threaded through the full pipeline: ingest()/flush_ingest() → _process_ingest_chunk() → _triage_sub_chunk() → IngestionTriager.triage(). Also serialized in the background queue JSON payload for CLI background workers.

Pipeline: ingest() → buffer → triage + extract (LLM) → remember() → consolidate(force=True)
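
For example, a focused ingestion call might look like the following; topic and instructions are the parameters described above, while the surrounding variable names are illustrative:

await memory.ingest(
    raw_terminal_output,
    topic="auth-refactor",  # explicit topic: every extracted episode is stamped with it
    instructions="extract only architectural decisions and config values",
)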

Retrieval (retrieval.py)

Hybrid recall with spreading activation + entity-based episode retrieval:

  1. Query is embedded and matched to concepts via cosine similarity (native vector index when available)
  2. Embedding scores are optionally fused with keyword overlap scores (hybrid_keyword_weight config, default 0.3)
  3. Matched concepts activate related concepts through the graph
  4. Activation spreads with decay over multiple hops
  5. Optional cross-encoder reranking blends activation scores with query-document relevance (reranking_enabled config, requires [rerank] extra)
  6. Highest-activation concepts are returned with source episodes (including type labels and entity context)

Key class: MemoryRetriever with ActivatedConcept results.

Helper function: _keyword_score(query, text) — normalized token overlap for hybrid scoring.

Reranking (reranker.py): Optional cross-encoder post-processing. When reranking_enabled=True, a Reranker wrapping sentence-transformers CrossEncoder rescores candidates after spreading activation. Blending: 0.4 × activation + 0.6 × rerank_score. Model is lazy-loaded on first recall. Requires pip install "remind-mcp[rerank]".
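
The blending described above reduces to weighted sums; a minimal sketch under the stated defaults (hybrid_keyword_weight=0.3, 0.4/0.6 rerank blend), assuming the fusion is a simple convex combination:

def hybrid_score(embedding_sim: float, keyword_overlap: float, keyword_weight: float = 0.3) -> float:
    # Fuse cosine similarity with normalized token overlap (see _keyword_score).
    return (1 - keyword_weight) * embedding_sim + keyword_weight * keyword_overlap

def blended_rerank_score(activation: float, rerank_score: float) -> float:
    # Cross-encoder blend: 0.4 × activation + 0.6 × rerank_score.
    return 0.4 * activation + 0.6 * rerank_score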

Store (store.py)

Multi-database persistence via SQLAlchemy Core. Supports SQLite (default), PostgreSQL, and MySQL. Tables:

  • concepts - Stores concepts with JSON-serialized relations
  • episodes - Raw episodes with consolidation status
  • entities - Entity registry
  • mentions - Episode-entity junction table
  • relations - Concept-to-concept graph edges
  • entity_relations - Entity-to-entity relationships
  • metadata - Persistent key-value store

Vector search backends (auto-detected at startup):

  • sqlite-vec (SQLite): vec0 virtual tables (vec_concepts, vec_episodes) with cosine distance KNN. Extension is loaded via sqlite_vec.load() on connection creation.
  • pgvector (PostgreSQL): vector(N) columns (embedding_vec) with HNSW indexes and <=> cosine distance operator.
  • Brute-force fallback: If neither extension is available, falls back to Python-side numpy cosine similarity (O(n) per query).

Embedding dimensions are recorded in the metadata table on first write. Vector tables/columns are created lazily when the first embedding is stored.

The MemoryStore ABC defines the interface. SQLAlchemyMemoryStore is the concrete implementation (aliased as SQLiteMemoryStore for backward compatibility).
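
The brute-force fallback amounts to a cosine-similarity scan in Python; a minimal sketch of that idea (function and variable names are illustrative, not the store's actual methods):

import numpy as np

def brute_force_top_k(query_vec, candidates, k=10):
    """candidates: iterable of (concept_id, embedding) pairs loaded from the concepts table."""
    q = np.asarray(query_vec, dtype=np.float32)
    q_norm = np.linalg.norm(q) + 1e-12
    scored = []
    for concept_id, emb in candidates:
        v = np.asarray(emb, dtype=np.float32)
        sim = float(np.dot(q, v) / (q_norm * (np.linalg.norm(v) + 1e-12)))
        scored.append((sim, concept_id))
    scored.sort(reverse=True)
    return scored[:k]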

Configuration (config.py)

Centralized configuration management with provider-specific settings:

from dataclasses import dataclass, field
from pathlib import Path
from typing import Optional

REMIND_DIR = Path.home() / ".remind"
CONFIG_FILE = REMIND_DIR / "remind.config.json"

@dataclass
class AnthropicConfig:
    api_key: Optional[str] = None
    model: str = "claude-sonnet-4-20250514"
    ingest_model: Optional[str] = None

@dataclass
class OpenAIConfig:
    api_key: Optional[str] = None
    base_url: Optional[str] = None
    model: str = "gpt-4.1"
    embedding_model: str = "text-embedding-3-small"
    ingest_model: Optional[str] = None

@dataclass
class AzureOpenAIConfig:
    api_key: Optional[str] = None
    base_url: Optional[str] = None  # e.g. https://myresource.openai.azure.com (/openai/v1 appended automatically)
    deployment_name: Optional[str] = None
    embedding_deployment_name: Optional[str] = None
    embedding_size: int = 1536
    ingest_deployment_name: Optional[str] = None

@dataclass
class OllamaConfig:
    url: str = "http://localhost:11434"
    llm_model: str = "llama3.2"
    embedding_model: str = "nomic-embed-text"
    ingest_model: Optional[str] = None

@dataclass
class RemindConfig:
    llm_provider: str = "anthropic"
    embedding_provider: str = "openai"
    consolidation_threshold: int = 5
    concepts_per_pass: int = 64
    auto_consolidate: bool = True
    extraction_batch_size: int = 50
    extraction_llm_batch_size: int = 10
    consolidation_batch_size: int = 25
    llm_concurrency: int = 3
    # Database URL (SQLAlchemy format; None = SQLite default)
    db_url: Optional[str] = None
    # Auto-ingest settings
    ingest_buffer_size: int = 4000
    # Retrieval tuning
    hybrid_keyword_weight: float = 0.3
    recall_initial_candidates: int = 10
    # Reranking (requires `pip install "remind-mcp[rerank]"`)
    reranking_enabled: bool = False
    reranking_model: str = "cross-encoder/ms-marco-MiniLM-L-6-v2"
    # Logging
    logging_enabled: bool = False
    # Episode types (controls which types are valid + gates CLI/MCP features)
    episode_types: list[str] = field(default_factory=lambda: list(DEFAULT_EPISODE_TYPES))
    # Nested provider configs
    anthropic: AnthropicConfig = field(default_factory=AnthropicConfig)
    openai: OpenAIConfig = field(default_factory=OpenAIConfig)
    azure_openai: AzureOpenAIConfig = field(default_factory=AzureOpenAIConfig)
    ollama: OllamaConfig = field(default_factory=OllamaConfig)

def load_config() -> RemindConfig:
    """Priority: env vars > config file > defaults"""

def resolve_db_url(db_name: Optional[str], project_aware: bool = False) -> str:
    """Resolve a database name/path/URL to a SQLAlchemy database URL.
    - Full URL (postgresql://..., mysql://...): returned as-is
    - db_name provided: sqlite:///~/.remind/{name}.db
    - db_name=None + project_aware=True: sqlite:///<cwd>/.remind/remind.db
    - db_name=None + project_aware=False: sqlite:///~/.remind/memory.db
    """

def resolve_db_path(db_name: Optional[str], project_aware: bool = False) -> str:
    """Legacy function - resolves to a file path (SQLite only)."""

Database URL (db_url): Supports any SQLAlchemy-compatible database URL. Env var: REMIND_DB_URL. Examples:

  • SQLite (default): sqlite:///~/.remind/memory.db
  • PostgreSQL: postgresql+psycopg://user:pass@localhost:5432/remind
  • MySQL: mysql+pymysql://user:pass@localhost:3306/remind

Config file format (~/.remind/remind.config.json):

{
 "llm_provider": "anthropic",
 "embedding_provider": "openai",
 "consolidation_threshold": 5,
 "auto_consolidate": true,
 "hybrid_keyword_weight": 0.3,
 "logging_enabled": false,
 "cli_output_mode": "table",
 "db_url": null,
 "anthropic": { "api_key": "sk-ant-..." },
 "openai": { "api_key": "sk-..." },
 "azure_openai": { "api_key": "...", "base_url": "..." },
 "ollama": { "url": "http://localhost:11434" }
}

cli_output_mode may be table (default), json, or compact-json (minimal id/title/summary for browse commands).

Background Consolidation (background.py, background_worker.py)

Non-blocking consolidation for CLI:

  • spawn_background_consolidation() - Spawns a subprocess to consolidate
  • Uses filelock for cross-platform file locking to prevent concurrent runs
  • Lock files stored at ~/.remind/.consolidate-{hash}.lock
  • Logs to ~/.remind/logs/consolidation.log

The CLI automatically triggers background consolidation after remember when the episode threshold is reached.
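
A sketch of the locking pattern, using the filelock package; the lock-file naming mirrors the description above, while the function itself is illustrative:

from pathlib import Path
from filelock import FileLock, Timeout

def try_background_consolidation(db_hash: str, run_consolidation) -> bool:
    """Run consolidation only if no other process holds the lock for this database."""
    lock_path = Path.home() / ".remind" / f".consolidate-{db_hash}.lock"
    try:
        with FileLock(str(lock_path), timeout=0):  # fail fast if another run is active
            run_consolidation()
            return True
    except Timeout:
        return False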

Code Conventions

Async/Await

  • All LLM operations are async
  • remember() is deliberately sync (fast path)
  • Use asyncio.run() in CLI, native async in MCP server

Type Hints

  • Full type hints on all public APIs
  • Use Optional[T] explicitly, not T | None, for consistency
  • Dataclasses with field(default_factory=...) for mutable defaults

Error Handling

  • Providers handle their own retries and rate limiting
  • Store operations raise on critical errors, return None for "not found"
  • Consolidation is fault-tolerant (individual episode failures don't abort)

Logging

  • Use logging.getLogger(__name__) pattern
  • Debug for operation traces, Info for consolidation events, Warning for recoverable issues

JSON Serialization

  • All models have a to_dict() method and a from_dict() class method
  • Enums serialize to their string value
  • Datetimes serialize to ISO format
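
A minimal sketch of this convention; the model and field names below are illustrative, not the actual Remind models:

from dataclasses import dataclass
from datetime import datetime
from enum import Enum

class ExampleType(str, Enum):
    OBSERVATION = "observation"

@dataclass
class ExampleModel:
    id: str
    kind: ExampleType
    created_at: datetime

    def to_dict(self) -> dict:
        return {
            "id": self.id,
            "kind": self.kind.value,                    # enums serialize to their string value
            "created_at": self.created_at.isoformat(),  # datetimes serialize to ISO format
        }

    @classmethod
    def from_dict(cls, data: dict) -> "ExampleModel":
        return cls(
            id=data["id"],
            kind=ExampleType(data["kind"]),
            created_at=datetime.fromisoformat(data["created_at"]),
        )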

Testing

Tests are in tests/ using pytest. Key patterns:

import os
import tempfile
import pytest
from remind.store import SQLiteMemoryStore

# Temporary database fixture
@pytest.fixture
def store():
    fd, path = tempfile.mkstemp(suffix=".db")
    os.close(fd)
    store = SQLiteMemoryStore(path)
    yield store
    os.unlink(path)

Run tests:

pytest                      # All tests
pytest tests/test_store.py  # Specific file
pytest -v                   # Verbose

Note: Tests requiring LLM calls should use mocks or be marked as integration tests.
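
One way to keep LLM-dependent tests hermetic is a stub provider with canned responses; the class below is a test-only sketch, not part of the codebase:

class StubLLMProvider:
    """Satisfies the LLMProvider interface without making network calls."""

    def __init__(self, json_payload: dict):
        self._json_payload = json_payload

    @property
    def name(self) -> str:
        return "stub"

    async def complete(self, prompt, system=None, temperature=0.0, max_tokens=None) -> str:
        return "stub completion"

    async def complete_json(self, prompt, system=None, temperature=0.0, max_tokens=None) -> dict:
        return self._json_payload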

Common Development Tasks

Adding a New Provider

  1. Create providers/newprovider.py
  2. Implement LLMProvider and/or EmbeddingProvider
  3. Add to providers/__init__.py exports
  4. Add to interface.py factory map in create_memory()
  5. Update CLI in cli.py if needed
  6. Document in README.md

Adding a New Episode Type

  1. Add to EpisodeType enum in models.py
  2. Update extraction prompt in extraction.py
  3. Update consolidation prompts in consolidation.py
  4. Add MCP tool if type-specific querying is useful

Note: episode_types config controls which types are active. Custom types (strings not in EpisodeType enum) are also supported.

Adding a New Entity Type

  1. Add to EntityType enum in models.py
  2. Update extraction prompt in extraction.py
  3. No other changes needed (entities are dynamically typed)

Adding a New Relation Type

  1. Add to RelationType enum in models.py
  2. Update consolidation prompts to use new relation
  3. Consider retrieval implications (spreading activation weights)

Adding a New MCP Tool

  1. Add function to mcp_server.py using FastMCP decorators
  2. Document in docs/AGENTS.md
  3. Test via MCP client
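
A hedged sketch of step 1 with FastMCP decorators; the tool name, helper, and method calls below are illustrative and the real server wiring in mcp_server.py may differ:

from typing import Optional
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("remind")  # in practice, reuse the server instance defined in mcp_server.py

@mcp.tool()
async def count_unconsolidated(db: Optional[str] = None) -> int:
    """Illustrative tool: report how many episodes are still awaiting consolidation."""
    memory = get_memory(db)  # hypothetical helper; mcp_server.py obtains MemoryInterface its own way
    episodes = await memory.get_unconsolidated_episodes()  # assumed method, for illustration only
    return len(episodes)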

Adding a New REST API Endpoint

  1. Add route handler function to api/routes.py
  2. Add route to api_routes list at bottom of file
  3. Use _get_memory_from_request() helper to get MemoryInterface
  4. Return JSONResponse for data or StreamingResponse for SSE
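
A sketch of steps 1–4 for a trivial endpoint; the handler and URL are made up, while _get_memory_from_request() and api_routes are the existing names described above:

from starlette.requests import Request
from starlette.responses import JSONResponse
from starlette.routing import Route

async def get_health(request: Request) -> JSONResponse:
    memory = _get_memory_from_request(request)  # existing helper in api/routes.py
    return JSONResponse({"status": "ok", "memory_loaded": memory is not None})

# Added to the api_routes list at the bottom of api/routes.py:
api_routes = [
    # ... existing routes ...
    Route("/api/v1/health", get_health, methods=["GET"]),
]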

The REST API uses Starlette and serves the web UI. Endpoints:

  • GET /api/v1/stats - Memory statistics
  • GET /api/v1/concepts - Paginated concepts list
  • GET /api/v1/concepts/{id} - Concept detail with source episodes
  • GET /api/v1/episodes - Paginated episodes with filters
  • GET /api/v1/episodes/{id} - Episode detail
  • GET /api/v1/entities - All entities with mention counts
  • GET /api/v1/entities/{id} - Entity detail
  • GET /api/v1/entities/{id}/episodes - Episodes mentioning entity
  • GET /api/v1/entities/{id}/concepts - Concepts derived from entity's episodes
  • GET /api/v1/graph - Full concept graph for D3 visualization
  • POST /api/v1/query - Execute recall query
  • POST /api/v1/chat - Streaming chat with memory context (SSE)
  • GET /api/v1/databases - List available databases

Development Setup

# Clone and install in development mode
git clone <repo>
cd remind
python -m venv .venv
source .venv/bin/activate
pip install -e ".[dev]"

# Set up environment
cp .env.example .env
# Edit .env with API keys

# Run tests
pytest

# Run CLI
remind --help

# Run MCP server
remind-mcp --port 8765

Using uv (Recommended)

uv is a fast Python package manager that simplifies development:

# Install dependencies and run tests
uv run pytest

# Run CLI commands
uv run remind --help
uv run remind remember "Some observation"
uv run remind recall "query"

# Run MCP server
uv run remind-mcp --port 8765

With uv, you don't need to manually create a virtual environment or install dependencies - it handles everything automatically.

Shell commands

IMPORTANT: Prefix all shell commands, including 'uv' ones, with 'oo', e.g. 'oo uv run pytest'.

Design Principles

  1. Separation of concerns: Storage, providers, consolidation, retrieval are independent
  2. Fast path for writes: remember() never blocks on LLM
  3. Batch LLM work: Consolidation processes multiple episodes together
  4. Graceful degradation: Missing embeddings fall back to keyword matching
  5. Provider agnostic: Core logic doesn't depend on specific LLM/embedding provider
  6. Explicit over implicit: Episode types/entities can be auto-detected or manually specified

Important Files for Context

When making changes, these files are most commonly modified together:

| Change Type | Files |
|-------------|-------|
| New data structure | models.py, store.py |
| New provider | providers/newprovider.py, providers/__init__.py, interface.py |
| Consolidation logic | consolidation.py, extraction.py |
| Retrieval behavior | retrieval.py, reranker.py, config.py |
| Auto-ingest pipeline | triage.py, llm_protocol.py, interface.py, config.py |
| CLI commands | cli.py |
| Configuration | config.py, interface.py, mcp_server.py, cli.py |
| Background consolidation | background.py, background_worker.py, cli.py |
| MCP tools | mcp_server.py, docs/AGENTS.md |
| REST API endpoints | api/routes.py |
| Public API | interface.py, __init__.py, README.md |

Debugging Tips

  • Consolidation issues: Check episode entities_extracted and consolidated flags
  • Retrieval misses: Verify concepts have embeddings (embedding field not None)
  • Entity linking: Entity IDs are case-sensitive, use canonical forms
  • MCP issues: Check db query parameter in MCP URL, verify server is running
  • Config issues: Check ~/.remind/remind.config.json exists and is valid JSON
  • Background consolidation: Check ~/.remind/logs/consolidation.log for errors
  • CLI database path: Without --db, uses <cwd>/.remind/remind.db (project-aware). Accepts database URLs (e.g. --db postgresql+psycopg://...)

Database Schema

CREATE TABLE concepts (
    id TEXT PRIMARY KEY,
    data JSON NOT NULL  -- Serialized Concept
);

CREATE TABLE episodes (
    id TEXT PRIMARY KEY,
    data JSON NOT NULL  -- Serialized Episode
);

CREATE TABLE entities (
    id TEXT PRIMARY KEY,
    data JSON NOT NULL  -- Serialized Entity
);

CREATE TABLE mentions (
    episode_id TEXT,
    entity_id TEXT,
    PRIMARY KEY (episode_id, entity_id)
);

The store handles JSON serialization/deserialization transparently.