Memory System

CortexPrism uses a 5-tier memory architecture with hybrid retrieval: FTS5 keyword search (BM25) is combined with cosine vector similarity, and the merged results are scored with exponential time decay.

Memory Architecture

retrieve(query, embedder)
  │
  ├── keywordSearch(query)   → FTS5 BM25 over episodic_memory + semantic_memory
  │
  ├── vectorSearch(embed)    → cosine similarity over stored embeddings
  │       embedding = embedder.embed(query) if embedder available
  │
  └── merge + decay-score
        score = raw_score × 2^(-age_days / half_life_days)
        sorted descending → top-K
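
The vector branch of the diagram above depends on cosine similarity between the query embedding and each stored embedding. The following is a minimal sketch of that measure, not CortexPrism's actual code; the function name `cosineSimilarity` and the choice to return 0 for zero-length vectors are assumptions made for illustration.

```typescript
// Cosine similarity between two equal-length vectors.
// Assumption: a zero-magnitude vector yields 0 rather than NaN.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("dimension mismatch: " + a.length + " vs " + b.length);
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) {
    return 0;
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```

Identical directions score 1, orthogonal vectors score 0, and opposite directions score -1, so a higher value means a closer match.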

Memory Tiers

| Tier | Database Table | Persistence | Content |
|---|---|---|---|
| Ephemeral (T1) | In-memory | Session only | Current conversation context |
| Episodic (T2) | episodic_memory | Cross-session | Turn summaries of user+agent exchanges |
| Semantic (T3) | semantic_memory | Long-term | Injected facts and knowledge |
| Archival (T4) | Compressed history | Permanent | Compressed/aggregated historical data |
| Reflection (T5) | reflection_memory | Permanent | LLM-extracted behavior patterns |

Storage

| Table | Type | Contents |
|---|---|---|
| episodic_memory | Row | Turn summaries of user+agent exchanges |
| semantic_memory | Row | Injected facts / knowledge |
| reflection_memory | Row | LLM-extracted behavior patterns |
| episodic_memory_fts | FTS5 | Virtual table for keyword search on episodic |
| semantic_memory_fts | FTS5 | Virtual table for keyword search on semantic |
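
As an illustration of how the two FTS5 virtual tables might be queried, the sketch below builds a parameterized BM25 query. It relies on two documented SQLite FTS5 behaviors: the `MATCH` operator and the `bm25()` auxiliary function, which returns numerically smaller values for better matches. The helper names `buildKeywordQuery` and `bm25ToScore`, the `bm25_rank` alias, and the negation used to get a higher-is-better score are assumptions, not part of CortexPrism.

```typescript
// The two FTS5 tables listed in the Storage table.
const FTS_TABLES = ["episodic_memory_fts", "semantic_memory_fts"];

// Builds a parameterized keyword query for one FTS5 table.
// Table names cannot be bound as SQL parameters, so only known names are accepted.
function buildKeywordQuery(ftsTable: string): string {
  if (FTS_TABLES.indexOf(ftsTable) === -1) {
    throw new Error("unknown FTS5 table: " + ftsTable);
  }
  return [
    "SELECT rowid, bm25(" + ftsTable + ") AS bm25_rank",
    "FROM " + ftsTable,
    "WHERE " + ftsTable + " MATCH ?", // first parameter: the FTS5 query string
    "ORDER BY bm25_rank",             // smaller bm25() value = better match
    "LIMIT ?",                        // second parameter: maximum number of rows
  ].join("\n");
}

// Assumption: flip the sign so that a higher score means a better match,
// which makes keyword scores easier to combine with cosine similarity.
function bm25ToScore(bm25Rank: number): number {
  return -bm25Rank;
}
```

The two `?` placeholders are left for whichever SQLite driver executes the statement; the driver call itself is intentionally not shown.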

Retrieval Algorithm

1. FTS5 keyword search across episodic_memory and semantic_memory
   → BM25 ranking scores

2. Cosine vector similarity (if embedder available)
   → embedding = embedder.embed(query)
   → cosine similarity against stored embeddings

3. Merge + decay scoring
   → score = raw_score × 2^(-age_days / half_life_days)
   → half_life_days: configurable per tier (default: 7 days)

4. Sort descending → return top-K results
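
Steps 3 and 4 can be sketched as follows. The decay formula and the 7-day default half-life come from the list above; everything else is an assumption made for illustration: the `RawHit` shape, the function names, the idea that keyword and vector scores are already on a comparable higher-is-better scale, and the rule that a memory found by both searches keeps its higher decayed score.

```typescript
interface RawHit {
  id: string;
  tier: "episodic" | "semantic";
  content: string;
  rawScore: number; // keyword or cosine score, higher is better (assumed normalized)
  ageDays: number;  // age of the memory in days
}

interface ScoredHit extends RawHit {
  score: number; // decayed score
}

// score = raw_score × 2^(-age_days / half_life_days)
function decayScore(rawScore: number, ageDays: number, halfLifeDays: number = 7): number {
  return rawScore * Math.pow(2, -ageDays / halfLifeDays);
}

// Merges both result lists, applies decay, and returns the top-K hits.
function mergeAndRank(
  keywordHits: RawHit[],
  vectorHits: RawHit[],
  topK: number,
  halfLifeDays: number = 7
): ScoredHit[] {
  const best: { [key: string]: ScoredHit } = {};
  const all = keywordHits.concat(vectorHits);
  for (let i = 0; i < all.length; i++) {
    const hit = all[i];
    const score = decayScore(hit.rawScore, hit.ageDays, halfLifeDays);
    const key = hit.tier + ":" + hit.id;
    const existing = best[key];
    // Assumption: a duplicate hit keeps whichever decayed score is higher.
    if (existing === undefined || score > existing.score) {
      best[key] = { ...hit, score: score };
    }
  }
  return Object.keys(best)
    .map((key) => best[key])
    .sort((a, b) => b.score - a.score)
    .slice(0, topK);
}
```

With the default half-life, a memory exactly 7 days old keeps half of its raw score (`decayScore(1, 7)` is `0.5`), and one 14 days old keeps a quarter.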

Embedding Providers

| Provider | Backend | Model | Use Case |
|---|---|---|---|
| OllamaEmbedder | Ollama /api/embeddings | Configurable | Local embedding generation |
| OpenAIEmbedder | OpenAI API | text-embedding-3-small | Cloud embedding generation |
| StubEmbedder | Deterministic hash | (none) | No external service needed (default) |
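
To show what a deterministic, hash-based embedder can look like, here is a minimal sketch that hashes each token into a fixed-size vector and L2-normalizes the result. It does not claim to reproduce StubEmbedder: the `Embedder` interface, the synchronous `embed` signature, the name `createHashingStubEmbedder`, the 64-dimension default, and the hashing scheme are all assumptions.

```typescript
// Hypothetical embedder contract; the real interface may be asynchronous.
interface Embedder {
  embed(text: string): number[];
}

// Simple deterministic 32-bit string hash (multiply by 31, add the char code).
function hashToken(token: string): number {
  let hash = 0;
  for (let i = 0; i < token.length; i++) {
    hash = (hash * 31 + token.charCodeAt(i)) >>> 0;
  }
  return hash;
}

// Creates an embedder that needs no external service:
// each token increments one bucket, then the vector is scaled to unit length.
function createHashingStubEmbedder(dims: number = 64): Embedder {
  if (dims <= 0 || Math.floor(dims) !== dims) {
    throw new Error("dims must be a positive integer");
  }
  return {
    embed(text: string): number[] {
      const vec: number[] = [];
      for (let i = 0; i < dims; i++) {
        vec.push(0);
      }
      const tokens = text.toLowerCase().split(/\s+/).filter((t) => t.length > 0);
      for (let i = 0; i < tokens.length; i++) {
        vec[hashToken(tokens[i]) % dims] += 1;
      }
      let norm = 0;
      for (let i = 0; i < dims; i++) {
        norm += vec[i] * vec[i];
      }
      norm = Math.sqrt(norm);
      // Empty input stays an all-zero vector.
      return norm === 0 ? vec : vec.map((v) => v / norm);
    },
  };
}
```

The same text always maps to the same vector, and texts that share tokens get a positive cosine similarity, which is enough for tests and offline use but carries no real semantic meaning.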

Memory Injection

injectMemory(systemPrompt, hits) prepends retrieved content to the system prompt:

--- Relevant Memory ---
[episodic] 2026-06-14: User: ... Assistant: ...
[semantic] CortexPrism uses SQLite WAL mode
---

Injecting only the top-K retrieved hits gives the agent relevant context from past interactions while keeping the prompt within token limits.
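
The formatting shown above can be reproduced with a few lines of code. This is a sketch under stated assumptions rather than the actual injectMemory: the `MemoryHit` shape, the optional `date` field, the blank line between the memory block and the prompt, and the name `injectMemorySketch` are hypothetical.

```typescript
// Hypothetical shape of a retrieved hit.
interface MemoryHit {
  tier: "episodic" | "semantic";
  content: string;
  date?: string; // e.g. "2026-06-14"; omitted for undated facts
}

// Prepends a "Relevant Memory" block to the system prompt.
// Returns the prompt unchanged when there are no hits.
function injectMemorySketch(systemPrompt: string, hits: MemoryHit[]): string {
  if (hits.length === 0) {
    return systemPrompt;
  }
  const lines = hits.map((hit) =>
    hit.date !== undefined
      ? "[" + hit.tier + "] " + hit.date + ": " + hit.content
      : "[" + hit.tier + "] " + hit.content
  );
  const block = ["--- Relevant Memory ---"].concat(lines, ["---"]).join("\n");
  return block + "\n\n" + systemPrompt;
}
```

Returning the prompt untouched when nothing was retrieved avoids sending an empty memory block to the model.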

Operations

  • store: Save information to a memory tier (episodic or semantic)
  • search: Query memories across tiers with hybrid retrieval
  • retrieve: Get specific memories by ID
  • forget: Remove memories
  • compress: Summarize and archive older memories
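
To make the store, retrieve, forget, and compress operations concrete, here is a small in-memory sketch. It is a toy model, not the CortexPrism API: the `createMemoryStore` factory, the record fields, the explicit timestamps, and the caller-supplied `summarize` function (standing in for whatever summarizer is really used) are all assumptions. The search operation is left out because it corresponds to the hybrid retrieval described under Retrieval Algorithm.

```typescript
type StorableTier = "episodic" | "semantic";

interface MemoryRecord {
  id: number;
  tier: StorableTier | "archival";
  content: string;
  createdAt: number; // milliseconds since epoch
}

function createMemoryStore() {
  let nextId = 1;
  let records: MemoryRecord[] = [];
  return {
    // store: save content to the episodic or semantic tier, returning its ID.
    store(tier: StorableTier, content: string, createdAt: number): number {
      const id = nextId++;
      records.push({ id: id, tier: tier, content: content, createdAt: createdAt });
      return id;
    },
    // retrieve: get one memory by ID, or undefined if it does not exist.
    retrieve(id: number): MemoryRecord | undefined {
      return records.filter((r) => r.id === id)[0];
    },
    // forget: remove a memory; returns whether anything was removed.
    forget(id: number): boolean {
      const before = records.length;
      records = records.filter((r) => r.id !== id);
      return records.length < before;
    },
    // compress: replace all non-archival records older than `olderThan`
    // with a single archival record produced by `summarize`.
    compress(
      olderThan: number,
      summarize: (contents: string[]) => string,
      now: number
    ): number | undefined {
      const old = records.filter((r) => r.tier !== "archival" && r.createdAt < olderThan);
      if (old.length === 0) {
        return undefined;
      }
      records = records.filter((r) => old.indexOf(r) === -1);
      const id = nextId++;
      const summary = summarize(old.map((r) => r.content));
      records.push({ id: id, tier: "archival", content: summary, createdAt: now });
      return id;
    },
  };
}
```

Passing timestamps in explicitly keeps the sketch deterministic; a real store would more likely read the clock itself and persist to the tables listed under Storage.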