Claude Code Plugin

Automatic persistent memory for Claude Code. No commands to learn, no manual saving -- just install the plugin and Claude remembers what you worked on across sessions.

The plugin is built entirely on Claude Code's own primitives: Hooks for lifecycle events, Skills for intelligent retrieval, and CLI for tool access. No MCP servers, no sidecar services, no extra network round-trips.


Without vs. With the Plugin

Every Claude Code session starts with a blank slate. Close a session and the context is gone -- Claude has no idea what you discussed yesterday, what decisions you made, or what code you touched. The memsearch plugin fixes this by summarizing each session to a markdown file and feeding the relevant entries back into later sessions.

sequenceDiagram
    participant You
    participant Claude as Claude Code

    rect rgb(60, 30, 30)
    note right of You: Without plugin
    You->>Claude: Monday: "Add Redis caching with 5min TTL"
    Claude->>You: Done
    note over Claude: Session ends. Context is gone.
    You->>Claude: Wednesday: "The /orders endpoint is slow"
    Claude->>You: Suggests solutions from scratch<br/>(forgot about the Redis cache from Monday)
    end

    rect rgb(20, 50, 30)
    note right of You: With plugin
    You->>Claude: Monday: "Add Redis caching with 5min TTL"
    Claude->>You: Done
    note over Claude: Plugin auto-summarizes to memory/2026-02-10.md
    You->>Claude: Wednesday: "The /orders endpoint is slow"
    note over Claude: Plugin injects: "Added Redis caching with 5min TTL..."
    Claude->>You: "We already have Redis caching --<br/>let me add /orders to it"
    end

Without the plugin, Claude treats every session as independent. You end up re-explaining context, re-describing past decisions, and watching Claude suggest solutions you already tried. This is especially painful on long-running projects where architectural decisions accumulate over weeks.

With the plugin, every conversation is automatically summarized and indexed. When you ask a question that benefits from historical context, Claude autonomously searches past sessions and retrieves relevant memories -- no manual intervention required. The result is a Claude that builds on its own past work instead of starting from scratch.


When Is This Useful?

  • Picking up where you left off. You debugged an auth issue yesterday but didn't finish. Today Claude remembers the root cause, which files you touched, and what you tried -- no re-explaining needed.

  • Recalling past decisions. "Why did we switch from JWT to session cookies?" Claude can trace back to the original conversation where the trade-offs were discussed, thanks to the 3-layer progressive disclosure that drills from summary to full section to original transcript.

  • Long-running projects. Over days or weeks of development, architectural context accumulates automatically. Claude stays aware of your codebase conventions, past refactors, and resolved issues without you having to maintain a manual changelog.

  • Multi-session debugging. A bug that spans multiple sessions -- first investigation, then a failed fix, then the real fix -- is fully tracked. Claude can recall the entire debugging timeline and avoid repeating failed approaches.

  • Onboarding and handoff. When a new team member picks up a project, the accumulated memory provides a narrative of what was built, why, and what was tried along the way.


Key Differentiators

| | memsearch |
| --- | --- |
| Zero intervention | Capture and recall are fully automatic -- no commands, no manual saves |
| Forked subagent recall | Memory search runs in an isolated forked context (context: fork), keeping your main conversation clean |
| Hybrid search | Dense vectors + BM25 sparse search fused with RRF -- better recall than vector-only |
| Transparent storage | Plain .md files you can read, edit, grep, and commit to git |
| No API key required | ONNX bge-m3 runs locally on CPU by default |
| No MCP overhead | Pure hooks + skills -- no tool definitions consuming context tokens |
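
The fusion step named above, Reciprocal Rank Fusion (RRF), is easy to illustrate on its own. The sketch below is not memsearch's code: the function name `rrf_fuse`, the constant `k=60` (a commonly used default), and the document IDs are all assumptions chosen for illustration. It shows how a dense ranking and a BM25 ranking could be merged into one list.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse several ranked lists of document IDs with Reciprocal Rank Fusion.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is an assumed, commonly used constant.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID so the order is deterministic.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical result lists from a dense search and a BM25 search.
dense_hits = ["redis-cache", "orders-endpoint", "auth-bug"]
bm25_hits = ["orders-endpoint", "jwt-decision", "redis-cache"]

fused = rrf_fuse([dense_hits, bm25_hits])
print(fused)
```

Documents that rank well in both lists rise to the top, which is why a fused ranking can recover hits that either search alone would place too low.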

Key Features

  • Zero-config capture -- conversations are automatically summarized and saved after each turn
  • Semantic recall -- Claude automatically searches past sessions when your question needs historical context
  • Three-layer progressive disclosure -- search, expand, and drill into original transcripts
  • Forked subagent -- memory recall runs in an isolated context, keeping your main conversation clean
  • ONNX embedding by default -- no API key required, runs locally on CPU
  • Markdown is the source of truth -- human-readable, git-friendly, portable
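
To make the "markdown is the source of truth" idea concrete, here is a minimal sketch of appending session summaries to a daily file. It is an illustration under assumptions, not the plugin's implementation: the function name `append_summary`, the date heading, and the bullet layout are hypothetical. Only the date-named .md file under a memory directory comes from the description on this page.

```python
import tempfile
from datetime import date
from pathlib import Path


def append_summary(memory_dir, summary_bullets, day=None):
    """Append bullets to memory_dir/YYYY-MM-DD.md, creating the file if needed."""
    day = day or date.today()
    memory_dir = Path(memory_dir)
    memory_dir.mkdir(parents=True, exist_ok=True)
    path = memory_dir / f"{day.isoformat()}.md"
    is_new = not path.exists()
    with path.open("a", encoding="utf-8") as f:
        if is_new:
            # Assumed layout: one heading per day, then one bullet per note.
            f.write(f"# {day.isoformat()}\n\n")
        for bullet in summary_bullets:
            f.write(f"- {bullet}\n")
    return path


# Usage: write into a throwaway directory instead of a real project.
with tempfile.TemporaryDirectory() as tmp:
    memory_dir = Path(tmp) / ".memsearch" / "memory"
    path = append_summary(
        memory_dir,
        ["Added Redis caching with 5min TTL"],
        day=date(2026, 2, 10),
    )
    print(path.name)
    print(path.read_text(encoding="utf-8"))
```

Because the result is an ordinary text file, it can be inspected with grep, edited by hand, or committed to git like any other project file.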

Pages

  • Installation -- install from marketplace or source, first-time setup
  • How It Works -- architecture, hooks, capture mechanism, memory storage
  • Memory Recall -- three-layer progressive disclosure, comparisons, tips
  • Troubleshooting -- debug mode, common issues, diagnostic commands

Comparison with claude-mem

claude-mem is another memory solution for Claude Code. Both projects solve the same problem -- giving Claude persistent memory across sessions -- but take fundamentally different architectural approaches.

| Aspect | memsearch | claude-mem |
| --- | --- | --- |
| Architecture | 4 shell hooks + 1 skill + 1 watch process | 5 JS hooks + 1 skill + MCP tools + Express worker service (port 37777) |
| Memory recall | Skill in forked subagent -- intermediate results stay isolated | Skill + MCP hybrid -- tool definitions permanently consume context tokens |
| Session capture | 1 async claude -p --model haiku call at session end | AI observation compression on every tool use (PostToolUse hook) |
| Vector backend | Milvus -- hybrid search (dense + BM25 + RRF) | ChromaDB -- dense only; SQLite FTS5 for keyword search (separate, not fused) |
| Embedding model | Pluggable: OpenAI, Google, Voyage, Ollama, ONNX (default: bge-m3 int8) | Fixed: all-MiniLM-L6-v2 (384-dim, WASM backend) |
| Storage format | Transparent .md files -- human-readable, git-friendly | SQLite database + ChromaDB binary |
| Data portability | Copy .memsearch/memory/*.md and rebuild index | Export from SQLite + ChromaDB |
| Runtime dependency | Python (memsearch CLI) + claude CLI | Node.js / Bun + Express worker service |
| Context window cost | No MCP tool definitions; skill runs in forked context -- only curated summary enters main context | MCP tool definitions permanently loaded + each MCP tool call/result consumes main context |

The Key Difference: Forked Subagent vs. MCP Tools

Both projects use hooks for session lifecycle and skills for memory recall. The architectural divergence is in how retrieval interacts with the main context window.

memsearch runs memory recall in a forked subagent (context: fork). The memory-recall skill gets its own isolated context window -- all search, expand, and transcript operations happen there. Only the curated summary is returned to the main conversation. This means: (1) intermediate search results never pollute the main context, (2) multi-step retrieval is autonomous, and (3) no MCP tool definitions consume context tokens.
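
The isolation property described above can be illustrated with a toy model. Everything in this sketch is hypothetical: the `Memory` type, the in-memory `STORE`, the naive word-overlap `search`, and the `recall` function stand in for the real skill, which runs in a forked subagent against a vector index. The point is only the shape of the flow: all three layers run inside one function, and a single short string is what the caller sees.

```python
from dataclasses import dataclass


@dataclass
class Memory:
    summary: str      # layer 1: what a search hit shows
    section: str      # layer 2: the full markdown section
    transcript: str   # layer 3: the original conversation


# Made-up sample data for illustration only.
STORE = [
    Memory(
        summary="Added Redis caching with 5min TTL",
        section="Added Redis caching with 5min TTL. Cache keys expire automatically.",
        transcript="User: Add Redis caching with 5min TTL\nClaude: Done",
    ),
    Memory(
        summary="Switched from JWT to session cookies",
        section="Switched from JWT to session cookies after weighing the trade-offs.",
        transcript="User: Why did we switch from JWT?\nClaude: We discussed the trade-offs.",
    ),
]


def search(query, store):
    """Layer 1: rank memories by naive word overlap (stand-in for hybrid search)."""
    words = set(query.lower().split())
    scored = [(len(words & set(m.summary.lower().split())), m) for m in store]
    ranked = sorted(scored, key=lambda pair: -pair[0])
    return [m for score, m in ranked if score > 0]


def recall(query, store, want_transcript=False, budget=200):
    """Run search, expand, and (optionally) transcript; return one short string."""
    hits = search(query, store)                 # layer 1: search
    if not hits:
        return "No relevant memory found."
    best = hits[0]
    detail = best.section                       # layer 2: expand
    if want_transcript:
        detail = best.transcript                # layer 3: transcript
    return detail[:budget]                      # only this reaches the caller


print(recall("is there redis caching already", STORE))
```

The intermediate hit list and any expanded text stay local to `recall`, which mirrors how the forked context keeps search results out of the main conversation.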

claude-mem combines a mem-search skill with MCP tools (search, timeline, get_observations, save_memory). The MCP tools give Claude explicit control over memory access in the main conversation, at the cost of tool definitions permanently consuming context tokens. The PostToolUse hook also records every tool call as an observation, providing richer per-action granularity but incurring more API calls.

The other key difference is storage philosophy: memsearch treats markdown files as the source of truth (human-readable, git-friendly, rebuildable), while claude-mem uses SQLite + ChromaDB (opaque but structured, with richer queryable metadata).
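
"Rebuildable" means the search index is derived data: delete it and the markdown files are enough to recreate it. The sketch below shows that idea in miniature. It is an assumption-laden illustration, not memsearch's indexer: splitting at heading lines, the `chunk_markdown` and `rebuild_index` names, and the dictionary used as an "index" are all hypothetical, and no embeddings are computed.

```python
import re
import tempfile
from pathlib import Path


def chunk_markdown(text):
    """Split markdown into chunks, starting a new chunk at each heading line."""
    chunks, current = [], []
    for line in text.splitlines():
        if re.match(r"#{1,6} ", line) and current:
            chunks.append("\n".join(current).strip())
            current = []
        current.append(line)
    if current:
        chunks.append("\n".join(current).strip())
    return [c for c in chunks if c]


def rebuild_index(memory_dir):
    """Re-derive {(filename, chunk_number): chunk_text} from the .md files alone."""
    index = {}
    for path in sorted(Path(memory_dir).glob("*.md")):
        text = path.read_text(encoding="utf-8")
        for i, chunk in enumerate(chunk_markdown(text)):
            index[(path.name, i)] = chunk
    return index


# Usage: build a throwaway memory directory, then rebuild from it.
with tempfile.TemporaryDirectory() as tmp:
    sample = "# 2026-02-10\n- Added Redis caching\n## Follow-ups\n- Check /orders\n"
    (Path(tmp) / "2026-02-10.md").write_text(sample, encoding="utf-8")
    index = rebuild_index(tmp)
    for key, chunk in index.items():
        print(key, repr(chunk))
```

A real index would embed each chunk before storing it, but the dependency direction is the same: files first, index second.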


Comparison with Claude's Native Memory

Claude Code has built-in memory features: CLAUDE.md files for project instructions and auto-memory (~/.claude/projects/.../memory/) for remembered facts. Here is why memsearch provides a stronger solution:

| Aspect | Claude Native Memory | memsearch |
| --- | --- | --- |
| Storage | Single CLAUDE.md + small per-project memory files | Unlimited daily .md files with full session history |
| Recall mechanism | File loaded at session start (no search) | Skill-based semantic search -- Claude auto-invokes when context is needed |
| Granularity | One monolithic file, manually edited | Per-session bullet points, automatically generated |
| Search | None -- Claude reads the whole file or nothing | Hybrid semantic search (dense + BM25) returning top-k relevant chunks |
| History depth | Limited to what fits in one file | Unlimited -- every session is logged, every entry is searchable |
| Automatic capture | /memory command requires manual intervention | Fully automatic -- hooks capture every session |
| Progressive disclosure | None -- entire file loaded into context | 3-layer model (search, expand, transcript) minimizes context usage |
| Deduplication | Manual -- user must avoid adding duplicates | SHA-256 content hashing prevents duplicate embeddings |
| Portability | Tied to Claude Code's internal format | Standard markdown files, usable with any tool or platform |
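
The deduplication row can be illustrated with Python's standard `hashlib`. This is a minimal sketch of content hashing in general, not memsearch's code: the helper names `content_hash` and `new_chunks`, the whitespace stripping, and the in-memory set of seen hashes are assumptions for illustration.

```python
import hashlib


def content_hash(chunk):
    """Return the SHA-256 hex digest of a chunk's text (whitespace-trimmed)."""
    return hashlib.sha256(chunk.strip().encode("utf-8")).hexdigest()


def new_chunks(chunks, seen_hashes):
    """Return only chunks whose hash is not in seen_hashes, updating the set."""
    fresh = []
    for chunk in chunks:
        digest = content_hash(chunk)
        if digest not in seen_hashes:
            seen_hashes.add(digest)
            fresh.append(chunk)
    return fresh


# Usage: the second pass skips text that was already embedded.
seen = set()
first_pass = new_chunks(["Added Redis caching", "Fixed auth bug", "Added Redis caching"], seen)
second_pass = new_chunks(["Fixed auth bug", "Switched to session cookies"], seen)
print(first_pass)
print(second_pass)
```

Only chunks that pass this filter would need to be sent to the embedding model, so re-indexing unchanged files costs hashing time rather than embedding time.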

Why This Matters

CLAUDE.md is a blunt instrument: it loads the entire file into context at session start, regardless of relevance. As the file grows, it wastes context window on irrelevant information and eventually hits size limits. There is no search -- Claude cannot selectively recall a specific decision from three weeks ago.

Claude's auto-memory (the ~/.claude/projects/.../memory/ system) is better but still limited. It stores discrete facts, not session narratives. It has no semantic search -- memories are loaded based on recency, not relevance. And it only works within Claude Code, so memories are not portable to other platforms.

memsearch addresses both limitations with skill-based semantic search and progressive disclosure. When Claude judges that historical context would help, it auto-invokes the memory-recall skill, which runs in a forked subagent and autonomously searches, expands, and curates relevant memories. History can keep growing without bloating the context window, because the vector index does the filtering and only the most relevant chunks are returned. And the three-layer model (search → expand → transcript) runs entirely in the subagent, keeping the main context window clean.