Supermemory: An Open-Source Memory and Context Engine for AI Apps
With 26k+ GitHub stars and top scores on three AI memory benchmarks, Supermemory gives developers a single API for persistent memory, RAG, user profiles, and connectors — no vector DB plumbing required.
Every AI app faces the same wall: LLMs forget everything the moment a conversation ends. Bolting on memory means picking a vector store, wiring up embedding pipelines, deciding on chunking strategies, and hoping you got the retrieval logic right. Supermemory collapses that entire stack into one open-source engine and API — and it's currently trending at 26.2k stars on GitHub.
What It Actually Does
Supermemory describes itself as a "memory and context layer for AI," claiming the #1 spot on LongMemEval, LoCoMo, and ConvoMem — the three major benchmarks for AI memory quality. The core engine handles several things that developers typically wire together by hand:
- Fact extraction from conversations, with logic for temporal changes, contradictions, and automatic forgetting of stale information
- User profiles auto-maintained as a blend of stable facts and recent activity, returned in a single call in roughly 50 ms
- Hybrid search that merges RAG over a knowledge base with personalized memory in one query
- Connectors for Google Drive, Gmail, Notion, OneDrive, and GitHub with real-time webhook sync
- Multi-modal extractors for PDFs, images (OCR), video (transcription), and code (AST-aware chunking)
All of it runs through a single memory structure and ontology — no gluing together five different services.
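To make the first bullet concrete, here is a toy sketch of what "handling contradictions and forgetting stale information" can mean in practice. It is not Supermemory's algorithm, which the article does not describe. The `Fact` shape, the `ToyFactStore` class, the key format, and the TTL rule are all assumptions made for illustration: a newer fact on the same key replaces the older one, and a fact past its lifetime is treated as forgotten.

```typescript
// Toy model only: NOT Supermemory's actual extraction or forgetting logic.
// All names here (Fact, ToyFactStore, the key format) are hypothetical.

interface Fact {
  key: string;        // e.g. "user_123:favorite_language"
  value: string;
  observedAt: number; // epoch milliseconds
  ttlMs?: number;     // optional lifetime; undefined means "never expires"
}

class ToyFactStore {
  private facts = new Map<string, Fact>();

  // Contradiction handling: keep the most recently observed fact per key.
  upsert(fact: Fact): void {
    const existing = this.facts.get(fact.key);
    if (!existing || fact.observedAt >= existing.observedAt) {
      this.facts.set(fact.key, fact);
    }
  }

  // Forgetting: an expired fact is dropped and reported as unknown.
  get(key: string, now: number): string | undefined {
    const fact = this.facts.get(key);
    if (!fact) return undefined;
    if (fact.ttlMs !== undefined && now - fact.observedAt > fact.ttlMs) {
      this.facts.delete(key);
      return undefined;
    }
    return fact.value;
  }
}

const factStore = new ToyFactStore();
factStore.upsert({ key: "user_123:favorite_language", value: "JavaScript", observedAt: 1000 });
factStore.upsert({ key: "user_123:favorite_language", value: "TypeScript", observedAt: 2000 });
factStore.upsert({ key: "user_123:current_task", value: "debugging auth", observedAt: 2000, ttlMs: 500 });

// At time 3000 the newer preference wins and the short-lived task has expired.
const favoriteLanguage = factStore.get("user_123:favorite_language", 3000);
const currentTask = factStore.get("user_123:current_task", 3000);
```

A production engine would presumably resolve contradictions semantically rather than by exact key match. The sketch only shows the bookkeeping that an application otherwise has to write itself.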
The Developer API
For builders, integration is a few lines:
```bash
npm install supermemory
# or
pip install supermemory
```

```typescript
import Supermemory from "supermemory";

const client = new Supermemory();

// Store a fact, scoped to a user
await client.add({
  content: "User loves TypeScript and prefers functional patterns",
  containerTag: "user_123",
});

// Retrieve profile + relevant memories in one round-trip
const { profile, searchRes } = await client.recall({
  query: "what stack does this user prefer?",
  containerTag: "user_123",
});
```
Memory is scoped with containerTag values, letting you partition by user, project, client, or repository without any schema design. The pitch — "No vector DB config. No embedding pipelines. No chunking strategies" — is aimed squarely at developers who want persistent context without becoming infrastructure engineers.
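The scoping behavior can be pictured with a small in-memory model. This is an illustrative sketch, not the service's implementation: the `ScopedMemories` class, its naive keyword matching, and the `project_alpha` tag are assumptions. The only point is that a query made under one `containerTag` never sees memories stored under another.

```typescript
// Illustrative model of tag scoping; the real engine does this server-side
// and uses semantic search rather than the keyword match shown here.

type ScopedMemory = { content: string; containerTag: string };

class ScopedMemories {
  private items: ScopedMemory[] = [];

  add(memory: ScopedMemory): void {
    this.items.push(memory);
  }

  // Return contents in the given container that mention any query term.
  search(query: string, containerTag: string): string[] {
    const terms = query.toLowerCase().split(/\s+/).filter(Boolean);
    return this.items
      .filter((m) => m.containerTag === containerTag)
      .filter((m) => terms.some((t) => m.content.toLowerCase().includes(t)))
      .map((m) => m.content);
  }
}

const scoped = new ScopedMemories();
scoped.add({
  content: "User loves TypeScript and prefers functional patterns",
  containerTag: "user_123",
});
scoped.add({
  content: "Project uses TypeScript with strict mode",
  containerTag: "project_alpha", // hypothetical project-level tag
});

// The same query returns different results depending on the container.
const userHits = scoped.search("typescript", "user_123");
const projectHits = scoped.search("typescript", "project_alpha");
const strangerHits = scoped.search("typescript", "user_456");
```

Because the tag is just a string, the partitioning scheme (per user, per project, per repository) is a naming decision in application code rather than a schema migration.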
MCP and Tooling Ecosystem
Supermemory ships an MCP server for developers who want to give existing AI assistants persistent memory rather than building a new app:
```bash
npx -y install-mcp@latest https://mcp.supermemory.ai/mcp --client claude --oauth=yes
```
Swap claude for cursor, windsurf, vscode, and so on. Supported clients include Claude Desktop, Cursor, Windsurf, VS Code, Claude Code, OpenCode, OpenClaw, and Hermes. The MCP server exposes three tools:
- memory: save or explicitly forget information
- recall: semantic search over stored memories, plus a user-profile summary
- context: inject the full user profile at conversation start (in Cursor and Claude Code, /context triggers it)
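For teams that set up several editors at once, the client swap in the install command can be scripted. The sketch below only builds the command strings and does not run them. It uses the four client identifiers named in the article; the `installCommand` helper is a hypothetical convenience, and identifiers for the other supported clients are not listed in the article, so they are left out.

```typescript
// Client identifiers taken from the article; the full set accepted by
// install-mcp may be larger.
const mcpClients = ["claude", "cursor", "windsurf", "vscode"] as const;
type McpClient = (typeof mcpClients)[number];

// Build the install command for one client (hypothetical helper).
function installCommand(client: McpClient): string {
  return `npx -y install-mcp@latest https://mcp.supermemory.ai/mcp --client ${client} --oauth=yes`;
}

const commands = mcpClients.map(installCommand);
for (const command of commands) {
  console.log(command);
}
```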
Beyond MCP, the project ships first-party plugins for Claude Code, OpenCode, OpenClaw, and the Nous Research Hermes agent, all open source and built on the same underlying API.
Why This Matters for the AI-Native Stack
The memory problem in AI apps is well understood but surprisingly painful to solve well: naive vector search misses temporal context, user preferences drift, and knowledge bases go stale. If Supermemory's benchmark claims hold up, the engine handles these cases at a level competitive with research systems, while the API design keeps it accessible to developers without deep ML infrastructure experience.
The open-source licensing and self-hostable architecture address a real concern: memory is deeply personal data, and many teams are reluctant to pipe user context through a third-party black box indefinitely. With the source on GitHub and a clean API contract, developers can run it on their own infrastructure or swap it out without rewriting application logic — which is exactly the kind of escape hatch that earns trust in production systems.