1. The file-based memory trap
Two kinds of posts keep going viral in AI developer communities. The first: someone sharing their elaborate Obsidian vault with /resume, /compress, and /preserve skills that hand off context between Claude Code sessions. The second: a developer showing their session-34 handoff file, one of 43 markdown documents carefully maintained to simulate continuity across conversations.
The ingenuity is real. Developers are building genuine systems here — numbered sessions, archive files, frontmatter metadata, bash scripts that auto-compress context at session end. These aren't hacks. They're engineering solutions to a real infrastructure problem.
The instinct is exactly right: AI agents need memory. Without it, every session starts from zero. You re-explain your stack, re-establish conventions, re-share the decisions you made three sprints ago. At 30 minutes of context-rebuilding per day, that's 2.5 hours a week — per developer.
But the implementation doesn't scale. File-based memory works for one person on one project. The moment you add a second project or a second developer, or want to ship a product where your users' agents need memory, the markdown approach collapses.
If you've ever built a /resume skill, a CLAUDE.md file longer than 50 lines, or an Obsidian vault for your AI agent — this article is for you.
2. Why CLAUDE.md breaks at scale
CLAUDE.md is read every session in full. It is a static context injection file — not a retrieval system. When Claude Code starts, the entire contents of CLAUDE.md are prepended to the context window before you type a single word. This is useful for stable project configuration, but it was never designed to be a memory layer.
Here's what breaks in practice:
- No semantic search: you can't query "what did we decide about auth?" — the whole file is injected regardless of relevance to the current task
- No growth without manual editing: someone has to maintain it — new decisions don't get added automatically
- Context window cost: a 300-line CLAUDE.md wastes tokens on irrelevant info every single call, across every session
- Single-user by design: one file serves one agent, with no multi-tenancy, so you cannot use this pattern to give each of your product's users their own memory
- No recency: a decision from 6 months ago weighs exactly the same as one from yesterday — there is no temporal dimension
| Feature | CLAUDE.md | Kronvex |
|---|---|---|
| Retrieval | Full file injected | Semantic search (cosine similarity) |
| Growth | Manual edits only | Auto-stores via API |
| Search | None | pgvector similarity + recency + frequency |
| Multi-agent | No | Yes |
| Multi-user | No | Yes, isolated per API key |
| Context cost | Always full file | Only relevant memories |
| Scale | 1 developer, 1 project | Unlimited agents, unlimited users |
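To put a rough number on the context-cost row, here is a back-of-the-envelope sketch. The 80 characters per line and 4 characters per token are assumptions (a common rule of thumb for English text, not a measurement from any real tokenizer), and the 600-token budget is simply the max_tokens value used in the inject-context example in this article.

```python
# Rough estimate only: real token counts depend on the tokenizer and the text.
CHARS_PER_TOKEN = 4      # assumption: common rule of thumb for English prose
AVG_CHARS_PER_LINE = 80  # assumption: a fairly dense markdown line


def estimated_tokens(lines: int) -> int:
    """Approximate tokens consumed by injecting `lines` lines of markdown."""
    return lines * AVG_CHARS_PER_LINE // CHARS_PER_TOKEN


full_file = estimated_tokens(300)  # whole CLAUDE.md, injected every session
retrieval_budget = 600             # max_tokens from the inject-context example

print(f"Full-file injection: ~{full_file} tokens per session")
print(f"Retrieval budget:     {retrieval_budget} tokens per session")
```

Under these assumptions the full file costs several times the retrieval budget on every session, whether or not any of it is relevant to the task at hand.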
3. The Obsidian approach — clever but limited
The Obsidian + Claude Code setups that circulate online are genuinely impressive. CLAUDE.md pointing to vault files, frontmatter for session metadata, /resume to load the last handoff, /compress to summarize at session end, /preserve to archive the important bits. Some developers have 10, 20, 30+ session files managed with discipline most teams don't apply to their actual codebase.
These systems take real engineering effort, but they're solving an infrastructure problem with a note-taking tool.
The hard ceiling these approaches hit:
- Requires local file access: the agent must have filesystem access to the vault — this is fine for a local dev tool, impossible for a cloud-deployed agent
- No multi-user: one vault serves one person, so you cannot ship it as a product where each user gets their own persistent memory
- Keyword-only search: Obsidian's native search is not semantic — "auth token" won't find a note about "JWT expiry" unless the words match exactly
- Brittle: handoff notes, numbered sessions, archive files — all require manual maintenance that breaks the moment you miss a step
- Works for "my personal workflow" — fails for "my product's users": if you're building an AI agent for others to use, your users can't maintain an Obsidian vault on your behalf
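The keyword limitation is easy to reproduce. The sketch below is a toy exact-word matcher, not Obsidian's actual search implementation; the notes and the `keyword_search` helper are hypothetical and exist only for illustration.

```python
import re

# Hypothetical vault notes, for illustration only
notes = {
    "session-12.md": "JWT expiry reduced to 15 minutes after the security review.",
    "session-19.md": "Switched the auth token storage to httpOnly cookies.",
}


def keyword_search(query: str, docs: dict[str, str]) -> list[str]:
    """Return names of notes that contain every word of the query (case-insensitive)."""
    words = re.findall(r"\w+", query.lower())
    hits = []
    for name, text in docs.items():
        tokens = set(re.findall(r"\w+", text.lower()))
        if all(w in tokens for w in words):
            hits.append(name)
    return hits


# The JWT note covers the same topic but shares no words with this query
print(keyword_search("auth token", notes))
```

The JWT note never surfaces for "auth token" because exact-word matching compares spelling, not meaning. Embedding-based retrieval is meant to close that gap.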
4. What real persistent memory looks like
Three API calls. No file system. No manual maintenance. Works for 1 user or 10,000.
import httpx

# Store after every meaningful interaction
httpx.post("https://api.kronvex.io/api/v1/agents/my-agent/remember",
           headers={"X-API-Key": "kv-your-key"},
           json={"content": "User prefers Python over Node. Always suggest async/await."})

# Recall before the LLM call
r = httpx.post("https://api.kronvex.io/api/v1/agents/my-agent/recall",
               headers={"X-API-Key": "kv-your-key"},
               json={"query": "language preferences", "top_k": 5})

# Or inject a formatted context block
ctx = httpx.post("https://api.kronvex.io/api/v1/agents/my-agent/inject-context",
                 headers={"X-API-Key": "kv-your-key"},
                 json={"query": "current session", "max_tokens": 600})
Under the hood, remember embeds your content with text-embedding-3-small and stores it in PostgreSQL + pgvector (EU-hosted). recall runs a cosine similarity search and ranks results by a composite confidence score:
confidence = similarity × 0.6 + recency × 0.2 + frequency × 0.2
Recency uses a sigmoid with a 30-day inflection point. Frequency is log-scaled by access count. This means the most relevant, most recently confirmed, and most frequently accessed memories surface first, while stale or rarely touched memories fade without being deleted. Your agent gets the right context, not just the most recent or the largest.
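As a rough illustration of how such a composite score behaves, here is a minimal sketch. Only the weights, the 30-day inflection point, and the use of a logarithm come from the description above; the sigmoid steepness, the frequency cap, and the function names are assumptions, not Kronvex's actual implementation.

```python
import math


def recency_score(age_days: float, inflection_days: float = 30.0,
                  steepness: float = 0.15) -> float:
    """Sigmoid decay: close to 1.0 for fresh memories, 0.5 at the inflection point.

    `steepness` is an assumed value; the article only specifies the inflection point.
    """
    exponent = min(steepness * (age_days - inflection_days), 700.0)  # avoid overflow in exp()
    return 1.0 / (1.0 + math.exp(exponent))


def frequency_score(access_count: int, cap: int = 100) -> float:
    """Log-scaled access count, normalized to [0, 1]. The cap of 100 is an assumption."""
    return min(1.0, math.log1p(access_count) / math.log1p(cap))


def confidence(similarity: float, age_days: float, access_count: int) -> float:
    """Composite score using the weights from the formula above."""
    return (similarity * 0.6
            + recency_score(age_days) * 0.2
            + frequency_score(access_count) * 0.2)


# A close match from six months ago vs. a slightly weaker match from yesterday
old = confidence(similarity=0.80, age_days=180, access_count=1)
new = confidence(similarity=0.75, age_days=1, access_count=1)
print(f"old: {old:.3f}  new: {new:.3f}")
```

With these assumed parameters the fresher memory outranks the older one despite a slightly lower similarity, which is the behavior the weighting is designed to produce. Similarity still dominates at 0.6, so a much closer match from long ago can still win.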
5. Setting up Kronvex for Claude Code (10 min)
Step 1: Get a free API key
Sign up at kronvex.io/auth. The demo tier gives you 1 agent and 100 memories, with no credit card required. Your key starts with kv-.
Step 2: Set up the MCP server or call the REST API directly
The fastest integration for Claude Code is via MCP; see the full MCP guide for a step-by-step walkthrough. You can also call the REST API directly from any tool, script, or post-session hook.
Step 3: Update your CLAUDE.md to bootstrap the workflow
Your CLAUDE.md becomes a lightweight bootstrap instruction instead of a growing knowledge dump:
# Memory
Before each session, call inject-context with the current task.
After each session, store key decisions with remember.
API key: kv-... (from env KRONVEX_API_KEY)
Agent ID: my-project
The free demo tier gives you 100 memories — enough to migrate your existing CLAUDE.md content and test the workflow end-to-end.
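A post-session hook could look like the sketch below. It assumes decisions are written one per line to a text file at session end; the `build_remember_request` and `store_decisions` helpers are hypothetical, and only the `remember` endpoint, the `X-API-Key` header, and the `KRONVEX_API_KEY` variable come from this article. The standard library's `urllib` is used so the script has no dependencies; `httpx` works just as well.

```python
import json
import os
import urllib.request

BASE_URL = "https://api.kronvex.io/api/v1/agents"


def build_remember_request(agent_id: str, content: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a POST to the remember endpoint."""
    return urllib.request.Request(
        url=f"{BASE_URL}/{agent_id}/remember",
        data=json.dumps({"content": content}).encode("utf-8"),
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def store_decisions(path: str, agent_id: str = "my-project") -> int:
    """Send each non-empty line of `path` as one memory; return how many were sent."""
    api_key = os.environ["KRONVEX_API_KEY"]  # read from the environment, never hard-coded
    sent = 0
    with open(path, encoding="utf-8") as f:
        for line in f:
            decision = line.strip()
            if not decision:
                continue
            req = build_remember_request(agent_id, decision, api_key)
            # urlopen raises HTTPError on a non-2xx response, so failures are not counted
            with urllib.request.urlopen(req, timeout=10):
                sent += 1
    return sent
```

Calling `store_decisions("decisions.txt")` from a session-end script (the file name is a placeholder) would store one memory per line. Keeping request construction separate from sending makes the payload easy to inspect before anything goes over the network.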
6. Migration guide: markdown → API
If you have an existing CLAUDE.md or Obsidian vault, here's a script that reads your markdown and bulk-imports it into Kronvex. Each header-delimited section becomes a separate memory — small enough to retrieve precisely, large enough to carry meaning.
import re

import httpx

# Parse your CLAUDE.md into logical chunks
with open("CLAUDE.md", encoding="utf-8") as f:
    content = f.read()

# Split on level 1-3 headers; skip fragments of 50 characters or fewer
chunks = [c.strip() for c in re.split(r'\n#{1,3} ', content) if len(c.strip()) > 50]

# Bulk import: one memory per chunk
for chunk in chunks:
    resp = httpx.post(
        "https://api.kronvex.io/api/v1/agents/my-project/remember",
        headers={"X-API-Key": "kv-your-key"},
        json={"content": chunk},
    )
    resp.raise_for_status()  # stop on the first failed request
    print(f"Stored: {chunk[:60]}...")
Run this once. Your CLAUDE.md can now shrink to 10 lines. The agent will retrieve the relevant sections semantically: only what matters for the current query, ranked by similarity, recency, and frequency. No more full-file injection, no more token waste.
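One caveat with splitting on `\n#{1,3} `: the split consumes the `#` markers, so most chunks lose their heading level. If you would rather keep each heading intact with its body, a variant like the following is one option. `split_markdown_sections` and `min_chars` are hypothetical names, and the 50-character threshold is carried over from the script above.

```python
import re


def split_markdown_sections(text: str, min_chars: int = 50) -> list[str]:
    """Split markdown on level 1-3 headings, keeping each heading with its body."""
    # Zero-width lookahead: split *before* a heading line instead of consuming it
    parts = re.split(r"(?m)^(?=#{1,3} )", text)
    return [p.strip() for p in parts if len(p.strip()) > min_chars]


sample = """# Stack
We use PostgreSQL with pgvector for all persistent storage needs.

## Conventions
All React components are functional, no class components allowed.

## Todo
tbd
"""

for section in split_markdown_sections(sample):
    print(repr(section[:40]))
```

The short "Todo" section is still filtered out by the length threshold, so lower `min_chars` if brief notes matter to you. Note that a line starting with `# ` inside a code sample in your markdown would also be treated as a heading by this simple pattern.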
7. What to store vs what not to store
Not everything deserves a persistent memory. The rule: store anything you'd have to repeat across sessions. Don't store things that belong to a single conversation.
✅ Good to store
- Architecture decisions: "We use PostgreSQL, not MySQL, decided March 2026". Record the what and the when, and add the why when you have it
- Coding conventions: "All React components are functional, no class components"
- User preferences: "User Baptiste prefers concise responses, no preamble"
- Recurring patterns: "Database migrations use Alembic, always test on staging first"
- Resolved bugs: "Auth token expires in 15min not 1h — fixed 2026-03-15"
❌ Don't store
- Ephemeral session context: current file contents, temporary variables — this belongs in conversation history
- Large raw code blocks: use RAG/vector search over your codebase for that, not memory entries
- Secrets, API keys, passwords: use environment variables, never memory
- Highly volatile data: anything that changes every session — store the pattern, not the instance
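The "never store secrets" rule is worth enforcing in code rather than by convention. The sketch below is a minimal pre-store check; the patterns are illustrative assumptions, not an exhaustive secret scanner, and `looks_like_secret` is a hypothetical helper to call before `remember`.

```python
import re

# Illustrative patterns only; a real deployment should use a dedicated secret scanner
SECRET_PATTERNS = [
    # Keys with the kv- prefix mentioned in this article
    re.compile(r"\bkv-[A-Za-z0-9_-]{8,}"),
    # key=value or key: value style assignments
    re.compile(r"(?i)\b(password|passwd|secret|api[_-]?key)\b\s*[:=]\s*\S+"),
    # PEM private key headers
    re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
]


def looks_like_secret(content: str) -> bool:
    """Return True if the text matches any of the assumed secret patterns."""
    return any(p.search(content) for p in SECRET_PATTERNS)


candidates = [
    "Database migrations use Alembic, always test on staging first",
    "API_KEY=sk-1234567890abcdef",
]
safe_to_store = [c for c in candidates if not looks_like_secret(c)]
print(safe_to_store)
```

A filter like this will miss some secrets and occasionally flag harmless text, so treat it as a safety net on top of the rule, not a replacement for it.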