What happened to Zep?
Zep was one of the most popular open-source memory layers for AI agents, accumulating over 24,000 GitHub stars and becoming the go-to solution for developers building agents that needed to remember past conversations. It was self-hostable, open-source, and free — which made it extremely attractive for indie developers and startups.
In early 2026, that changed. The Zep team deprecated their Community Edition, pushing all users toward Zep Cloud. Their GitHub now reads: "Zep Community Edition is no longer actively maintained." The pricing on Zep Cloud starts at $25/month for basic usage and jumps to $475/month for production-grade access with higher throughput and SLA guarantees.
For teams that built their memory infrastructure on the free self-hosted edition, this is a forced migration. And given the switching cost involved in ripping out a foundational component like a memory layer, it's worth evaluating every viable alternative before committing.
If you're on Zep Community Edition: The repository is still accessible and you can continue running your existing deployment, but you won't receive security patches, bug fixes, or compatibility updates as LLM APIs evolve. Plan your migration within 6–12 months.
What you need from a memory layer
Before choosing an alternative, it's worth being explicit about what "memory for AI agents" actually requires. Not all memory solutions are equivalent, and the trade-offs matter depending on your use case.
- Semantic search — retrieve memories by meaning, not just keyword matching. A user who said "I hate writing boilerplate" and one who said "I prefer generated scaffolding" are expressing the same preference; your memory layer should surface both for the query "coding preferences."
- Multi-session persistence — memories must survive across conversations, browser sessions, and deployments. This rules out in-memory solutions and requires a durable data store.
- Low latency — recall should complete in under 100ms end-to-end. Memory retrieval happens on every agent turn; any latency here compounds across a full conversation.
- GDPR compliance — if you're serving EU users, your memory layer is processing personal data under GDPR. EU data residency, right to erasure, and data portability are mandatory, not optional.
- Multi-tenant architecture — if you're building a B2B product where each of your customers has their own users, you need strict memory isolation between tenants. Cross-tenant memory leaks are a critical security issue.
- Cost efficiency — some memory solutions run LLM calls on every write (to extract and consolidate memories). This can add significant hidden cost at scale. Know what you're paying per operation.
The Zep CE gap: Zep Community Edition met most of these requirements at zero cost. Any replacement needs to either match that bar or offer a clear trade-off that justifies the change.
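Taken together, the requirements above imply a fairly small interface. Here is a minimal sketch of that contract (the names and the toy keyword-matching implementation are illustrative, not any vendor's API — a real layer would use embeddings for semantic search):

```python
from typing import Protocol, runtime_checkable

@runtime_checkable
class MemoryLayer(Protocol):
    """Hypothetical interface distilled from the requirements above."""
    def remember(self, tenant_id: str, text: str) -> None: ...
    def recall(self, tenant_id: str, query: str) -> list[str]: ...
    def forget(self, tenant_id: str) -> None: ...  # GDPR right to erasure

class ToyMemory:
    """In-memory stand-in; keyword overlap instead of real semantic search."""
    def __init__(self) -> None:
        self._store: dict[str, list[str]] = {}

    def remember(self, tenant_id: str, text: str) -> None:
        # Durable storage in production; a dict is enough for the sketch.
        self._store.setdefault(tenant_id, []).append(text)

    def recall(self, tenant_id: str, query: str) -> list[str]:
        words = set(query.lower().split())
        # Tenant isolation: only this tenant's memories are ever searched.
        return [m for m in self._store.get(tenant_id, [])
                if words & set(m.lower().split())]

    def forget(self, tenant_id: str) -> None:
        # Right to erasure: drop everything for one tenant.
        self._store.pop(tenant_id, None)
```

Anything that satisfies this contract — hosted API, framework, or DIY table — can back an agent; the rest of the comparison is about how each option meets the latency, residency, and cost requirements behind it.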
The alternatives compared
Option 1: Kronvex (recommended for EU B2B)
Kronvex is a hosted persistent memory API built specifically for B2B AI agents. It exposes three endpoints — remember, recall, and inject-context — and handles embedding, storage, and retrieval internally. No infrastructure to manage.
- EU-hosted on Supabase Frankfurt — GDPR-native from day one
- pgvector cosine similarity at recall time, no LLM in the retrieval path
- Confidence scoring: `similarity × 0.6 + recency × 0.2 + frequency × 0.2` — recent and frequently-accessed memories surface higher
- Multi-tenant by design — each agent is scoped to a key, memories are isolated by agent ID
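The confidence blend is simple enough to sketch directly. Note that how Kronvex normalizes recency and frequency internally is not documented here, so treat this as an illustrative reconstruction of the quoted weights, with all three inputs assumed to be scaled to [0, 1]:

```python
def confidence(similarity: float, recency: float, frequency: float) -> float:
    """Weighted blend per the quoted formula; inputs assumed in [0, 1].

    similarity: cosine similarity of the query and the stored memory
    recency:    1.0 for just-written memories, decaying toward 0.0
    frequency:  how often the memory has been recalled, normalized
    """
    return similarity * 0.6 + recency * 0.2 + frequency * 0.2
```

The 0.6 weight on similarity keeps relevance dominant: a stale but highly relevant memory still outranks a fresh, frequently-accessed one that barely matches the query.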
- Free tier: 1 agent, 100 memories — enough to prototype
- Builder: €29/month — 20,000 memories, 5 agents
Migration from Zep is a direct swap. The concepts map cleanly: a Zep `session_id` becomes a Kronvex agent ID, `memory.add()` becomes `agent.remember()`, and `memory.search()` becomes `agent.recall()`.
```python
# BEFORE: Zep
from zep_python import ZepClient, Memory, Message

client = ZepClient(api_key="YOUR_ZEP_KEY")
client.memory.add(
    session_id="session_123",
    memory=Memory(messages=[Message(role="user", content="I prefer Python")]),
)
results = client.memory.search("session_123", "language preference")
```

```python
# AFTER: Kronvex
from kronvex import Kronvex

client = Kronvex(api_key="kv-YOUR_KEY")
agent = client.agent("user_123")
agent.remember("user prefers Python")
results = agent.recall("language preference")
# Results include confidence scores (similarity + recency + frequency)
```
```javascript
// BEFORE: Zep
import { ZepClient } from "@getzep/zep-js";

const zep = new ZepClient({ apiKey: "YOUR_ZEP_KEY" });
await zep.memory.add("session_123", {
  messages: [{ role: "user", content: "I prefer Python" }],
});
```

```javascript
// AFTER: Kronvex
import { Kronvex } from 'kronvex';

const client = new Kronvex('kv-YOUR_KEY');
const agent = client.agent('user_123');
await agent.remember('user prefers Python');
const results = await agent.recall('language preference');
```
Best for: EU B2B companies, multi-tenant SaaS architectures, developers who want a simple hosted API with no infrastructure overhead.
Option 2: Mem0
Mem0 is the largest player in the dedicated AI memory space, with over 50,000 GitHub stars and a $24M Series A behind it. It has an API-first approach similar to Kronvex and a large developer community.
- API-first, SDKs for Python and Node.js
- Intelligent memory consolidation — LLM runs on every write to extract and deduplicate memories
- US-hosted; no EU data residency option currently available
- Pricing requires login to view (a common friction point for evaluation)
- LLM-in-the-write-path adds cost per memory stored, not just per API call
GDPR note: If you're building for EU users, Mem0's US-only hosting creates a data transfer compliance issue under GDPR Chapter V. Standard Contractual Clauses can mitigate this, but it adds legal overhead that EU-hosted alternatives avoid entirely.
Best for: US-based projects, developers already deep in the OpenAI ecosystem, teams that prioritize memory quality over cost predictability at the write level.
Option 3: Letta (formerly MemGPT)
Letta started as a research project (MemGPT) and has grown into a full stateful agent framework, not just a memory layer. It has 21,000+ GitHub stars and strong roots in the academic AI agent community.
- Full stateful agent framework — you get a whole agent runtime, not just memory
- Self-hosted via Docker — full data sovereignty, but requires infrastructure management
- Significantly more complex to set up and operate than a pure memory API
- Not designed for multi-tenant architectures out of the box
- Active development, research-quality features (memory editing, archival memory)
Best for: Research projects, developers who want full control over the agent runtime and don't mind operational complexity, teams experimenting with advanced memory architectures.
Option 4: DIY with pgvector
Rolling your own memory layer on PostgreSQL + pgvector is entirely viable if you have the time and infrastructure expertise. pgvector is a mature PostgreSQL extension for vector similarity search, and OpenAI's embedding API (or a local model) provides the embedding pipeline.
- Full ownership and control — no vendor lock-in
- Self-hosted, so data residency is wherever you deploy
- Roughly 2–3 weeks to build properly: embedding pipeline, similarity search, confidence scoring, session management, expiry logic, multi-tenant isolation
- Ongoing maintenance burden: pgvector updates, embedding model changes, index tuning
- No managed service SLA — you own the uptime
Best for: Teams with dedicated infra engineers who want full ownership, organizations with strict data residency requirements that rule out all third-party APIs, projects where the memory layer is a core differentiator worth owning.
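For a sense of the core of the DIY option, here is a sketch of the schema and recall query such a layer might use, plus a pure-Python cosine check. The table and column names are illustrative; pgvector's `<=>` operator is cosine distance, so `1 - (a <=> b)` is cosine similarity:

```python
import math

# Schema a DIY layer might use (names are illustrative, not prescriptive):
SCHEMA = """
CREATE EXTENSION IF NOT EXISTS vector;
CREATE TABLE memories (
    id        bigserial PRIMARY KEY,
    tenant_id text NOT NULL,
    content   text NOT NULL,
    embedding vector(1536)  -- text-embedding-3-small dimension
);
CREATE INDEX ON memories USING hnsw (embedding vector_cosine_ops);
"""

# Recall: nearest neighbors by cosine distance, scoped to one tenant.
RECALL_SQL = """
SELECT content, 1 - (embedding <=> %(query)s::vector) AS similarity
FROM memories
WHERE tenant_id = %(tenant)s  -- multi-tenant isolation lives in the WHERE clause
ORDER BY embedding <=> %(query)s::vector
LIMIT %(k)s;
"""

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Pure-Python equivalent of 1 - (a <=> b), for sanity-checking results."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm
```

The two SQL snippets are most of the storage layer; the 2-3 weeks of build time quoted above goes into everything around them: the embedding pipeline, scoring, expiry, and operational hardening.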
Migration checklist: Zep → Kronvex
If you've decided to migrate from Zep Community Edition to Kronvex, here's the complete migration checklist. The API surface area maps cleanly, so most migrations complete in a few hours of work.
- Sign up at kronvex.io and retrieve your API key (`kv-...`)
- Install the SDK: `pip install kronvex` or `npm install kronvex`
- Map your Zep `session_id` values to Kronvex agent IDs (typically your user IDs)
- Replace `memory.add()` calls with `agent.remember(text)`
- Replace `memory.search()` calls with `agent.recall(query)`
- Update context injection: use `agent.inject_context()` instead of manually passing raw history
- Test on your staging environment with a representative set of queries
- Export your Zep memories before shutting down your CE instance (Zep provides a data export feature in the dashboard)
Memory export from Zep CE: Before decommissioning your Zep instance, use the `/api/v2/sessions/{sessionId}/memory` endpoint to export all memories per session. Kronvex's `remember()` endpoint accepts plain text, so you can replay your exported memories directly during migration.
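The export-then-replay step can be sketched as a small transform. The payload shape below (`{"messages": [{"role": ..., "content": ...}]}`) is an assumption about what the CE export returns, and the commented replay loop assumes the Kronvex client from earlier — adjust both to your actual data:

```python
def messages_to_memories(export: dict) -> list[str]:
    """Flatten an exported Zep session into plain-text strings for remember().

    ASSUMPTION: the export looks like {"messages": [{"role": ..., "content": ...}]};
    verify against what your CE instance actually returns before migrating.
    """
    return [
        f'{m["role"]}: {m["content"]}'
        for m in export.get("messages", [])
        if m.get("content")  # skip empty messages
    ]

# Replay loop (hypothetical; network calls omitted from this sketch):
# for session_id, export in exports.items():
#     agent = client.agent(session_id)          # session_id -> agent ID mapping
#     for memory in messages_to_memories(export):
#         agent.remember(memory)
```

Replaying per session preserves the `session_id` → agent ID mapping from the checklist, so tenant isolation carries over unchanged.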
Performance comparison
Architecture differences translate directly into latency and cost profiles. Here's how the main options compare across the metrics that matter most at production scale.
| Metric | Kronvex | Zep Cloud | Mem0 | DIY pgvector |
|---|---|---|---|---|
| Recall latency (p50) | ~45ms | ~80ms | ~60ms | ~30ms (self-hosted) |
| Write latency (p50) | ~200ms | ~300ms | ~400ms | ~150ms |
| Cost per 100K recalls | Included in plan | Included in plan | Included in plan | ~$0 (infra cost) |
| Cost per 100K writes | Included in plan | Included in plan | ~$200 (LLM calls) | ~$20 (embeddings only) |
| EU data residency | ✓ | ✗ | ✗ | ✓ (self-hosted) |
| Zero infrastructure | ✓ | ✓ | ✓ | ✗ |
| GDPR-ready out of the box | ✓ | ✗ | ✗ | DIY required |
Note: latencies are estimates based on public benchmarks and architecture analysis. Actual numbers depend on network topology, payload size, and plan tier. Self-hosted DIY latency assumes co-located deployment.
The most significant cost difference is at the write level. Mem0 runs an LLM on every write to extract structured memories from raw text. This produces higher-quality memory consolidation, but at $0.002 per write (approximate), 100,000 writes cost $200 in LLM API fees on top of the platform subscription. Kronvex uses OpenAI embeddings only (`text-embedding-3-small`) and amortizes that cost into the flat plan pricing, so your write cost is fully predictable.
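The arithmetic is easy to check. The per-write figures below come from the comparison above ($0.002/write for LLM-on-write; ~$0.0002/write derived from the table's ~$20 per 100K embedding-only estimate):

```python
def write_cost(writes: int, cost_per_write: float) -> float:
    """Variable write cost on top of any flat subscription fee."""
    return writes * cost_per_write

# Approximate per-write figures from the comparison above:
llm_on_write = write_cost(100_000, 0.002)      # LLM extraction on every write
embeddings_only = write_cost(100_000, 0.0002)  # embedding call only (DIY estimate)
```

At 100K writes/month that is roughly a 10x gap in variable cost, which is why the write path, not the recall path, dominates the cost comparison at scale.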
Conclusion: which one should you pick?
The right answer depends on three variables: where your users are, how much operational complexity you can absorb, and whether cost predictability matters to your business.
- EU B2B product, simple API preference → Kronvex. EU data residency is built in, GDPR compliance is handled at the infrastructure level, and the API surface is minimal enough to migrate in an afternoon. The flat pricing model means no surprise bills as you scale writes.
- US-based project, maximum ecosystem maturity → Mem0. Largest community, most integrations, best documentation. The LLM-on-write approach produces genuinely high-quality memory consolidation if write volume is manageable.
- Research project, want full runtime control → Letta. If you need to inspect and edit individual memories, experiment with archival memory hierarchies, or build a custom agent loop, Letta gives you the most control.
- In-house infra team, want zero vendor dependency → DIY pgvector. Takes 2–3 weeks to build properly and requires ongoing maintenance, but the result is a memory layer you fully own with no external API dependencies.
The Zep deprecation is disruptive, but the alternatives have matured significantly. If you were self-hosting Zep CE primarily to avoid paying $475/month for Zep Cloud, the good news is that the competitive options are considerably more affordable — and in Kronvex's case, EU-compliant from the first line of code.