Performance Benchmarks

Built for speed.
Proven in production.

Real numbers from Kronvex's pgvector infrastructure — latency, precision, and throughput at scale.

pgvector HNSW indexing · EU Frankfurt · Supabase eu-central-1 · 1536-dim OpenAI embeddings · Cosine similarity search

Recall accuracy

Measured on a public dataset of 200 query/memory pairs across 4 types.

Recall@1: 87% (top result is correct)
Recall@3: 94% (correct in top 3)
P50 latency: 38ms (median recall)
P95 latency: 52ms (95th percentile)

Last run: 2026-04-06 · Dataset: 200 pairs · Open source methodology
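
As a rough illustration of what Recall@1 and Recall@3 measure, the sketch below scores ranked results against one expected memory per query. The function name `recall_at_k` and the toy ids are assumptions for illustration only; they are not taken from the published dataset or methodology.

```python
from typing import Sequence


def recall_at_k(
    ranked_ids: Sequence[Sequence[str]],
    expected_ids: Sequence[str],
    k: int,
) -> float:
    """Fraction of queries whose expected memory appears in the top-k results."""
    if len(ranked_ids) != len(expected_ids):
        raise ValueError("one ranked list is required per expected id")
    if not expected_ids:
        raise ValueError("at least one query is required")
    hits = sum(
        1
        for ranked, expected in zip(ranked_ids, expected_ids)
        if expected in ranked[:k]
    )
    return hits / len(expected_ids)


# Toy data (hypothetical ids): three queries, one expected memory each.
ranked = [["m1", "m7", "m3"], ["m4", "m2", "m9"], ["m5", "m6", "m8"]]
expected = ["m1", "m2", "m0"]

print(recall_at_k(ranked, expected, 1))  # 1 of 3 queries hit at rank 1
print(recall_at_k(ranked, expected, 3))  # 2 of 3 queries hit within the top 3
```

On a 200-pair dataset, the same calculation would be run once with k=1 and once with k=3.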

Recall p50: <45ms (top-5 memories)
inject_context p50: <55ms (full context prompt)
remember p50: <120ms (write + embed)
Embedding dims: 1536 (text-embedding-3-small)

End-to-end latency by operation

Measured from HTTP request received at Railway (EU West) to response sent. Includes Supabase query round-trip and embedding lookup where applicable. p50 / p95 / p99 over rolling 7-day window.

| Operation | p50 (median) | p95 | p99 | Notes |
|---|---|---|---|---|
| POST /recall (top-5 memories, cosine similarity) | <45ms | <140ms | <280ms | HNSW index scan + confidence scoring |
| POST /inject-context (recall + prompt assembly) | <55ms | <160ms | <320ms | Includes system prompt injection overhead |
| POST /remember (write + embed + store) | <120ms | <380ms | <700ms | OpenAI embedding call is the dominant cost |
| GET /agents (list all agents for API key) | <30ms | <90ms | <180ms | Simple indexed lookup, no embedding |

Supabase eu-central-1 (Frankfurt) · pgvector HNSW · m=16 · ef_construction=64 · 1536-dim text-embedding-3-small · Railway EU West · async Python
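
For readers who want to reproduce p50 / p95 / p99 figures from their own samples, here is a minimal sketch using the nearest-rank percentile definition. The choice of nearest-rank is an assumption; the page does not say which percentile method Kronvex uses, and the sample data below is synthetic.

```python
import math
from typing import Dict, Sequence


def percentile(samples_ms: Sequence[float], pct: float) -> float:
    """Nearest-rank percentile: the smallest sample with at least pct% of samples <= it."""
    if not samples_ms:
        raise ValueError("no samples")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples_ms)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


def summarize(samples_ms: Sequence[float]) -> Dict[str, float]:
    """Return the three percentiles reported in the latency table."""
    return {
        name: percentile(samples_ms, pct)
        for name, pct in (("p50", 50), ("p95", 95), ("p99", 99))
    }


# Synthetic samples: 1ms, 2ms, ..., 100ms (not real measurements).
samples = [float(v) for v in range(1, 101)]
print(summarize(samples))
```

Other percentile definitions (for example linear interpolation) give slightly different values on small sample sets, so comparisons should use the same method on both sides.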

Confidence scoring — not just similarity

Raw cosine similarity misses two signals that matter for agents: how recent a memory is, and how often it's been accessed. Kronvex combines all three into a single composite score.

confidence = similarity × 0.6 + recency × 0.2 + frequency × 0.2

Similarity: cosine similarity in 1536-dim space (weight: 60%)
Recency: sigmoid decay with 30-day inflection point (weight: 20%)
Frequency: log-scaled access count from agent history (weight: 20%)

1. "User prefers annual billing and pays by SEPA direct debit."
   Similarity 0.91 · Recency (2d ago) 0.96 · Frequency (12×) 0.82 · Confidence 0.902
2. "User asked about invoice history on March 12th."
   Similarity 0.78 · Recency (18d ago) 0.62 · Frequency (3×) 0.48 · Confidence 0.688
3. "User mentioned they work in finance during onboarding."
   Similarity 0.71 · Recency (45d ago) 0.38 · Frequency (1×) 0.20 · Confidence 0.542

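Memory #1 ranks first: it has the highest similarity and is also the most recent and most frequently accessed. The composite score matters most when the signals disagree: a memory with a slightly higher semantic match can still rank lower if it is old and rarely accessed.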
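
The weighted sum can be checked by hand or with a few lines of code. The sketch below evaluates the formula above on the three example memories, then on a hypothetical pair where the signals disagree. The `Signals` class, the labels, and the disagreement pair are illustrative assumptions; only the weights and the three component triples come from this page.

```python
from dataclasses import dataclass

# Weights as stated in the formula: 60% similarity, 20% recency, 20% frequency.
WEIGHTS = {"similarity": 0.6, "recency": 0.2, "frequency": 0.2}


@dataclass(frozen=True)
class Signals:
    similarity: float  # cosine similarity, 0..1
    recency: float     # already-normalized recency score, 0..1
    frequency: float   # already-normalized frequency score, 0..1


def confidence(s: Signals) -> float:
    """Composite score: weighted sum of the three normalized signals."""
    return (
        WEIGHTS["similarity"] * s.similarity
        + WEIGHTS["recency"] * s.recency
        + WEIGHTS["frequency"] * s.frequency
    )


# Component values from the three example memories.
memories = {
    "annual billing / SEPA": Signals(0.91, 0.96, 0.82),
    "invoice history": Signals(0.78, 0.62, 0.48),
    "works in finance": Signals(0.71, 0.38, 0.20),
}
ranked = sorted(memories.items(), key=lambda kv: confidence(kv[1]), reverse=True)
for label, signals in ranked:
    print(f"{label}: {confidence(signals):.3f}")

# Hypothetical pair where signals disagree: "stale" has the higher similarity
# but loses to "fresh" once recency and frequency are weighed in.
fresh = Signals(similarity=0.80, recency=0.95, frequency=0.80)
stale = Signals(similarity=0.85, recency=0.30, frequency=0.10)
print(confidence(fresh) > confidence(stale))
```

How recency and frequency are normalized before they enter the sum (the sigmoid and log scaling) is not specified beyond the descriptions above, so the sketch takes them as inputs.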

Three memory types, one API

Most memory APIs store a single type of fact. Kronvex handles semantic, episodic, and procedural memory with the same three endpoints: /remember, /recall, and /inject-context.

| Memory type | Examples | Kronvex | Mem0 | Zep |
|---|---|---|---|---|
| Semantic | Facts, preferences, entity attributes | ✓ Native | ✓ Native | ✓ Native |
| Episodic | Past events, sessions, conversation history | ✓ Native | ~ Via SDK | ✓ Native |
| Procedural | Workflows, rules, operating instructions | ✓ Native | ✗ Not supported | ~ Manual tagging |
| Conflict handling | Detect and resolve contradictory memories | ✓ Confidence-based | ~ Graph dedup | ✗ Manual |
| GDPR erasure | DELETE /agents/{id} wipes all memories | ✓ One API call | ~ Enterprise tier | ~ Enterprise tier |

Kronvex vs Mem0 vs Zep

Factual comparison based on publicly available documentation (April 2026). Always verify current pricing directly with each vendor.

| Feature | Kronvex | Mem0 | Zep |
|---|---|---|---|
| Recall latency (p50) | <45ms | ~50–100ms | ~40–120ms |
| Memory types | ✓ Semantic + Episodic + Procedural | Semantic + Episodic | Semantic + Episodic |
| EU data hosting | ✓ Frankfurt (default, always) | ✗ US by default | ~ Self-host or Enterprise |
| API surface | REST API + Python & Node SDK | ~12 SDK methods | REST + GraphQL |
| Pricing model | Flat monthly plans from €29 | Usage tiers + overages | Usage-based + self-host |
| Conflict handling | ✓ Confidence-based ranking | ~ Graph deduplication | ✗ Manual resolution |
| GDPR erasure API | ✓ DELETE /agents/{id} | ~ Enterprise tier | ~ Enterprise / self-host |
| SDK required | ✓ No, plain HTTP | ✗ SDK strongly recommended | ~ SDK or REST |
| Embedding model | text-embedding-3-small (1536d) | Configurable | Configurable |

What makes it fast

Every architectural decision was made with latency and data sovereignty in mind.

DATABASE

Supabase eu-central-1

PostgreSQL 15 hosted in Frankfurt, Germany. Memory data never leaves the EU. An async connection pool (asyncpg) keeps throughput high.

INDEX

pgvector HNSW

Hierarchical Navigable Small World indexing over 1536-dimensional embeddings. Approximate nearest-neighbor search with sub-50ms recall at production scale.
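
To make the index settings concrete, the sketch below builds the kind of SQL that pgvector accepts for an HNSW index with m=16 and ef_construction=64, plus a top-k cosine query. The table name `memories`, the columns `embedding`, `content`, and `agent_id`, and the helper names are hypothetical; Kronvex's actual schema is not published on this page. The `$1` / `$2` placeholders follow asyncpg's parameter style.

```python
TABLE = "memories"    # hypothetical table name
COLUMN = "embedding"  # hypothetical vector(1536) column


def hnsw_index_sql(m: int = 16, ef_construction: int = 64) -> str:
    """CREATE INDEX statement for a pgvector HNSW index using cosine distance."""
    return (
        f"CREATE INDEX IF NOT EXISTS {TABLE}_{COLUMN}_hnsw "
        f"ON {TABLE} USING hnsw ({COLUMN} vector_cosine_ops) "
        f"WITH (m = {int(m)}, ef_construction = {int(ef_construction)});"
    )


def top_k_sql(k: int = 5) -> str:
    """Top-k query. <=> is pgvector's cosine distance; similarity = 1 - distance."""
    return (
        f"SELECT id, content, 1 - ({COLUMN} <=> $1) AS similarity "
        f"FROM {TABLE} WHERE agent_id = $2 "
        f"ORDER BY {COLUMN} <=> $1 LIMIT {int(k)};"
    )


print(hnsw_index_sql())
print(top_k_sql())
```

Ordering by the distance expression itself, rather than by the computed similarity, is what lets PostgreSQL use the HNSW index for the scan.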

EMBEDDINGS

OpenAI text-embedding-3-small

1536-dimensional dense embeddings via OpenAI's most cost-efficient embedding model. Semantic search quality that outperforms keyword matching for agent memory.

RUNTIME

FastAPI on Railway EU

Async Python with uvicorn, deployed on Railway's EU West region. SQLAlchemy async sessions with connection pooling. Cold starts under 2s on container wake.

CDN

Cloudflare Workers Sites

Frontend and static assets served via Cloudflare's global edge network. API responses are not cached — every recall is live from the database, ensuring freshness.

AUTH

SHA-256 key hashing

API keys are SHA-256 hashed at rest — only the hash is stored in the database. Keys are verified in constant time to prevent timing attacks. Supabase handles user auth JWTs.
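
A minimal sketch of this pattern using only the Python standard library: hash the key with SHA-256 before storing it, then compare hashes with a constant-time function. The function names and the example key are assumptions for illustration and do not describe Kronvex's actual code.

```python
import hashlib
import hmac


def hash_api_key(api_key: str) -> str:
    """Return the hex SHA-256 digest that would be stored instead of the raw key."""
    return hashlib.sha256(api_key.encode("utf-8")).hexdigest()


def verify_api_key(presented_key: str, stored_hash: str) -> bool:
    """Hash the presented key and compare in constant time."""
    return hmac.compare_digest(hash_api_key(presented_key), stored_hash)


stored = hash_api_key("kv-example-key")  # hypothetical key
print(verify_api_key("kv-example-key", stored))  # True
print(verify_api_key("kv-wrong-key", stored))    # False
```

`hmac.compare_digest` takes the same time whether the digests differ in the first character or the last, which is the property the timing-attack note refers to.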

Don't take our word for it

Get a free demo key and measure recall latency yourself with curl. First memory in under 5 minutes.

recall-benchmark.sh
# 1. Store a memory
curl -X POST https://api.kronvex.io/api/v1/agents/YOUR_AGENT_ID/remember \
  -H "X-API-Key: kv-YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"content": "User prefers dark mode and pays by credit card."}'

# 2. Recall — measure latency with curl's built-in timer
curl -X POST https://api.kronvex.io/api/v1/agents/YOUR_AGENT_ID/recall \
  -H "X-API-Key: kv-YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"query": "payment preferences", "top_k": 5}' \
  -w "\n\nTotal: %{time_total}s — Connect: %{time_connect}s\n"

# 3. inject_context — returns ready-to-use system prompt
curl -X POST https://api.kronvex.io/api/v1/agents/YOUR_AGENT_ID/inject-context \
  -H "X-API-Key: kv-YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"query": "payment preferences", "top_k": 5}'
EXPECTED RESULTS

Recall and inject-context typically return in 20–60ms from EU networks. /remember calls are slower (~120ms median) because each write calls the OpenAI embedding API, and subsequent writes follow the same latency profile. Results vary with network proximity to Frankfurt.
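
A single curl timing is noisy, so a median over several runs is usually more informative. The sketch below mirrors the /recall curl call with Python's standard library and times repeated calls. The helper names (`build_recall_request`, `send_once`, `median_latency_ms`) are illustrative assumptions; the URL, headers, and JSON body are copied from the script above.

```python
import json
import statistics
import time
import urllib.request
from typing import Callable

BASE_URL = "https://api.kronvex.io/api/v1"  # taken from the curl examples


def build_recall_request(
    agent_id: str, api_key: str, query: str, top_k: int = 5
) -> urllib.request.Request:
    """Build the same POST /recall request the curl example sends."""
    body = json.dumps({"query": query, "top_k": top_k}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/agents/{agent_id}/recall",
        data=body,
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def send_once(request: urllib.request.Request) -> bytes:
    """Send the request and read the full response body."""
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.read()


def time_call_ms(call: Callable[[], object]) -> float:
    """Wall-clock duration of one call, in milliseconds."""
    start = time.perf_counter()
    call()
    return (time.perf_counter() - start) * 1000.0


def median_latency_ms(call: Callable[[], object], runs: int = 20) -> float:
    """Median (p50) duration over several runs."""
    return statistics.median(time_call_ms(call) for _ in range(runs))
```

With a real agent id and key, `median_latency_ms(lambda: send_once(build_recall_request("YOUR_AGENT_ID", "kv-YOUR_KEY", "payment preferences")))` returns a client-side p50. Because `urllib` opens a new connection per call, this figure includes connect and TLS time, like curl's `time_total`, and will be higher than the server-side numbers in the latency table.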

Get started

Run your own benchmark in 5 minutes

Free demo key — no credit card, no SDK required. 1 agent, 100 memories, full API access.

Questions? hello@kronvex.io  ·  EU-hosted by default  ·  GDPR-native
