The amnesia problem
You've built an AI agent. It's smart, fast, and well-prompted. But there's a problem you can't prompt your way out of: it forgets everything between sessions.
Every conversation starts cold. The user has to re-explain who they are, what they're working on, how they like to communicate. Your agent — no matter how capable — behaves like someone with no long-term memory. It's technically impressive and experientially broken.
Here's what this looks like in practice:

> User: Can we pick up where we left off on pricing?
> Agent: I'm sorry, I don't have any record of a previous conversation. Could you share some context?
This isn't a model quality problem. It's an infrastructure problem. The model is fine — it just has no access to anything that happened before.
Why stuffing context into every prompt doesn't scale
The obvious workaround: dump everything into the system prompt. Load the user's profile, past conversations, preferences, history. Let the LLM figure it out.
This works at prototype scale. It breaks in production for three reasons:
- Cost. At GPT-4 pricing, a 50k-token context per call across thousands of users gets expensive fast. You're paying to process the same information on every single request.
- Latency. Large contexts mean slower time-to-first-token. Users notice anything above 2–3 seconds. A 50k-token context won't help.
- Relevance. Not all history is relevant to the current message. Dumping everything forces the model to attend to noise, which degrades quality rather than improving it.
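To make the cost point concrete, here's a back-of-envelope sketch. The per-token price and usage figures are illustrative assumptions, not current list prices:

```python
# Cost of stuffing a 50k-token context into every single call.
# PRICE_PER_1K_INPUT_TOKENS is an assumed, illustrative rate.
PRICE_PER_1K_INPUT_TOKENS = 0.01   # $ per 1k input tokens (assumption)
CONTEXT_TOKENS = 50_000
CALLS_PER_USER_PER_DAY = 20
USERS = 1_000

cost_per_call = CONTEXT_TOKENS / 1000 * PRICE_PER_1K_INPUT_TOKENS
daily_cost = cost_per_call * CALLS_PER_USER_PER_DAY * USERS
print(f"${cost_per_call:.2f} per call, ${daily_cost:,.0f} per day")
# → $0.50 per call, $10,000 per day
```

Retrieving only the few relevant facts per request cuts that context, and the bill, by orders of magnitude.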
What memory-enabled agents actually do differently
A memory-enabled agent doesn't need the user to re-explain themselves. Here's the same interaction with Kronvex:

> User: Can we pick up where we left off on pricing?
> Agent: Of course. Last time you were weighing plans for your support team. The short version, in bullets: ...

The agent remembered the user is a CTO, in fintech, evaluating for their support team, and prefers concise answers. It didn't ask. It just knew.
The three memory types that matter
Not all memory is the same. Kronvex structures memory into three types that map to how humans actually remember things:
- Semantic memory — facts about the user or world that stay true over time. "User is a CTO at fintech." "Budget is €50k." "Prefers bullet points."
- Episodic memory — events and interactions. "On March 3 the user asked about pricing. On March 5 they requested a demo." This is your conversation history, structured.
- Procedural memory — how the user wants things done. "Always use numbered steps." "Don't ask follow-up questions." "Reply in French." These are persistent behavioral preferences.
Using the right type improves recall precision. If a user asks "what do I usually work on?", you want semantic. If they ask "what did we discuss last week?", you want episodic. If you want to auto-apply their formatting preferences at session start, fetch procedural.
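That routing can be sketched as a simple heuristic. The cue lists below are illustrative only, and passing the result as a `memory_type` filter on recall is an assumption worth confirming in the API reference:

```python
def memory_type_for(query: str) -> str:
    """Pick which memory type to query for a given user message.
    A naive keyword heuristic for illustration; production routing
    would typically let the LLM (or a small classifier) decide."""
    q = query.lower()
    if any(cue in q for cue in ("last week", "yesterday", "did we", "previously")):
        return "episodic"    # events and past interactions
    if any(cue in q for cue in ("always", "format", "style", "reply in")):
        return "procedural"  # persistent behavioural preferences
    return "semantic"        # default: stable facts about the user

memory_type_for("What did we discuss last week?")  # → "episodic"
memory_type_for("Always use numbered steps")       # → "procedural"
memory_type_for("What do I usually work on?")      # → "semantic"
```

The returned value could then be used to scope the recall call, assuming the endpoint accepts a type filter.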
How to add memory in 3 API calls
Here's a complete working example. This is all you need to get from amnesiac to context-aware.
```python
import requests

API_KEY = "kx_live_your_key"
BASE = "https://api.kronvex.io"
HEADERS = {"X-API-Key": API_KEY, "Content-Type": "application/json"}

# ── STEP 1: Create an agent (once, store the ID) ───────────
agent = requests.post(f"{BASE}/agents", headers=HEADERS,
                      json={"name": "support-bot"}).json()
agent_id = agent["id"]

# ── STEP 2: Store what you learn about the user ────────────
# After the first session, store key facts
user_id = "u_123"  # your application's own user identifier
requests.post(f"{BASE}/agents/{agent_id}/remember", headers=HEADERS, json={
    "content": "CTO at Series B fintech. Evaluating support tooling. Prefers concise answers.",
    "memory_type": "semantic",
    "session_id": f"user_{user_id}"
})

# ── STEP 3: Inject context before every LLM call ───────────
def chat(user_message):
    ctx = requests.post(f"{BASE}/agents/{agent_id}/inject-context",
                        headers=HEADERS,
                        json={"message": user_message, "top_k": 5}).json()
    system = ctx["context_block"] + "\n\nYou are a helpful assistant."
    # call_llm is your existing wrapper around your LLM provider
    return call_llm(system, user_message)
```
That's it. Three calls. Your agent now accumulates context across every session, per user, and retrieves only what's relevant at <80ms average latency.
What about RAG? Isn't that the same thing?
RAG (Retrieval-Augmented Generation) and agent memory solve different problems. RAG is about retrieving from a shared knowledge base — your docs, your product catalogue, your FAQ. It's the same for all users.
Agent memory is about per-user context — what this specific user told you, how they like to work, what happened in their past sessions. It's different for every user and grows over time.
Most production AI products need both. Kronvex handles the per-user memory layer. Your vector DB or search handles the shared knowledge base.
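Wiring the two layers together is mostly prompt assembly. A minimal sketch, where the per-user block would come from Kronvex's /inject-context and the shared passages from whatever RAG stack you already run (both inputs are stubbed here):

```python
def build_system_prompt(user_context: str, kb_passages: list[str]) -> str:
    """Merge per-user memory with shared knowledge into one system prompt.
    user_context: the context_block returned by /inject-context.
    kb_passages: passages retrieved from your own vector DB or search."""
    kb_block = "\n".join(f"- {p}" for p in kb_passages)
    return (
        "You are a helpful assistant.\n\n"
        f"What you know about this user:\n{user_context}\n\n"
        f"Relevant product knowledge:\n{kb_block}"
    )

prompt = build_system_prompt(
    "CTO at a Series B fintech. Prefers concise answers.",
    ["The Pro plan includes SSO.", "Support SLAs start at 4 hours."],
)
```

The two retrievals are independent, so they can run in parallel before the LLM call.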
The trust argument
Beyond the technical benefits, there's a product argument that often gets missed: memory is how users develop trust in an AI tool.
When a tool remembers you, it signals that your time and context matter. When it forgets you, it signals the opposite. Users form long-term relationships with tools that demonstrate continuity. They churn from tools that feel like they're talking to a stranger every time.
In B2B, this matters even more. A sales rep who has to re-brief an AI tool every morning won't use it for long. A support agent that starts every ticket cold will always feel less useful than a human who remembers the customer.
One endpoint worth highlighting is /inject-context. Instead of calling /recall and formatting results yourself, it returns a formatted string you drop directly into your system prompt. One call before every LLM request.

Real-world use cases
Customer support bots
A support bot with memory remembers previous tickets, the customer's subscription plan, recurring issues, and their preferred communication style. The result: faster resolution, less frustration, higher NPS. Store resolved tickets as episodic with a 180-day TTL, and account information as semantic with no expiration.
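As a sketch, the two kinds of writes might look like this. Note that `ttl_days` is a hypothetical name for the expiry parameter; verify the real field in the API reference before relying on it:

```python
def remember_payload(content: str, memory_type: str, user_id: str,
                     ttl_days=None) -> dict:
    """Build the JSON body for POST /agents/{id}/remember.
    ttl_days is an assumed expiry parameter (hypothetical name)."""
    payload = {
        "content": content,
        "memory_type": memory_type,
        "session_id": f"user_{user_id}",
    }
    if ttl_days is not None:
        payload["ttl_days"] = ttl_days
    return payload

# Resolved ticket: episodic, expires after 180 days
ticket = remember_payload("Ticket #4812 resolved: billing sync issue fixed.",
                          "episodic", "u_42", ttl_days=180)
# Account information: semantic, no expiry
plan = remember_payload("Customer is on the Enterprise plan.",
                        "semantic", "u_42")
```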
Personal AI assistants
A personal AI assistant compounds value on every interaction. It knows you prefer bullet-point summaries, that you're working on a specific project, and that you have an important meeting on Friday. Use semantic for long-lived preferences, episodic for recent tasks and events, and procedural for recurring workflows the assistant should automate.
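At session start, the assistant can pre-load procedural preferences before any user message arrives and fold them into the system prompt. A sketch of the formatting step, where the memories would come from a recall filtered to procedural (the filter parameter is an assumption):

```python
def preferences_block(procedural_memories: list[str]) -> str:
    """Format procedural memories into a standing-instructions block
    for the system prompt. Input: memory strings retrieved at
    session start (e.g. via a recall scoped to procedural memory)."""
    if not procedural_memories:
        return ""
    rules = "\n".join(f"- {m}" for m in procedural_memories)
    return f"Standing instructions from this user:\n{rules}"

block = preferences_block([
    "Always use numbered steps.",
    "Reply in French.",
])
```

Because procedural memory changes rarely, this block can be cached per user rather than re-fetched on every turn.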
AI sales agents
A sales agent with memory knows the full conversation history with each prospect, their past objections, which demos were already run, and where the deal is in the pipeline. It can pick up exactly where the last conversation ended, like an experienced human sales rep. Episodic memory tracks interactions; semantic memory encodes the prospect's profile and identified needs.
Memories are scoped to a session_id, so a user's data can be removed wholesale: delete everything stored under their session_id to implement the right to erasure.

Choosing the right memory architecture
Agent memory is not a binary problem. The right architecture combines several approaches depending on your needs:
- Short-term context — LLM context window (last N messages)
- Static knowledge — RAG on a document base (docs, FAQ, catalogue)
- Style and domain — base model fine-tuning
- Per-user personalised memory — dedicated memory API (Kronvex)
For the vast majority of production use cases in 2026 — support agents, personal assistants, AI salespeople — the winning combination is: a strong base LLM + persistent memory API. Fine-tuning and RAG come as complements, not replacements.
Memory is no longer optional. It is the baseline infrastructure for any AI agent that aims to be genuinely useful over time.
Getting started
Kronvex runs on three endpoints: /remember, /recall, /inject-context. One API key. EU-hosted. Free demo plan available.
The full API reference is at docs.kronvex.io. Or start with the use cases page to see how teams in sales, support, onboarding and dev tooling are implementing this today.
Your agents are capable enough. Give them a memory worth keeping.
Give your agent memory today
Free demo account. No credit card. Three endpoints.