Chatbot memory: how to make your bot remember users
Most chatbots suffer from amnesia. Every session starts at zero — no knowledge of past issues, no remembered preferences, no continuity. This guide covers why bots forget, what they should remember, and how to implement persistent chatbot memory that works across sessions, devices, and time.
Why chatbots forget
HTTP is stateless by design. Every request your chatbot receives is, from the server's perspective, the first request it has ever received. Session state has to be explicitly managed — and most chat implementations simply don't bother.
There are two compounding problems:
- Context window limits: Even if you pass conversation history to your LLM, context windows have a ceiling. A user's 6-month interaction history doesn't fit in 128k tokens — and filling the context with raw history is expensive and noisy.
- No cross-session persistence: Browser sessions end. Tabs close. Users return days later. In-memory state is gone. localStorage is device-specific and easily cleared.
The result: users have to re-explain themselves every time. "As I mentioned last time..." is the sentence your chatbot should never receive.
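To make the context-limit point concrete, here is a back-of-envelope sketch. The 4-characters-per-token ratio is a rough heuristic and the usage figures are purely illustrative:

```python
# Back-of-envelope check: months of raw history overflow a 128k-token window.
# ~4 characters per token is a rough heuristic; real tokenizers vary.

def estimate_tokens(char_count: int) -> int:
    return char_count // 4

# Hypothetical heavy user: 150 sessions x 25 messages x ~400 characters each
history_chars = 150 * 25 * 400          # 1,500,000 characters
history_tokens = estimate_tokens(history_chars)

CONTEXT_LIMIT = 128_000
print(history_tokens)                   # 375000
print(history_tokens > CONTEXT_LIMIT)   # True: raw history cannot fit
```

This is why selective storage and retrieval beat replaying full transcripts: you want the few memories relevant to the current message, not everything the user ever said.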
The four types of information bots should remember
Not all information ages at the same rate or has the same value. Effective chatbot memory distinguishes between:
1. User preferences
Communication style (formal/informal), preferred formats (bullet points vs. prose), language, timezone, notification preferences. These are long-term semantic facts. Store them with memory_type="semantic" and no TTL — they rarely change.
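As a sketch of what storing preferences can look like, the helper below uses a `remember` call shaped like the one in the full Kronvex example later in this article; treat the exact signature as an assumption:

```python
# Sketch: persisting long-term user preferences as semantic memories.
# The `remember` call shape mirrors the full example later in this article.

def save_preferences(agent, user_id: str, prefs: dict) -> int:
    """Store each preference as a durable semantic fact. Returns count stored."""
    stored = 0
    for key, value in prefs.items():
        agent.remember(
            content=f"User preference: {key} = {value}",
            memory_type="semantic",   # a long-term fact, not a dated event
            session_id=user_id,
            # no ttl_days: preferences persist until explicitly deleted
        )
        stored += 1
    return stored
```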
2. Past issues and resolutions
A support bot that doesn't remember a user's recurring billing issue is actively harmful — it forces the user to re-explain the problem and the previous resolution. Store incident summaries as episodic memories with a long TTL (90–180 days).
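A minimal sketch of recording a resolved issue, again assuming the `remember` signature from the full example later in this article:

```python
# Sketch: summarising a resolved support issue into one episodic memory.

def record_resolution(agent, user_id: str, issue: str, resolution: str) -> str:
    summary = f"Issue: {issue}. Resolution: {resolution}."
    agent.remember(
        content=summary,
        memory_type="episodic",   # a dated event, not a timeless fact
        session_id=user_id,
        ttl_days=120,             # within the 90-180 day range suggested above
    )
    return summary
```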
3. Conversation history (selective)
You do not need to store every message. Store the key decisions, commitments, and facts exchanged. A good rule: store the user's message if it contains a new fact or decision; skip pure acknowledgements ("OK", "thanks").
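The rule above can be sketched as a small filter function. The acknowledgement list and the word-count threshold are illustrative, not canonical:

```python
# Heuristic for selective storage: skip pure acknowledgements and very short
# messages, keep anything likely to carry a new fact or decision.

ACKNOWLEDGEMENTS = {"ok", "okay", "thanks", "thank you", "great", "cool", "got it"}

def should_store(message: str) -> bool:
    normalized = message.strip().lower().rstrip("!.")
    if normalized in ACKNOWLEDGEMENTS:
        return False
    # Very short messages rarely contain new facts or decisions
    if len(normalized.split()) < 3:
        return False
    return True
```

Run this check before writing each user message to memory; it keeps the store lean and keeps retrieval results relevant.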
4. Contextual facts
Company name, job title, project names, stack/tech preferences, team size — facts that make every future interaction more relevant. Extract these from conversation using an LLM and store them as semantic memories.
Implementation approaches compared
Option 1: localStorage
Simple and zero-infrastructure. But it is device-specific, easily cleared, and inaccessible from the server side. Useful only for UI preferences like theme.
Option 2: Relational database
Durable and queryable. But keyword search only: "I prefer formal tone" won't match a query for "communication style". It also requires schema design for every memory type.
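To see the keyword-matching limitation concretely, here is a minimal demonstration using an in-memory SQLite table as a stand-in for a keyword-searchable store:

```python
# A stored preference about tone is invisible to a "communication style"
# query under keyword (LIKE) matching, even though they mean the same thing.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE memories (content TEXT)")
conn.execute("INSERT INTO memories VALUES ('I prefer formal tone')")

rows = conn.execute(
    "SELECT content FROM memories WHERE content LIKE ?",
    ("%communication style%",),
).fetchall()
print(rows)        # []: no keyword overlap, so the memory is never retrieved

hit = conn.execute(
    "SELECT content FROM memories WHERE content LIKE ?",
    ("%formal%",),
).fetchall()
print(len(hit))    # 1: only an exact keyword match succeeds
```

Semantic (embedding-based) retrieval closes exactly this gap: it matches meaning, not strings.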
Option 3: Vector memory layer
Semantic retrieval, persistence, typed memories, recency scoring: the right tool for the job. Kronvex is exactly this, purpose-built for agent memory rather than generic document retrieval.
Full Python chatbot with Kronvex memory
A production-ready chatbot class that remembers context across sessions. Drop this into any FastAPI or Flask backend:
```python
# async: pip install "kronvex[async]"
import json
import os

from fastapi import FastAPI
from openai import OpenAI
from pydantic import BaseModel

from kronvex import Kronvex


class MemoryBot:
    """Chatbot with persistent cross-session memory via Kronvex."""

    def __init__(self, user_id: str):
        self.user_id = user_id
        self.openai = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
        kv = Kronvex(os.environ["KRONVEX_API_KEY"])
        self.agent = kv.agent(os.environ["KRONVEX_AGENT_ID"])

    def chat(self, user_message: str) -> str:
        # 1. Retrieve relevant memories before the LLM call
        ctx = self.agent.inject_context(
            query=user_message,
            top_k=6,
            session_id=self.user_id,
        )

        # 2. Build messages with memory context in the system prompt
        system_prompt = """You are a helpful assistant with memory of past interactions.
Always address the user by name if known.
Do not ask for information already in your context.

Relevant memory about this user:
{context}""".format(context=ctx.context or "No prior context.")
        messages = [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_message},
        ]

        # 3. Call the LLM
        response = self.openai.chat.completions.create(
            model="gpt-4o",
            messages=messages,
        )
        ai_reply = response.choices[0].message.content

        # 4. Store this exchange as episodic memory
        self.agent.remember(
            content=f"User: {user_message}",
            memory_type="episodic",
            session_id=self.user_id,
            ttl_days=60,
        )

        # 5. Extract and store semantic facts if present
        self._extract_facts(user_message)

        return ai_reply

    def _extract_facts(self, message: str):
        """Use an LLM to extract durable facts from the user message."""
        extract_prompt = f"""Extract any durable facts about the user from this message.
Return a JSON object of the form {{"facts": ["..."]}}, or {{"facts": []}} if none.
Only extract facts like name, company, role, preferences, constraints.

Message: "{message}"
Facts:"""
        result = self.openai.chat.completions.create(
            model="gpt-4o-mini",
            messages=[{"role": "user", "content": extract_prompt}],
            response_format={"type": "json_object"},
            max_tokens=200,
        )
        try:
            facts = json.loads(result.choices[0].message.content).get("facts", [])
            for fact in facts:
                self.agent.remember(
                    content=fact,
                    memory_type="semantic",
                    session_id=self.user_id,
                )
        except (json.JSONDecodeError, TypeError, AttributeError):
            pass  # Graceful degradation: a memory failure must not break chat


# FastAPI integration
app = FastAPI()


class ChatRequest(BaseModel):
    user_id: str
    message: str


@app.post("/chat")
def chat(req: ChatRequest):
    bot = MemoryBot(req.user_id)
    reply = bot.chat(req.message)
    return {"reply": reply}
```
React/JS frontend — passing session ID
The frontend needs to pass a stable session_id (your user ID) with every request. Here is a minimal React hook and chat component:
```javascript
import { useState, useCallback } from 'react';

export function useChat(userId) {
  const [messages, setMessages] = useState([]);
  const [loading, setLoading] = useState(false);

  const sendMessage = useCallback(async (text) => {
    setMessages(m => [...m, { role: 'user', content: text }]);
    setLoading(true);
    const res = await fetch('/chat', {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      // Pass userId as session_id: this is the memory key
      body: JSON.stringify({ user_id: userId, message: text }),
    });
    const data = await res.json();
    setMessages(m => [...m, { role: 'assistant', content: data.reply }]);
    setLoading(false);
  }, [userId]);

  return { messages, loading, sendMessage };
}

// Usage: get userId from your auth system
// const { user } = useAuth();   // Supabase, Clerk, Auth0...
// const { messages, sendMessage } = useChat(user.id);
```
The user_id from your auth system is the memory-scope key. The same user on mobile and desktop will share memories because they authenticate with the same ID. Never use a random per-session UUID; always use a stable, authenticated user identifier.
GDPR: consent, right to erasure, data retention
Storing user memories in an EU-hosted vector database triggers GDPR obligations. Here is what you need:
Consent
Memory storage can typically rest on legitimate interest under GDPR (Article 6(1)(f)) if it improves service quality and you disclose it. Add a line to your privacy policy: "We store summaries of your conversations to personalise future interactions. You can request deletion at any time." Explicit opt-in is not required when relying on legitimate interest, but clear disclosure is, and you should document the balancing test.
Right to erasure
Kronvex provides a DELETE /api/v1/agents/{id}/memories endpoint that accepts a session_id filter. Wire this to your account deletion flow:
```python
import httpx


async def delete_user_memories(user_id: str):
    """Called when a user requests account deletion."""
    async with httpx.AsyncClient() as client:
        await client.delete(
            f"https://api.kronvex.io/api/v1/agents/{AGENT_ID}/memories",
            headers={"X-API-Key": KRONVEX_API_KEY},
            params={"session_id": user_id},
        )
```
Data retention
Use TTL on episodic memories: ttl_days=90 for conversation history, ttl_days=365 for issue resolutions, and no TTL for preferences (delete on account closure). Kronvex runs entirely in the EU (Frankfurt) so there is no cross-border transfer to worry about.
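One way to keep these TTLs in a single place is a small policy table. The `remember` signature and the "no TTL means no expiry" convention here follow the earlier example and are assumptions:

```python
# Sketch: applying the retention policy above when storing memories.

RETENTION = {
    "conversation": 90,    # episodic chat history
    "resolution": 365,     # issue resolutions
    "preference": None,    # kept until account closure
}

def store_with_policy(agent, user_id: str, category: str, content: str) -> dict:
    """Store a memory with the TTL mandated by its category."""
    kwargs = {
        "content": content,
        "memory_type": "semantic" if category == "preference" else "episodic",
        "session_id": user_id,
    }
    ttl = RETENTION[category]
    if ttl is not None:
        kwargs["ttl_days"] = ttl   # omit entirely for no-expiry categories
    agent.remember(**kwargs)
    return kwargs
```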
Real-world results: what customers see
Teams that have shipped memory-enabled chatbots using Kronvex report consistent patterns:
- Support deflection up 30–45%: Returning users get answers faster because the bot already knows their setup, plan, and past issues — fewer "let me look that up" responses.
- Conversation length down: Users no longer re-explain context. First-message-to-resolution time drops significantly for repeat queries.
- CSAT up 0.4–0.8 points: Users who feel recognised score their experience higher. "It remembered my name and my project" is a common verbatim.
- Sales conversion up on B2B bots: A sales bot that recalls a prospect's stated budget and timeline from a previous conversation converts 2–3x better than one that starts cold.