Memory Decay & TTL in AI Systems: Why Memories Should Expire
Infinite memory sounds like a feature. In practice, it's a liability. Stale memories inject outdated context, eat into quota, and — in regulated industries — create compliance risks. Here's how to implement graceful memory decay in your AI agents.
The problem with permanent memory
Imagine a customer support agent that learned six months ago that a user "prefers SMS over email." Since then, the user has switched their account to email-only. The agent still injects the old preference, causing confusion and eroding trust.
Or a sales agent that stored "prospect is evaluating Competitor X" from a call nine months ago. That deal closed, the competitor was replaced, and the context is now actively misleading.
Memory quality degrades over time. Facts change. User preferences evolve. Procedural memories become outdated when workflows change. Without a decay mechanism, your agent's context layer fills up with confidently retrieved but factually wrong information.
Three approaches to memory expiry
1. Hard TTL (time-to-live)
The simplest approach: set an expiry timestamp at memory creation time. When the timestamp passes, the memory is excluded from recalls or deleted.
Use hard TTL for:
- Session-scoped memories (e.g., "user is currently browsing the pricing page")
- Time-sensitive facts ("the promotion ends on March 31st")
- Ephemeral state that should never persist beyond a session
```bash
# Store with explicit TTL (seconds)
curl -X POST "https://api.kronvex.io/api/v1/agents/support-agent/remember" \
  -H "X-API-Key: kv-your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "content": "User is currently in the checkout flow for the Pro plan",
    "memory_type": "episodic",
    "ttl": 3600,
    "metadata": {"user_id": "user-123", "session_id": "sess-abc"}
  }'
```
After 3600 seconds (1 hour), this memory will not be returned in any recall — even if semantically relevant. The agent won't mistakenly inject checkout context from a session that ended hours ago.
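Conceptually, the server-side filter behaves like a simple age check at recall time. This is a sketch of the idea only, not Kronvex's actual implementation; the function and field names are illustrative.

```python
from datetime import datetime, timedelta, timezone

def is_expired(created_at: datetime, ttl_seconds, now=None) -> bool:
    """Memories without a TTL never expire; otherwise compare age to TTL."""
    if ttl_seconds is None:
        return False
    now = now or datetime.now(timezone.utc)
    return (now - created_at).total_seconds() > ttl_seconds

# A checkout-flow memory stored two hours ago with a 1-hour TTL
created = datetime.now(timezone.utc) - timedelta(hours=2)
print(is_expired(created, 3600))  # → True: filtered out of recall
```

The key property is that expiry is evaluated at read time, so an expired memory is invisible even if it still exists in storage pending deletion.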
2. Confidence-based soft decay
Kronvex's confidence scoring already implements soft decay via the recency component: a sigmoid function with a 30-day inflection point. A memory from yesterday has near-maximum recency score; one from 90 days ago scores near zero on the recency axis.
This means old memories don't disappear — they gradually lose weight in the composite confidence score. A 6-month-old memory about "user prefers formal tone" might still get recalled, but only if it's highly semantically relevant (similarity score compensating for low recency). If there's a newer, more relevant memory, the newer one wins.
This is appropriate for semantic and procedural memories that might still be valid but should yield to fresher information.
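The recency component described above can be illustrated with a small sketch. The 30-day inflection point comes from the text; the steepness constant is an assumed tuning value, not Kronvex's published parameter.

```python
import math

def recency_score(age_days: float, inflection: float = 30.0,
                  steepness: float = 0.1) -> float:
    """Sigmoid decay: near 1 for fresh memories, exactly 0.5 at the
    inflection point, approaching 0 well past it."""
    return 1.0 / (1.0 + math.exp(steepness * (age_days - inflection)))

for age in (1, 30, 90):
    print(f"{age:>3} days -> recency {recency_score(age):.3f}")
```

With these assumed constants, a 1-day-old memory scores about 0.95 on the recency axis while a 90-day-old one scores under 0.01, matching the behavior the scoring description implies.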
3. Explicit invalidation
For memories you know have become stale — because you received a signal indicating a change — explicit deletion is the right approach:
```python
import kronvex

client = kronvex.Kronvex("kv-your-api-key")
agent = client.agent("sales-agent")

# User updated their communication preference in your CRM:
# retrieve and delete the old memory
old_memories = agent.recall(
    query="communication preference",
    metadata_filter={"user_id": "prospect-abc"},
    top_k=3
)

for memory in old_memories.results:
    # Case-insensitive match so "SMS" and "sms" both count
    if "email" in memory.content.lower() or "sms" in memory.content.lower():
        agent.delete_memory(memory.id)

# Store the new preference
agent.remember(
    "User prefers async Slack messages over email",
    memory_type="semantic",
    metadata={"user_id": "prospect-abc", "category": "communication"}
)
```
TTL strategy by memory type
Not all memories decay at the same rate. Here's a practical framework:
- Episodic (specific events) — Short TTL: 24h to 7 days. "User completed onboarding", "last support ticket was about billing" — useful briefly, stale quickly.
- Semantic (facts about the world/user) — Medium TTL: 30 to 90 days. "User works in fintech", "prospect's team is 12 people" — stable but should refresh periodically.
- Procedural (how to behave) — Long TTL or no TTL. "Always use bullet points with this user", "escalate immediately if they mention churn" — these are behavioral guidelines that rarely go stale.
Whichever tier a memory falls into, the DELETE /api/v1/agents/{id}/memories/{memory_id} endpoint handles explicit removal once it outlives its usefulness.
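The framework above can be encoded as a default-TTL lookup applied at write time. The helper itself and the exact durations are illustrative choices within the stated ranges, not part of the Kronvex SDK.

```python
# Default TTLs (seconds) per memory type; durations are illustrative
# picks within the ranges given in the framework above.
DEFAULT_TTLS = {
    "episodic": 7 * 24 * 3600,    # short-lived: up to 7 days
    "semantic": 90 * 24 * 3600,   # medium-lived: up to 90 days
    "procedural": None,           # behavioral guidelines: no expiry
}

def ttl_for(memory_type: str):
    """Return the default TTL for a memory type, or None for no expiry."""
    if memory_type not in DEFAULT_TTLS:
        raise ValueError(f"unknown memory type: {memory_type}")
    return DEFAULT_TTLS[memory_type]

print(ttl_for("episodic"))  # → 604800
```

In practice you would pass the result as the `ttl` field when storing a memory, omitting the field when the helper returns None.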
Implementing a background pruning job
For memories without explicit TTL, you may want a periodic quality sweep. This Python snippet queries memories older than a threshold with low access_count and removes them:
```python
import kronvex
from datetime import datetime, timedelta, timezone

async def prune_stale_memories(agent_id: str, days_old: int = 90, max_access: int = 1):
    """Remove memories older than days_old that were recalled at most
    max_access times."""
    client = kronvex.AsyncKronvex("kv-your-api-key")
    async with client:
        agent = client.agent(agent_id)

        # List the agent's memories older than the cutoff with low access counts
        cutoff = datetime.now(timezone.utc) - timedelta(days=days_old)
        memories = await agent.list_memories(
            created_before=cutoff.isoformat(),
            max_access_count=max_access
        )

        pruned = 0
        for memory in memories.results:
            # Skip procedural memories: they're long-lived by design
            if memory.memory_type == "procedural":
                continue
            await agent.delete_memory(memory.id)
            pruned += 1

        print(f"Pruned {pruned} stale memories from agent {agent_id}")
        return pruned
```
The memory quota argument
Beyond data quality, decay and pruning serve a practical purpose: quota management. Kronvex plans have memory limits (10k on Starter, 25k on Pro). Running without any expiry policy means you'll hit quota faster and fill it with increasingly low-quality memories.
A disciplined decay strategy lets you operate well within quota while maintaining high memory quality. The goal isn't to store everything — it's to store what's still useful.
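The quota effect of a TTL policy is easy to estimate: with a hard TTL, the resident memory count plateaus at write rate times lifetime, whereas without expiry it grows linearly until the quota is exhausted. The write rate below is an illustrative number; the 10k quota is the Starter limit mentioned above.

```python
def steady_state(writes_per_day: float, ttl_days: float) -> float:
    """With a hard TTL, resident memories plateau at write rate x lifetime."""
    return writes_per_day * ttl_days

def days_to_quota(writes_per_day: float, quota: int) -> float:
    """Without any expiry, growth is linear until the quota is exhausted."""
    return quota / writes_per_day

# 500 episodic memories/day with a 7-day TTL settles around 3,500 resident
print(steady_state(500, 7))        # → 3500
# The same write rate with no expiry burns a 10k Starter quota in 20 days
print(days_to_quota(500, 10_000))  # → 20.0
```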