Deep Dive · Architecture · pgvector · March 22, 2026 · 12 min read

Vector Databases for AI Agents:
pgvector vs Pinecone vs Weaviate

Not all vector databases are equal when the use case is agent memory rather than RAG retrieval. The requirements are different — you need structured metadata, recency weighting, access frequency tracking, and SQL joins. Here's a detailed comparison of the main options.

In this article
  1. Why AI agents need vector search
  2. Comparison: pgvector / Pinecone / Weaviate / Qdrant
  3. Why pgvector wins for agent memory
  4. How Kronvex uses pgvector + confidence scoring
  5. Code example: raw pgvector vs Kronvex API

Why AI agents need vector search

A traditional database answers the question: "Does this record exist?" A vector database answers a different question: "What is most similar to this query?" For agent memory, that distinction is fundamental.

Consider a support agent that has stored thousands of past interactions. When a user asks "I'm having trouble with my invoice", an exact-match search finds nothing useful. But a semantic vector search finds: "User couldn't download their receipt", "billing portal access issue", "payment method update failed" — all relevant, none containing the word "invoice".

This is the core problem that vector search solves for agents: intent matching at recall time, not keyword matching. The user's current query is embedded into the same vector space as stored memories, and cosine similarity finds the nearest neighbors regardless of exact wording.
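The mechanics are easy to sketch. Here is cosine similarity over toy 3-dimensional vectors standing in for real 1536-dimensional embeddings (the vectors and labels are illustrative, not real model output):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity: dot(a, b) / (|a| * |b|), in [-1, 1]."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy embeddings: in reality these come from an embedding model.
invoice_query  = [0.9, 0.1, 0.2]   # "trouble with my invoice"
receipt_memory = [0.8, 0.2, 0.3]   # "couldn't download their receipt"
weather_memory = [0.1, 0.9, 0.1]   # "asked about the weather"

# The receipt memory scores high despite sharing no words with the query;
# the weather memory scores low despite being a valid sentence.
print(cosine_similarity(invoice_query, receipt_memory))
print(cosine_similarity(invoice_query, weather_memory))
```

Note that pgvector's `<=>` operator returns cosine *distance*, so similarity is `1 - distance`.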

RAG vs agent memory: RAG (Retrieval-Augmented Generation) retrieves chunks of documents to answer factual questions. Agent memory retrieves past experiences, preferences, and learned facts about a specific entity (user, customer, project). The data model is fundamentally different — agent memory needs typed records, session scoping, and temporal weighting.

Comparison table

Evaluated specifically for the agent memory use case — not generic document retrieval:

| Feature | pgvector | Pinecone | Weaviate | Qdrant |
|---|---|---|---|---|
| Fully managed | Via Supabase/RDS | Yes | Cloud + self-host | Cloud + self-host |
| Self-hostable | Yes | No | Yes | Yes |
| EU region available | Yes (Frankfurt) | EU-West only | Yes | Yes |
| SQL joins on metadata | Native SQL | No | GraphQL only | Payload filters |
| ACID transactions | Yes | No | No | No |
| Hybrid search (vector + keyword) | With tsvector | Yes | Yes (BM25) | Yes (sparse) |
| p99 latency (10k vectors) | <5 ms local | ~20 ms managed | ~15 ms managed | ~10 ms managed |
| Estimated cost (1M vectors) | ~$25/mo (Supabase) | ~$70/mo | ~$45/mo | ~$35/mo |
| Native recency/frequency weighting | SQL computed columns | No | No | No |
| Co-located with structured data | Same DB | Separate service | Separate service | Separate service |
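The "with tsvector" entry for pgvector deserves a sketch: hybrid search is just one SQL expression blending keyword rank with vector similarity. The weights and column names here are illustrative, not a tuned configuration:

```sql
-- Hybrid score: 70% vector similarity, 30% keyword rank.
-- $1 = query embedding, $2 = query text.
SELECT
    content,
    0.7 * (1 - (embedding <=> $1))
    + 0.3 * ts_rank(to_tsvector('english', content),
                    plainto_tsquery('english', $2))
        AS hybrid_score
FROM memories
ORDER BY hybrid_score DESC
LIMIT 20;
```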

Why pgvector wins for agent memory

The comparison above shows that Pinecone, Weaviate, and Qdrant are excellent for their primary use case — pure vector retrieval at scale. But agent memory is not pure vector retrieval.

1. Structured data is already in PostgreSQL

Your api_keys table, your agents table, your usage quota tracking — all of this already lives in PostgreSQL. When the memory store is the same database, a single query can join vectors with structured metadata:

SQL — join vector similarity with structured filters
SELECT
    m.content,
    m.memory_type,
    m.access_count,
    m.created_at,
    1 - (m.embedding <=> $1) AS similarity
FROM memories m
JOIN agents a ON m.agent_id = a.id
WHERE
    a.api_key_id = $2
    AND m.session_id = $3
    AND (m.expires_at IS NULL OR m.expires_at > now())
ORDER BY
    m.embedding <=> $1  -- ascending distance = descending similarity; lets the ANN index serve the query
LIMIT 20;

Doing this with Pinecone requires two round trips: first query the vector index, then fetch structured data from a separate datastore. With pgvector, it is one query, one network hop, one transaction.

2. Recency and frequency are SQL expressions

The Kronvex confidence score formula is:

Confidence scoring formula
-- confidence = similarity × 0.6 + recency × 0.2 + frequency × 0.2
-- recency: sigmoid decay with a 30-day inflection (fresh ≈ 1, 30 days = 0.5)
-- frequency: log-scaled access count, capped at 1

SELECT
    content,
    (similarity * 0.6)
    + (1 / (1 + EXP((EXTRACT(EPOCH FROM (now() - created_at)) / 86400 - 30) / 10))) * 0.2
    + LEAST(1.0, LN(1 + access_count) / LN(100)) * 0.2
    AS confidence
FROM candidates
ORDER BY confidence DESC
LIMIT 6;

This kind of composite scoring — blending vector similarity with temporal decay and usage frequency — cannot be expressed in Pinecone's metadata filter system or Weaviate's GraphQL. It requires a real expression language. SQL provides that.
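The same composite score as a plain Python function, a sketch with the sigmoid written so that recency decays with age (0.5 at the 30-day inflection) and the frequency term capped at 1:

```python
import math

def confidence(similarity: float, age_days: float, access_count: int) -> float:
    """confidence = similarity * 0.6 + recency * 0.2 + frequency * 0.2"""
    # Sigmoid decay: ~0.95 when fresh, 0.5 at 30 days, near 0 by 90 days.
    recency = 1.0 / (1.0 + math.exp((age_days - 30.0) / 10.0))
    # Log-scaled access count, reaching the cap of 1.0 at 99 accesses.
    frequency = min(1.0, math.log(1 + access_count) / math.log(100))
    return similarity * 0.6 + recency * 0.2 + frequency * 0.2

# At equal similarity, a fresh memory outranks a stale one:
print(confidence(0.85, age_days=2, access_count=10))
print(confidence(0.85, age_days=60, access_count=10))
```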

3. ACID guarantees matter for agent state

When a user upgrades their plan and you update both the api_keys table and delete old memories (to free up quota), that operation needs to be atomic. If it partially succeeds, your agent's state is inconsistent. PostgreSQL gives you full ACID transactions. Purpose-built vector databases generally do not.
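In SQL that atomicity is a single transaction; the table names follow the article's examples, but the plan and quota columns are illustrative:

```sql
BEGIN;

UPDATE api_keys
SET plan = 'pro', memory_quota = 100000
WHERE id = $1;

DELETE FROM memories
WHERE agent_id IN (SELECT id FROM agents WHERE api_key_id = $1)
  AND expires_at < now();

COMMIT;
-- Either both statements take effect or neither does.
```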

How Kronvex uses pgvector + confidence scoring

Kronvex is built on top of Supabase PostgreSQL with the pgvector extension. Each memory is stored as a 1536-dimensional vector (OpenAI text-embedding-3-small) alongside its structured metadata.

When recall() is called, the query text is embedded in real time, and the database performs an approximate nearest-neighbor (ANN) search using the IVFFlat index. The top-20 candidates are then re-ranked using the confidence score formula above, and the top-k results are returned.

Index choice: Kronvex uses IVFFlat rather than HNSW for the primary index because IVFFlat has lower memory overhead at scale and better write throughput (important for agents that remember frequently). For use cases with <100k vectors per agent where query speed dominates, HNSW would be preferable. See our HNSW deep-dive for the full analysis.
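Creating such an index in pgvector looks like the following; the index name is illustrative, and `lists` is a tuning knob often set near the square root of the row count:

```sql
-- IVFFlat over cosine distance. Build after the table has data,
-- since IVFFlat picks its cluster centroids from existing rows.
CREATE INDEX idx_memories_embedding
    ON memories
    USING ivfflat (embedding vector_cosine_ops)
    WITH (lists = 100);

-- At query time, probe more lists for better recall (default is 1):
SET ivfflat.probes = 10;
```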

Code example: raw pgvector vs Kronvex API

To illustrate what Kronvex abstracts away, here is the same "store and recall a memory" operation implemented directly against pgvector, then with the Kronvex SDK:

Raw pgvector approach (~80 lines, no scoring)
import asyncpg, openai, json
from datetime import datetime, timezone

openai_client = openai.AsyncOpenAI()

async def embed(text: str) -> list[float]:
    r = await openai_client.embeddings.create(
        model="text-embedding-3-small", input=text
    )
    return r.data[0].embedding

async def remember_raw(pool, agent_id: str, content: str):
    vec = await embed(content)
    await pool.execute("""
        INSERT INTO memories (agent_id, content, embedding, created_at)
        VALUES ($1, $2, $3::vector, $4)
    """, agent_id, content, json.dumps(vec), datetime.now(timezone.utc))

async def recall_raw(pool, agent_id: str, query: str, top_k: int = 5):
    # No confidence scoring — pure cosine similarity only
    vec = await embed(query)
    rows = await pool.fetch("""
        SELECT content, 1 - (embedding <=> $1::vector) AS sim
        FROM memories
        WHERE agent_id = $2
        ORDER BY embedding <=> $1::vector  -- ascending distance, so the ANN index is used
        LIMIT $3
    """, json.dumps(vec), agent_id, top_k)
    return [r["content"] for r in rows]

# Missing: session scoping, TTL, access_count updates,
# confidence scoring, memory types, quota enforcement...

Kronvex SDK approach (~10 lines, full confidence scoring)
from kronvex import Kronvex

kv    = Kronvex("kv-your-key")
agent = kv.agent("your-agent-id")

# Store — embedding + metadata handled automatically
agent.remember(
    "User prefers formal tone, no bullet points",
    memory_type="semantic",
    session_id="user-42",
)

# Recall — cosine sim + recency + frequency combined
result = agent.recall(
    query="how should I format my response?",
    top_k=5,
    session_id="user-42",
)

for mem in result.memories:
    print(f"{mem.confidence:.2f} — {mem.content}")
# 0.87 — User prefers formal tone, no bullet points

When to use a dedicated vector DB: If your primary use case is large-scale document retrieval (millions of chunks, multi-tenant RAG pipelines), Pinecone or Qdrant may be the right choice. pgvector's ANN performance degrades above ~5M vectors per table without careful index tuning. For agent memory (typically thousands to low millions of memories per agent), pgvector is the better fit.