Vector Database
A database optimized for storing, indexing, and querying high-dimensional vectors, commonly used to power semantic search and RAG in AI applications.
What is a vector database?
A vector database is a specialized database designed to store and search high-dimensional vectors efficiently. While traditional databases excel at exact-match queries ("find all users where email = X"), vector databases excel at similarity queries ("find the 10 vectors most similar to this one").
When you embed text into a vector — a list of hundreds or thousands of numbers — you need somewhere to store it and a fast way to search through millions of stored vectors. That's what vector databases do. They use specialized indexing algorithms (like HNSW, IVF, or product quantization) to make nearest-neighbor search fast, even across millions or billions of vectors.
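To make "similarity query" concrete, here is a minimal sketch of exact (brute-force) nearest-neighbor search using cosine similarity. The three-dimensional vectors and their IDs are invented for illustration, and the function names are hypothetical. A real vector database replaces this linear scan with an index such as HNSW or IVF so that it does not have to score every stored vector.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query, vectors, k=2):
    """Exact search: score every stored vector, return the k best (id, score) pairs."""
    scored = [(vec_id, cosine_similarity(query, vec)) for vec_id, vec in vectors.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# Toy 3-dimensional "embeddings"; real ones have hundreds or thousands of dimensions.
stored = {
    "refund-request": [0.9, 0.1, 0.0],
    "shipping-delay": [0.1, 0.9, 0.1],
    "billing-error": [0.8, 0.2, 0.1],
}

results = top_k([1.0, 0.0, 0.0], stored, k=2)
print([vec_id for vec_id, _ in results])  # ['refund-request', 'billing-error']
```

The scan costs one similarity computation per stored vector, which is why it stops being practical at millions of vectors.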
Popular vector databases include:
- Pinecone — fully managed, serverless option
- Weaviate — open-source with hybrid search
- Qdrant — open-source, Rust-based, high performance
- Chroma — lightweight, popular for prototyping
- pgvector — PostgreSQL extension for teams already using Postgres
You can also use vector search features built into existing databases like Supabase (pgvector), MongoDB Atlas Vector Search, or Elasticsearch. These are practical when you don't want to manage a separate database just for vectors.
Why it matters for AI agents
Vector databases are the backbone of most retrieval-augmented generation (RAG) systems, a common way for AI agents to access knowledge beyond their training data. Without a retrieval layer such as a vector database, an agent's knowledge is limited to what fits in its context window and what it learned during training.
For email agents, vector databases enable several key capabilities. An agent can store embeddings of every email it has processed, creating a searchable semantic index of all past communications. When a new email arrives asking about a topic discussed three months ago, the agent searches its vector database, retrieves the relevant thread, and responds with full context.
The workflow is straightforward: embed incoming emails and store the vectors. When context is needed, embed the query, perform a similarity search, retrieve the top matches, and inject them into the agent's prompt. This pattern lets agents maintain institutional knowledge across thousands of conversations without exceeding context window limits.
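The embed, store, search, and inject steps can be sketched end to end in a few lines. Everything here is an illustrative assumption: `toy_embed` is a word-count stand-in for a real embedding model, `InMemoryIndex` is a hypothetical stand-in for a vector database client, and the emails are made up.

```python
import math
import re

# Tiny fixed vocabulary so the toy embedding is deterministic.
VOCAB = ["invoice", "refund", "shipping", "delay", "password", "reset"]

def toy_embed(text):
    """Stand-in for a real embedding model: counts of a few known words."""
    words = re.findall(r"[a-z]+", text.lower())
    return [float(words.count(term)) for term in VOCAB]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

class InMemoryIndex:
    """Hypothetical stand-in for a vector database client."""

    def __init__(self):
        self.rows = []  # list of (vector, original text)

    def add(self, text):
        self.rows.append((toy_embed(text), text))

    def search(self, query, k=1):
        q = toy_embed(query)
        ranked = sorted(self.rows, key=lambda row: cosine(q, row[0]), reverse=True)
        return [text for _, text in ranked[:k]]

# 1. Embed incoming emails and store the vectors.
index = InMemoryIndex()
index.add("Your refund for the duplicate invoice was issued on Monday.")
index.add("Shipping delay: the parcel is held at customs.")
index.add("Use this link to reset your password.")

# 2. Embed the query, search, and inject the top match into the prompt.
question = "Has my refund been processed?"
context = index.search(question, k=1)
prompt = "Relevant past emails:\n" + "\n".join(context) + "\n\nNew email:\n" + question
print(prompt)
```

Only the retrieved emails enter the prompt, so the prompt stays small no matter how many messages the index holds.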
Vector databases also support metadata filtering, which is critical for email agents. You can restrict similarity searches by sender, date range, thread ID, or label, narrowing the candidate set before or during the vector comparison. This means an agent can search for "messages similar to this complaint, from this specific customer, in the last 30 days" in a single query.
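A rough sketch of the pre-filtering variant: apply the metadata constraints first, then rank only the surviving rows by similarity. The records, field names, and the `filtered_search` helper are hypothetical. Real vector databases expose filtering through their own query syntax and may apply the filter during index traversal instead.

```python
import math
from datetime import date, timedelta

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

# Made-up email records: metadata plus a toy 2-dimensional embedding.
emails = [
    {"id": "m1", "sender": "alice@example.com", "sent": date(2024, 5, 20), "vector": [0.9, 0.1]},
    {"id": "m2", "sender": "alice@example.com", "sent": date(2024, 1, 5), "vector": [0.95, 0.05]},
    {"id": "m3", "sender": "bob@example.com", "sent": date(2024, 5, 25), "vector": [1.0, 0.0]},
    {"id": "m4", "sender": "alice@example.com", "sent": date(2024, 5, 28), "vector": [0.1, 0.9]},
]

def filtered_search(query, rows, sender, since, k=5):
    """Pre-filter: apply the metadata constraints, then rank only the survivors."""
    candidates = [r for r in rows if r["sender"] == sender and r["sent"] >= since]
    candidates.sort(key=lambda r: cosine(query, r["vector"]), reverse=True)
    return [r["id"] for r in candidates[:k]]

today = date(2024, 6, 1)  # fixed date so the example is reproducible
hits = filtered_search([1.0, 0.0], emails, "alice@example.com", today - timedelta(days=30))
print(hits)  # ['m1', 'm4']
```

Note that `m2` and `m3` are both closer to the query than `m4`, but neither passes the filter (one is too old, the other is from a different sender), so neither is scored.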
Frequently asked questions
Do I need a dedicated vector database or can I use pgvector?
For most agent applications, pgvector or a similar extension in your existing database is sufficient and simpler to manage. Dedicated vector databases become worthwhile when you have very large collections (many millions of vectors), need consistently low search latency under heavy query load, or require advanced features like automatic reindexing. Start simple and migrate if you hit limits.
How much storage do vector embeddings require?
A single 1,536-dimensional embedding (common for OpenAI models) stored as 32-bit floats takes about 6 KB (1,536 × 4 bytes). One million embeddings would require roughly 6 GB of vector storage, plus metadata and index overhead. For email agents processing thousands of messages per day, storage is rarely the bottleneck; search speed and index management matter more.
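The arithmetic behind those figures, assuming each dimension is stored as a 4-byte (32-bit) float, which is a common default but not the only option:

```python
DIMENSIONS = 1536
BYTES_PER_FLOAT32 = 4  # assumption: uncompressed 32-bit floats

bytes_per_vector = DIMENSIONS * BYTES_PER_FLOAT32  # 6144 bytes
bytes_per_million = bytes_per_vector * 1_000_000   # raw vectors only, no index or metadata

print(bytes_per_vector / 1024)   # 6.0 (KiB per vector)
print(bytes_per_million / 1e9)   # 6.144 (GB per million vectors)
```

Storing vectors at lower precision or in compressed form would shrink these numbers, while index structures and metadata add to them.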
Can vector databases replace traditional search?
Not entirely. Vector search excels at finding semantically similar content but can miss exact matches. Traditional keyword search is better for finding specific terms, IDs, or exact phrases. Most production systems use hybrid search — combining vector similarity with keyword matching — to get the best of both approaches.
What is the difference between a vector database and an embedding?
An embedding is a numerical representation (vector) of a piece of text, generated by an AI model. A vector database is the storage and search system that holds millions of these embeddings and lets you efficiently find the most similar ones to a query. The database indexes and queries embeddings; it does not create them.
How do email agents use vector databases?
Email agents store embeddings of processed emails in a vector database, creating a searchable semantic index of all past communications. When a new email arrives about a topic discussed months ago, the agent embeds the query, searches for similar past emails, and retrieves relevant context to inform its response.
What is hybrid search in vector databases?
Hybrid search combines vector similarity search with traditional keyword search in a single query. This catches both semantically similar results (different words, same meaning) and exact keyword matches. Most production RAG systems use hybrid search to improve retrieval accuracy over either method alone.
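One common way to merge the two result lists is reciprocal rank fusion (RRF), sketched below. This is an illustrative choice, not necessarily what any particular database does internally. The message IDs are invented, and `k=60` is a conventional smoothing constant used here as an assumption.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge ranked ID lists: each list contributes 1 / (k + rank) per document."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest combined score first.
    return sorted(scores, key=scores.get, reverse=True)

vector_hits = ["m7", "m2", "m9"]   # ranked by embedding similarity
keyword_hits = ["m2", "m4", "m7"]  # ranked by keyword match

fused = reciprocal_rank_fusion([vector_hits, keyword_hits])
print(fused)  # ['m2', 'm7', 'm4', 'm9']
```

Documents that appear in both lists (`m2`, `m7`) rise to the top, because RRF uses only rank positions and never has to compare raw vector scores with raw keyword scores.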
How fast is vector database search?
Many vector databases return results in single-digit milliseconds for collections of millions of vectors by using approximate nearest neighbor (ANN) algorithms like HNSW. Search speed depends on the index type, vector dimensionality, collection size, and hardware. For email agents, latency is rarely an issue, since email processing does not require sub-millisecond responses.
What is metadata filtering in vector databases?
Metadata filtering lets you narrow vector search results by structured attributes before or during the similarity comparison. For email agents, this means searching for "emails similar to this complaint, from this specific sender, in the last 30 days" by combining vector similarity with metadata constraints in a single query.
How often should email agents update their vector database?
Agents should embed and store new emails as they are processed, keeping the vector database current. Batch updates work for non-time-sensitive applications, but real-time indexing is better for agents that need immediate access to recent context. Stale indexes mean the agent cannot reference recent conversations.
What are the main vector database indexing algorithms?
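The most common is HNSW (Hierarchical Navigable Small World), a graph-based index that offers fast search with high recall at the cost of higher memory usage. IVF (Inverted File Index) partitions vectors into clusters and searches only the clusters nearest the query, trading some accuracy for lower memory usage and less work per query. Product quantization compresses vectors to reduce storage, usually at some cost in accuracy, and is often combined with IVF. The choice depends on your balance of speed, accuracy, and memory constraints.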
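To show where the IVF accuracy trade-off comes from, here is a toy sketch of the idea. The class name `ToyIVF`, the hand-picked centroids, and the parameter name `nprobe` are assumptions for illustration. Real implementations learn the centroids from the data (typically with k-means) and are far more optimized.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

class ToyIVF:
    """Toy inverted-file index: one bucket of vectors per centroid."""

    def __init__(self, centroids):
        self.centroids = centroids
        self.buckets = [[] for _ in centroids]

    def _nearest_centroids(self, vec, n):
        order = sorted(range(len(self.centroids)),
                       key=lambda i: sq_dist(vec, self.centroids[i]))
        return order[:n]

    def add(self, vec_id, vec):
        # Each vector is stored in the bucket of its closest centroid.
        bucket = self._nearest_centroids(vec, 1)[0]
        self.buckets[bucket].append((vec_id, vec))

    def search(self, query, k=1, nprobe=1):
        # Scan only the nprobe buckets whose centroids are closest to the query.
        candidates = []
        for bucket in self._nearest_centroids(query, nprobe):
            candidates.extend(self.buckets[bucket])
        candidates.sort(key=lambda item: sq_dist(query, item[1]))
        return [vec_id for vec_id, _ in candidates[:k]]

index = ToyIVF(centroids=[[0.0, 0.0], [10.0, 10.0]])
index.add("a", [1.0, 1.0])
index.add("b", [4.9, 4.9])
index.add("c", [7.0, 7.0])
index.add("d", [9.0, 9.0])

query = [5.2, 5.2]
print(index.search(query, k=1, nprobe=1))  # ['c']  (misses the true nearest neighbor)
print(index.search(query, k=1, nprobe=2))  # ['b']  (found once both buckets are scanned)
```

With `nprobe=1` the search skips the bucket that holds `b`, the true nearest neighbor, because the query sits slightly closer to the other centroid. Scanning more buckets recovers accuracy at the cost of more distance computations.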