Vector Database 101: Foundation of All Modern AI Applications
Five years ago, vector databases were a technology known mostly to academic researchers. Now, in 2026, almost every AI startup uses a vector DB as core infrastructure. Pinecone's valuation rose to billions. Weaviate, Qdrant, and Milvus all raised large funding rounds.
But why? What's special about a vector DB compared to PostgreSQL or MongoDB? When do you need one, and when can you use a regular database?
This article covers vector databases from basic concepts to choosing the right tool for various use cases.
What Are Vectors and Embeddings
Before discussing vector DBs, you need to understand vectors themselves.
In an ML context, a vector is an array of numbers representing some piece of data. For example, the sentence "I love coding" gets encoded into a 1536-dimensional array: [0.012, -0.045, 0.778, ..., 0.234].
These numbers aren't random. They're generated by an embedding model trained to learn semantic representations. Key concept: sentences with similar meanings have vectors that sit close together in that 1536-dimensional "vector space".
Example:
- "I love coding" → vector A
- "I enjoy programming" → vector B (close to A)
- "It's raining heavily today" → vector C (far from A and B)
The distance between vectors A and B is small (similar meaning); the distance from either of them to C is large (different meaning).
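To make "close" and "far" concrete, here is a minimal sketch using cosine similarity, one common way to compare embeddings. The 4-dimensional vectors are made-up stand-ins for illustration, not the output of any real embedding model.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical toy embeddings (values invented for this example).
vec_a = [0.9, 0.1, 0.0, 0.2]   # "I love coding"
vec_b = [0.8, 0.2, 0.1, 0.3]   # "I enjoy programming"
vec_c = [0.0, 0.1, 0.9, 0.1]   # "It's raining heavily today"

sim_ab = cosine_similarity(vec_a, vec_b)   # high: similar meaning
sim_ac = cosine_similarity(vec_a, vec_c)   # low: different meaning
print(f"A vs B: {sim_ab:.3f}, A vs C: {sim_ac:.3f}")
```

A similarity near 1 corresponds to a small distance, and a similarity near 0 to a large one.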
Popular embedding models: OpenAI text-embedding-3-small (1536d) and text-embedding-3-large (3072d), Cohere Embed v3 (1024d), or open source models like BGE-large (1024d).
Why Regular Databases Aren't Optimal for Vectors
Suppose you have 1 million documents, each with a 1536-dimensional embedding vector. A user submits a query, and you want to find the 10 most similar documents.
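Naive Approach (PostgreSQL with the pgvector type, no index):
-- <=> is pgvector's cosine distance operator (<-> would be Euclidean/L2 distance)
SELECT id, content,
       embedding <=> '[0.1, 0.2, ...]' AS distance
FROM documents
ORDER BY distance ASC
LIMIT 10;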
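Without a vector index, the database has to compare the query vector with all 1 million vectors in the table, computing the distance 1 million times. For 1536d vectors, one dot product costs 1536 multiplications and 1536 additions. Total: roughly 3 billion operations per query.
Result: a query that can take on the order of 30 seconds, depending on hardware and on whether the table fits in memory. That is not acceptable for a realtime application.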
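The arithmetic behind the 3 billion figure can be checked in a few lines. This sketch also estimates how much raw data a full scan has to read, assuming the vectors are stored as 4-byte float32 values (an assumption, since the article does not state the storage type).

```python
NUM_VECTORS = 1_000_000
DIMENSIONS = 1536
BYTES_PER_FLOAT32 = 4   # assumed storage type

ops_per_comparison = DIMENSIONS * 2              # one multiply + one add per dimension
total_ops = NUM_VECTORS * ops_per_comparison     # work for one brute-force query
scan_bytes = NUM_VECTORS * DIMENSIONS * BYTES_PER_FLOAT32

print(f"{total_ops:,} operations per query")
print(f"{scan_bytes / 1e9:.2f} GB of vector data scanned")
```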
Vector DB Approach:
Vector DBs use indexing algorithms that trade a small amount of accuracy for speed. Approximate Nearest Neighbor (ANN) algorithms such as HNSW (Hierarchical Navigable Small World) or IVF (Inverted File Index) can return top-k results in milliseconds, typically with 95-99% recall, meaning they find 95-99% of the results an exact search would return.
The same query over 1 million vectors: 5-50 milliseconds. That is fast enough to serve realtime production traffic.
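The "95-99%" figure is usually measured as recall@k: the share of the exact top-k results that the approximate index also returned. A minimal sketch, with made-up document ids:

```python
def recall_at_k(approx_ids, exact_ids):
    """Fraction of the exact top-k neighbors that the approximate search also found."""
    exact = set(exact_ids)
    if not exact:
        return 1.0
    return len(exact & set(approx_ids)) / len(exact)

# Hypothetical ids: the ANN index missed one true neighbor (99) and returned 12 instead.
exact_top10 = [3, 17, 42, 58, 61, 77, 80, 91, 94, 99]
approx_top10 = [3, 17, 42, 58, 61, 77, 80, 91, 94, 12]

recall = recall_at_k(approx_top10, exact_top10)
print(f"recall@10 = {recall:.0%}")
```

Averaging this value over a set of test queries gives the recall number to compare index settings against.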
Index Algorithms: HNSW vs IVF vs Flat
HNSW (Hierarchical Navigable Small World)
Currently the most popular choice. It builds a multi-layer graph in which nodes with similar vectors are connected. A search starts from the sparse top layer, greedily navigates toward the query, and then drops down to denser layers to refine the result.
- Pros: very fast queries, good recall
- Cons: high memory usage (graph stored in RAM), slow build time
- Suitable for: low-latency apps, datasets that fit in RAM
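The "navigate toward the query" step can be sketched on a single graph layer. This is a deliberately simplified illustration, not the full HNSW algorithm: it has no layer hierarchy, no candidate list, and it builds the graph by brute force. The helper names are hypothetical.

```python
import math

def build_knn_graph(points, m):
    """Link every point to its m nearest neighbors (brute force; fine for a toy)."""
    graph = {}
    for i, p in enumerate(points):
        others = sorted((j for j in range(len(points)) if j != i),
                        key=lambda j: math.dist(p, points[j]))
        graph[i] = others[:m]
    return graph

def greedy_search(points, graph, query, entry):
    """Hop to whichever neighbor is closer to the query until no neighbor improves."""
    current = entry
    current_dist = math.dist(points[current], query)
    while True:
        best, best_dist = current, current_dist
        for neighbor in graph[current]:
            d = math.dist(points[neighbor], query)
            if d < best_dist:
                best, best_dist = neighbor, d
        if best == current:          # local minimum: nothing closer is reachable
            return current
        current, current_dist = best, best_dist

# Toy data: 20 points on a line, each linked to its 2 nearest neighbors.
points = [(float(i), 0.0) for i in range(20)]
graph = build_knn_graph(points, m=2)
found = greedy_search(points, graph, query=(13.2, 0.0), entry=0)
print(found)
```

The search only computes distances for the nodes it visits and their neighbors, never for the whole dataset. In real HNSW, the upper layers hold fewer nodes with longer links, so the first hops cover a lot of ground quickly.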
IVF (Inverted File Index)
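Clusters the vectors into K groups (using k-means). At query time, it finds the closest cluster centroids first, then searches exhaustively within only those clusters. Probing more clusters raises recall at the cost of speed.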
- Pros: lower memory, fast indexing
- Cons: tuning K clusters can be tricky, accuracy depends on data distribution
- Suitable for: large datasets that don't fit in RAM
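A minimal sketch of the IVF idea follows. To keep it short, the centroids are hard-coded rather than learned with k-means, and the name nprobe (how many clusters to scan) is borrowed from common usage rather than from this article.

```python
import math

def nearest_centroids(vec, centroids, n):
    """Indices of the n centroids closest to vec."""
    order = sorted(range(len(centroids)), key=lambda c: math.dist(vec, centroids[c]))
    return order[:n]

def build_ivf(vectors, centroids):
    """Inverted lists: centroid index -> ids of the vectors assigned to it."""
    lists = {c: [] for c in range(len(centroids))}
    for vid, vec in enumerate(vectors):
        lists[nearest_centroids(vec, centroids, 1)[0]].append(vid)
    return lists

def ivf_search(query, vectors, centroids, lists, k, nprobe):
    """Scan only the nprobe closest clusters, then rank those candidates exactly."""
    candidates = [vid
                  for c in nearest_centroids(query, centroids, nprobe)
                  for vid in lists[c]]
    candidates.sort(key=lambda vid: math.dist(query, vectors[vid]))
    return candidates[:k]

# Toy 2-d data; a real index would learn the centroids with k-means.
centroids = [(0.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
vectors = [(0.5, 0.2), (1.0, 1.1), (9.5, 10.2), (10.3, 9.8), (0.2, 9.7), (4.9, 4.9)]
lists = build_ivf(vectors, centroids)
result = ivf_search((1.0, 1.0), vectors, centroids, lists, k=2, nprobe=1)
print(result)
```

With nprobe=1 the query touches 3 of the 6 vectors. A true neighbor that landed in a different cluster would be missed, which is why accuracy depends on how the data is distributed.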
Flat (Brute Force)
Compares the query to every stored vector. Slowest, but 100% accurate.
- Pros: exact results, no accuracy compromise
- Cons: slow at scale
- Suitable for: small datasets (less than 100k vectors), or ground truth for evaluation
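A flat index is simple enough to sketch in full. This version uses Euclidean distance and toy 2-d vectors; it is also the kind of exact search you would use to produce ground truth when measuring the recall of an approximate index.

```python
import heapq
import math

def flat_search(query, vectors, k):
    """Exact top-k by Euclidean distance: every stored vector is compared."""
    scored = ((math.dist(query, vec), vid) for vid, vec in enumerate(vectors))
    return [vid for _, vid in heapq.nsmallest(k, scored)]

# Made-up vectors for illustration.
vectors = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0), (0.9, 1.2), (9.0, 9.0)]
top2 = flat_search((1.0, 1.0), vectors, k=2)
print(top2)
```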
Popular Vector DBs in 2026
Pinecone
Type: Managed cloud, fully serverless
Strengths: Easiest to use, auto-scaling, low ops overhead, multi-tenant features. Industry leader for production deployment.
Weaknesses: Cost can be high at scale (especially storage). Vendor lock-in. Less flexible for custom scoring.
Best for: Production apps needing reliability without managing infrastructure.
Weaviate
Type: Open source, has cloud version
Strengths: Hybrid search built-in (vector + BM25), GraphQL API, strong schema support, multi-modal (text, image).
Weaknesses: Memory hungry, GraphQL can be overkill for simple apps.
Best for: Complex search with multiple data types, hybrid retrieval needs.
Qdrant
Type: Open source (Rust), has cloud version
Strengths: Fast (Rust-based), memory efficient, powerful payload filtering (vector search + structured filter), open source with a permissive license (Apache 2.0).
Weaknesses: Younger ecosystem, fewer integrations than Pinecone.
Best for: Self-hosted production needing performance, cost-conscious teams.
Milvus
Type: Open source, managed version available as Zilliz Cloud
Strengths: Designed for billion-scale, distributed architecture, GPU acceleration support.
Weaknesses: Complex for small deployments, steep learning curve.
Best for: Massive scale (over 100M vectors), enterprise requirements.
pgvector (PostgreSQL extension)
Type: Extension for PostgreSQL
Strengths: Use existing DB, no separate infrastructure, ACID transactions, JOIN with structured data, mature PostgreSQL ecosystem.
Weaknesses: Lower performance than a dedicated vector DB at scale, limited indexing options (IVFFlat and HNSW).
Best for: Apps already using PostgreSQL, datasets less than 10M vectors, simple use cases.
ChromaDB
Type: Open source, embeddable
Strengths: Super easy to prototype, run in-process or client-server, Python-first.
Weaknesses: Less production-tested, lower performance than alternatives at scale.
Best for: Prototyping, local development, RAG on laptop.
Decision Matrix: Pick the Right Vector DB
Simple decision framework:
Have fewer than 1M vectors and already use PostgreSQL?
Use pgvector. No need to add new infrastructure.
Prototype or hobby project?
ChromaDB or local Qdrant. Setup in minutes.
Production app, want zero ops, budget not an issue?
Pinecone. Pay for convenience.
Self-hosted production, performance + cost matter?
Qdrant or Weaviate. Open source with decent support.
Massive scale (over 100M vectors), enterprise requirements?
Milvus or Pinecone enterprise tier.
Hybrid search important (semantic + keyword + filter)?
Weaviate or Qdrant. Both have hybrid search built-in.
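The framework above can also be written down as a small function. This is only a sketch: the function name, its parameters, and especially the order in which the questions are checked are assumptions, since the article does not rank them.

```python
def pick_vector_db(num_vectors, uses_postgres=False, prototype=False,
                   zero_ops=False, self_hosted=False, hybrid_search=False):
    """Return candidate tools, checking the article's questions in an assumed order."""
    if prototype:
        return ["ChromaDB", "Qdrant (local)"]
    if num_vectors < 1_000_000 and uses_postgres:
        return ["pgvector"]
    if num_vectors > 100_000_000:
        return ["Milvus", "Pinecone (enterprise tier)"]
    if hybrid_search:
        return ["Weaviate", "Qdrant"]
    if zero_ops:
        return ["Pinecone"]
    if self_hosted:
        return ["Qdrant", "Weaviate"]
    return []   # no rule matched: evaluate case by case

print(pick_vector_db(500_000, uses_postgres=True))
print(pick_vector_db(20_000_000, self_hosted=True))
```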
Practical Performance Considerations
1. Vector Dimensionality
Higher dimensions can capture more semantic detail, but they make search slower and use more memory. The 1536d output of OpenAI text-embedding-3-small works well for most cases. For smaller vectors, use a model like BGE-small (384d), or pass a smaller value for the dimensions parameter of the OpenAI text-embedding-3 models.
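The idea behind requesting fewer dimensions can be sketched as "keep the leading values, then rescale to unit length". This is an illustration under the assumption that the model was trained so its leading values carry most of the signal, which is not true of every embedding model. The function name and sample values are made up.

```python
import math

def shorten_embedding(vec, dims):
    """Keep the first `dims` values and rescale the result to unit length."""
    head = vec[:dims]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return head
    return [x / norm for x in head]

full = [3.0, 4.0, 12.0, 0.0]           # made-up 4-d vector
short = shorten_embedding(full, 2)     # 2-d vector, half the memory
print(short)
```

Rescaling matters because cosine and dot-product comparisons assume vectors of comparable length.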
2. Quantization
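A trick to reduce memory: quantize vectors from float32 to int8 (4x smaller) or to binary (32x smaller). The accuracy loss is usually small for int8 and larger for binary, which is often paired with rescoring against the original vectors. Most modern vector DBs support quantization built-in.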
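A minimal sketch of both quantization schemes, assuming simple symmetric scaling for int8 and a sign test for binary. Real vector DBs use more careful variants, so treat this as an illustration of the principle only.

```python
def quantize_int8(vec):
    """Symmetric scalar quantization: map floats to integers in [-127, 127]."""
    scale = max(abs(x) for x in vec) / 127 or 1.0   # 1.0 guards an all-zero vector
    return [round(x / scale) for x in vec], scale

def dequantize_int8(codes, scale):
    """Approximate reconstruction of the original floats."""
    return [c * scale for c in codes]

def quantize_binary(vec):
    """One bit per dimension: 1 if the value is positive, else 0."""
    return [1 if x > 0 else 0 for x in vec]

vec = [0.012, -0.045, 0.778, 0.234]    # the visible values from the earlier example
codes, scale = quantize_int8(vec)
restored = dequantize_int8(codes, scale)
max_error = max(abs(a - b) for a, b in zip(vec, restored))
bits = quantize_binary(vec)

print(codes, bits, f"max error {max_error:.4f}")
```

Each int8 code takes 1 byte instead of 4, and each binary code takes 1 bit instead of 32, which is where the 4x and 32x savings come from.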
3. Batch Operations
Inserting vectors one at a time = slow. Batch 100-1000 at once = much faster. Same for queries: if you have many parallel queries, batch queries are more efficient.
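A minimal batching helper might look like the sketch below. The upsert_batch argument is a hypothetical stand-in for whatever bulk-insert call your client library provides; here it is replaced by a list's append method so the example runs on its own.

```python
def batched(items, batch_size):
    """Yield consecutive slices of at most batch_size items."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]

def upsert_all(records, upsert_batch, batch_size=500):
    """Send records in batches; upsert_batch is a hypothetical client call."""
    calls = 0
    for batch in batched(records, batch_size):
        upsert_batch(batch)
        calls += 1
    return calls

records = [{"id": i, "vector": [0.0, 0.0]} for i in range(1200)]
received = []                                   # stands in for the database
calls = upsert_all(records, received.append, batch_size=500)
print(calls, [len(b) for b in received])
```

Here 1200 records need 3 round trips instead of 1200.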
4. Pre vs Post Filter
When you need "vector search AND filter by metadata" (e.g., only English documents), some vector DBs apply the filter before or during the search (Qdrant, for example), while others search first and then filter, which is less ideal because it can return fewer than k results. Check the specific implementation.
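The difference is easy to see on toy data. In this sketch (made-up documents, exact search for simplicity), post-filtering returns nothing because the two nearest documents are in the wrong language, while pre-filtering still returns two results.

```python
import math

# Hypothetical documents with a language tag and a 2-d embedding.
docs = [
    {"id": 0, "lang": "id", "vec": (1.0, 1.0)},
    {"id": 1, "lang": "id", "vec": (1.1, 0.9)},
    {"id": 2, "lang": "en", "vec": (3.0, 3.0)},
    {"id": 3, "lang": "id", "vec": (0.9, 1.2)},
    {"id": 4, "lang": "en", "vec": (5.0, 5.0)},
]

def top_k(query, candidates, k):
    """Exact nearest neighbors among the given candidates."""
    return sorted(candidates, key=lambda d: math.dist(query, d["vec"]))[:k]

def post_filter(query, k, lang):
    """Search first, then drop non-matching hits: may return fewer than k."""
    return [d["id"] for d in top_k(query, docs, k) if d["lang"] == lang]

def pre_filter(query, k, lang):
    """Filter first, then search only the matching documents."""
    matching = [d for d in docs if d["lang"] == lang]
    return [d["id"] for d in top_k(query, matching, k)]

query = (1.0, 1.0)
post = post_filter(query, k=2, lang="en")
pre = pre_filter(query, k=2, lang="en")
print(post, pre)
```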
Cost Reality Check
Vector DBs aren't cheap at scale. Rough estimate per 1 million 1536-d vectors:
- Pinecone: $70-150/month depending on tier
- Qdrant Cloud: $30-80/month
- Weaviate Cloud: $50-100/month
- Self-hosted Qdrant on Hetzner VPS: ~$30/month for 1M vectors
Plus embedding cost: OpenAI text-embedding-3-small is $0.02 per 1M tokens. 1M documents averaging 500 tokens each come to 500M tokens, or $10 as a one-time indexing cost. Re-indexing after a model upgrade repeats that cost.
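The embedding cost estimate can be reproduced for your own corpus with a short helper. The price is the figure quoted above; the function name is made up, and you should check current pricing before relying on the result.

```python
PRICE_PER_MILLION_TOKENS = 0.02   # USD, text-embedding-3-small, as quoted above

def embedding_cost_usd(num_documents, avg_tokens_per_doc,
                       price_per_million=PRICE_PER_MILLION_TOKENS):
    """One-time cost of embedding a corpus, in USD."""
    total_tokens = num_documents * avg_tokens_per_doc
    return total_tokens / 1_000_000 * price_per_million

one_time = embedding_cost_usd(1_000_000, 500)
print(f"${one_time:.2f} to index, and the same again for every full re-index")
```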
Closing
Vector databases have become important new infrastructure in the AI era. Unlike relational DBs, which were designed 50 years ago, vector DBs are optimized for similarity search in high-dimensional space.
For most Indonesian developers: if you're building an MVP or internal tool, use pgvector. You already have PostgreSQL, so no extra service is needed. If scale becomes serious (over 1M vectors or a low-latency requirement), evaluate moving to a dedicated vector DB.
What's certain is that knowledge of vector DBs is increasingly required of backend developers in 2026. Even if you don't build AI apps yourself, the tools you use (Notion, Linear, Cursor) likely use a vector DB behind the scenes.