DevOps · Architecture · AWS

Why We Replaced OpenSearch with pgvector (And Saved 80% on Costs)

A frank comparison of three vector database options at production scale — and why the boring choice turned out to be the best one.

Every AI-adjacent project eventually hits the "which vector database" question. In 2024 we hit it three times across different client engagements. Each time the answer surprised us. This is what we learned.

The context

We were building retrieval-augmented systems across three different clients — a logistics TMS, a financial document classifier, and an internal knowledge base for a telco. Each had different scale, latency, and budget constraints. Across all three, we evaluated the same shortlist: Amazon OpenSearch with the k-NN plugin, Pinecone, and pgvector on RDS PostgreSQL.

The results were consistent enough that we now have a default recommendation.

What we measured

We built a standardised benchmark: 2M 1536-dimensional vectors (matching Titan Embeddings v2 output), a query set of 500 real retrieval tasks, and metadata filters applied to roughly 40% of queries. We ran each system for two weeks in a staging environment that mirrored production load patterns.

The metrics we cared about:

  • p50 and p95 query latency
  • Monthly infrastructure cost at steady state
  • Operational complexity (number of moving parts, maintenance burden)
  • Filtered search quality (accuracy when metadata predicates narrow the search space)
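
Percentile math is easy to get subtly wrong, so it is worth pinning down how p50/p95 fall out of raw per-query timings. A minimal sketch using only the standard library (the sample values here are illustrative, not our benchmark data):

```python
import statistics

def p50_p95(latencies_ms):
    """Return (p50, p95) from a list of per-query latencies in ms."""
    # quantiles(n=100) returns the 1st..99th percentile cut points
    q = statistics.quantiles(latencies_ms, n=100)
    return q[49], q[94]

# Illustrative sample: one query per millisecond bucket, 0..99 ms
sample = list(range(100))
p50, p95 = p50_p95(sample)  # p50 → 49.5, p95 → 94.95
```

The same function applied to each system's timings keeps the comparison apples-to-apples.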

The results

Latency

All three systems passed our 200ms p95 budget, but with different margins. pgvector with HNSW indexing landed at 38ms p95. Pinecone was fastest at 22ms p95 — their managed infrastructure is genuinely good. OpenSearch was slowest at 81ms p95, partly because our metadata filters weren't as cleanly supported and required post-filtering rather than pre-filtering.

For the use cases we were building, 38ms vs 22ms is irrelevant. The model inference step dominates at 900ms+.
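
To put numbers on "irrelevant": with roughly 900ms of inference, retrieval is a few percent of end-to-end latency either way. A quick sketch of the arithmetic:

```python
INFERENCE_MS = 900  # approximate model inference time

def retrieval_share(retrieval_ms, inference_ms=INFERENCE_MS):
    """Fraction of end-to-end latency spent on retrieval."""
    return retrieval_ms / (retrieval_ms + inference_ms)

pgvector_share = retrieval_share(38)  # ~4% of end-to-end
pinecone_share = retrieval_share(22)  # ~2.4% of end-to-end
```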

Cost

This is where pgvector won decisively. At 2M vectors:

  • Pinecone (p2 pod): ~$480/month
  • OpenSearch (r6g.large.search, 3 nodes): ~$390/month
  • pgvector (db.r7g.large RDS): ~$85/month

That's not a rounding error. pgvector at our scale was 4.6× cheaper than OpenSearch and 5.6× cheaper than Pinecone. The RDS instance was already costing us money for the relational data — the vector workload added almost nothing marginal.

"The most expensive vector database is the one you spin up separately from your existing database."
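
The arithmetic behind the multiple is worth writing down once, using the monthly figures above:

```python
# Steady-state monthly costs from the benchmark (USD, approximate)
costs = {"pinecone": 480, "opensearch": 390, "pgvector": 85}

# How many times more expensive each managed service is vs pgvector
multiples = {name: round(cost / costs["pgvector"], 1)
             for name, cost in costs.items() if name != "pgvector"}

# Annualised saving from choosing pgvector over Pinecone
annual_savings_vs_pinecone = (costs["pinecone"] - costs["pgvector"]) * 12
```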

Operational complexity

Pinecone is a fully managed service — almost zero operational burden, but it's another vendor, another API key, another service to monitor, and another contract to manage. OpenSearch with k-NN required careful JVM tuning, segment management, and refresh-interval configuration to stay performant; we had two OpenSearch incidents in the two-week window.

pgvector is just a Postgres extension. Our team already operated RDS. Backups, failover, monitoring — all existing tooling applied immediately.

Filtered search

This was the clearest pgvector advantage. Our queries frequently needed to filter by metadata before doing vector similarity — "find the most similar clause, but only within documents from carrier X with contract version > 2023-01".

In pgvector, this is a standard SQL WHERE clause with a vector similarity operator. The query planner handles the interaction between the index and the filter naturally. In Pinecone, metadata filtering exists but with constraints on cardinality. In OpenSearch, combining k-NN with structured filters required careful configuration and we hit accuracy degradation at some filter selectivities.
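
To illustrate the shape of that query: the SQL in the comment below is hypothetical (table and column names are invented for illustration, `<=>` is pgvector's cosine-distance operator), and the pure-Python function mirrors its semantics — apply the metadata predicate first, then rank the survivors by distance:

```python
import math

# Hypothetical pgvector query this sketch mirrors (illustrative schema):
#   SELECT id FROM clauses
#   WHERE carrier = 'X' AND contract_version > '2023-01'
#   ORDER BY embedding <=> %(query_vec)s
#   LIMIT 5;

def cosine_distance(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return 1.0 - dot / (na * nb)

def filtered_search(rows, query_vec, predicate, k=5):
    """Pre-filter on metadata, then rank the survivors by similarity."""
    candidates = [r for r in rows if predicate(r)]
    candidates.sort(key=lambda r: cosine_distance(r["embedding"], query_vec))
    return [r["id"] for r in candidates[:k]]

rows = [
    {"id": 1, "carrier": "X", "version": "2023-06", "embedding": [1.0, 0.0]},
    {"id": 2, "carrier": "Y", "version": "2023-06", "embedding": [1.0, 0.0]},
    {"id": 3, "carrier": "X", "version": "2022-01", "embedding": [1.0, 0.0]},
    {"id": 4, "carrier": "X", "version": "2023-09", "embedding": [0.0, 1.0]},
]
hits = filtered_search(
    rows, [1.0, 0.0],
    lambda r: r["carrier"] == "X" and r["version"] > "2023-01",
)
# hits → [1, 4]: row 2 fails the carrier filter, row 3 the version filter
```

Post-filtering inverts the order: take the top-k nearest neighbours first, then apply the predicate, which can silently return fewer than k results when the filter is selective; that is the pattern where we saw OpenSearch's accuracy degrade.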

When we'd still choose something else

pgvector is not always the right answer.

Choose Pinecone if: you have no existing Postgres infrastructure, you're at >50M vectors, or your team has no database operations capability. The managed experience is genuinely excellent.

Choose OpenSearch if: you're already operating an OpenSearch cluster for search/logging workloads and want to co-locate the vector index. Avoid spinning it up fresh just for vectors.

Consider a dedicated vector DB if: your access patterns are purely vector similarity with no structured metadata, you're at very high query rates (>10k/s), or you need multi-tenancy with strong isolation guarantees.

Our default recommendation

For most enterprise RAG systems in the 1M–20M vector range: start with pgvector. You almost certainly already pay for a managed Postgres instance. The HNSW index is solid. The SQL integration for metadata filtering is unmatched. And when you need to migrate to a dedicated vector DB, you'll have real production query patterns to guide the decision — rather than speculating up front.

Five things to take away

  1. At 2M vectors, pgvector was 4.6–5.6× cheaper than OpenSearch or Pinecone, with comparable latency.
  2. SQL metadata filtering in pgvector was more flexible than the filter APIs of the dedicated services we tested.
  3. Model inference dominates latency budgets — 20ms vs 40ms retrieval rarely matters.
  4. Don't add a new managed service when an extension on your existing database does the job.
  5. Benchmark on your actual data and query patterns before committing — generalised benchmarks rarely reflect your specific filter selectivity.