pgvector
What is pgvector?
Pgvector is an open-source Postgres extension for developers who want vector similarity search inside an existing database. It supports exact and approximate nearest-neighbor search, multiple vector types, and distance functions including L2, inner product, and cosine distance, plus quantization for tighter storage and faster retrieval. Teams use it with GitHub APIs and webhooks, and customers include Shopify and Spotify. Plans run Free at $0 USD per user/month, Team at $4 USD per user/month, and Enterprise at $21 USD per user/month.
At a glance
- Pgvector is best for developers who want vector search inside Postgres without adding a separate database.
- Pricing: Free $0 USD per user/mo; Team $4 USD per user/mo; Enterprise $21 USD per user/mo
- Free trial: 30 days, no credit card
- API: Yes. GitHub advertises APIs and webhooks for getting data and events, and for automating workflows within GitHub.
What does pgvector do?
Pgvector adds vector similarity search to Postgres so teams can keep embeddings and relational data in one database. It supports exact and approximate nearest-neighbor search, multiple vector types, and distance functions such as L2, inner product, cosine, L1, Hamming, and Jaccard, plus quantization for tighter storage and faster retrieval. That lets developers build semantic search, recommendation, and retrieval workflows without moving data into a separate vector store. On GitHub, the repository shows 21.3k stars, 1.2k forks, 128 branches, and 38 tags, which suggests an actively used and maintained project. It installs through Docker and common package managers and fits API-driven, automation-heavy workflows around GitHub, while self-hosting remains available for teams that want to run it on their own infrastructure.
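To make the distance functions named above concrete, here is a minimal pure-Python sketch of each one. It illustrates the math only and is not pgvector's implementation; the function names are hypothetical. One detail worth knowing: pgvector's inner-product operator returns the negative inner product, so that smaller values still mean closer matches.

```python
import math


def l2_distance(a, b):
    """Euclidean (L2) distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def inner_product(a, b):
    """Plain dot product; larger means more similar."""
    return sum(x * y for x, y in zip(a, b))


def cosine_distance(a, b):
    """1 - cosine similarity; ignores vector magnitude."""
    norm_a = math.sqrt(inner_product(a, a))
    norm_b = math.sqrt(inner_product(b, b))
    return 1.0 - inner_product(a, b) / (norm_a * norm_b)


def l1_distance(a, b):
    """Taxicab (L1) distance: sum of absolute differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def hamming_distance(a, b):
    """Number of positions where two bit sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def jaccard_distance(a, b):
    """1 - |intersection| / |union| over the set bits.

    Returning 0.0 when both inputs are all zeros is a choice made
    for this sketch, not a statement about pgvector's behavior.
    """
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return 1.0 - both / either if either else 0.0


if __name__ == "__main__":
    a, b = [1.0, 2.0, 3.0], [4.0, 6.0, 3.0]
    print("L2:", l2_distance(a, b))
    print("inner product:", inner_product(a, b))
    print("L1:", l1_distance(a, b))
    print("Hamming:", hamming_distance([1, 0, 1, 1], [1, 1, 0, 1]))
    print("Jaccard:", jaccard_distance([1, 0, 1, 1], [1, 1, 0, 1]))
```

L2, inner product, cosine, and L1 apply to numeric vectors, while Hamming and Jaccard apply to bit vectors.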
Why use pgvector?
- Keeping vectors in Postgres reduces system sprawl when teams already rely on the database for transactional data.
- Exact and approximate search give teams a path from correctness-first prototypes to faster production retrieval.
- Support for multiple vector types and distance metrics lets teams match storage and scoring to the workload.
- Quantization can help reduce memory and storage pressure when vector collections grow.
- Open-source licensing and self-hosting make it easier to fit strict infrastructure or control requirements.
Who is pgvector for?
- Backend engineers who want semantic search inside an existing Postgres stack.
- Data platform teams who need vector retrieval alongside relational queries.
- Product engineers building recommendation or similarity features from embeddings.
- Infrastructure teams that prefer self-hosted software with database-native operations.
What are pgvector's key features?
Single-precision, half-precision, binary, and sparse vectors
Store single-precision, half-precision, binary, and sparse vectors in Postgres, letting you match embedding format to memory and performance needs.
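As a rough illustration of why the choice of vector type matters, the sketch below packs the same embedding four ways using only the Python standard library. It shows raw payload sizes and the precision lost at half precision; it does not reproduce pgvector's on-disk layout, and the helper names are hypothetical.

```python
import struct


def pack_single(values):
    """4 bytes per dimension (IEEE 754 single precision)."""
    return struct.pack(f"<{len(values)}f", *values)


def pack_half(values):
    """2 bytes per dimension (IEEE 754 half precision)."""
    return struct.pack(f"<{len(values)}e", *values)


def pack_bits(values):
    """1 bit per dimension: the bit is set where the value is positive."""
    out = bytearray((len(values) + 7) // 8)
    for i, v in enumerate(values):
        if v > 0:
            out[i // 8] |= 1 << (7 - i % 8)
    return bytes(out)


def to_sparse(values):
    """Keep only non-zero entries as {index: value}."""
    return {i: v for i, v in enumerate(values) if v != 0.0}


def half_round_trip(value):
    """Encode to half precision and decode again to expose rounding."""
    return struct.unpack("<e", struct.pack("<e", value))[0]


if __name__ == "__main__":
    embedding = [0.1, -0.2, 0.3, 0.4, -0.5, -0.6, -0.7, 0.8]
    print("single:", len(pack_single(embedding)), "bytes")
    print("half:  ", len(pack_half(embedding)), "bytes")
    print("binary:", len(pack_bits(embedding)), "bytes")
    print("sparse:", to_sparse([0.0, 1.5, 0.0, 0.0, 2.0]))
    print("0.1 at half precision:", half_round_trip(0.1))
```

Half precision halves the payload at the cost of rounding, binary keeps only one bit per dimension, and a sparse form pays off when most entries are zero.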
L2 distance
Rank vector similarity with L2 distance in Postgres, which is useful for embedding search where Euclidean closeness matters.
Cosine distance
Compare embeddings with cosine distance in Postgres, helping teams search by directional similarity instead of raw magnitude.
Quantization
Apply quantization to reduce vector storage and speed up search, which matters when indexing large embedding sets in Postgres.
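One common form of quantization is binary quantization: keep one sign bit per dimension, shortlist candidates by Hamming distance on those bits, then re-rank the shortlist with the full-precision vectors. The sketch below shows that idea in pure Python under those assumptions. The function names and the `candidates` parameter are hypothetical, and this is not pgvector's internal code.

```python
import math


def binary_quantize(vec):
    """One bit per dimension: 1 where the value is positive, else 0."""
    return tuple(1 if x > 0 else 0 for x in vec)


def hamming(a, b):
    """Number of differing bits between two codes."""
    return sum(x != y for x, y in zip(a, b))


def l2(a, b):
    """Exact Euclidean distance on the original vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def quantized_search(query, vectors, k, candidates):
    """Return indices of the k best matches.

    Step 1: cheap shortlist of `candidates` items by Hamming distance
            on the quantized codes.
    Step 2: re-rank only the shortlist by exact L2 distance.
    """
    q_bits = binary_quantize(query)
    codes = [binary_quantize(v) for v in vectors]
    shortlist = sorted(
        range(len(vectors)), key=lambda i: hamming(q_bits, codes[i])
    )[:candidates]
    return sorted(shortlist, key=lambda i: l2(query, vectors[i]))[:k]


if __name__ == "__main__":
    vectors = [
        [1.0, 1.0, -1.0],
        [0.9, 1.2, -0.8],
        [-1.0, -1.0, 1.0],
        [1.0, -1.0, 1.0],
    ]
    print(quantized_search([1.0, 1.1, -0.9], vectors, k=2, candidates=3))
```

A larger `candidates` value trades speed for recall, because more items reach the exact re-ranking step.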
APIs
Automate data and event handling through GitHub APIs and webhooks, so teams can connect pgvector workflows to external systems and scripts.
What does pgvector integrate with?
- Docker
- Homebrew
- PGXN
- APT
- Yum
- pkg
- APK
- conda-forge
- Postgres.app
- GitHub Actions
- Okta
- Entra ID
- GitHub Copilot
- GitHub Enterprise Cloud
- GitHub Issues
- Codespaces
- Dependabot
- Render
- LovableBot
- OpenCode
- Linear Code
- CodeRabbit
- Linear
- Azure Pipelines
- Codecov
- CircleCI
- Rewind Backups for GitHub
- CodeFactor
- Imgbot
- Zenhub
What are pgvector's use cases?
Semantic search in Postgres
Backend engineers who want semantic search inside an existing Postgres stack use pgvector to store embeddings and run exact and approximate nearest neighbor search without adding a separate vector database. They can compare results with cosine distance or inner product to return relevant matches fast.
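Exact nearest-neighbor search amounts to scoring every stored embedding against the query and keeping the best k. The sketch below does that in memory with cosine distance. The row shape `(id, embedding)` and the function names are assumptions made for illustration; in Postgres the equivalent work is done by an ordered query rather than application code.

```python
import heapq
import math


def cosine_distance(a, b):
    """1 - cosine similarity; 0.0 means the vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def exact_nearest(query, rows, k):
    """Brute-force scan: return the ids of the k closest rows.

    rows is a list of (id, embedding) pairs. Every row is scored,
    so the result is exact but the cost grows with the table size.
    """
    best = heapq.nsmallest(
        k, rows, key=lambda row: cosine_distance(query, row[1])
    )
    return [row_id for row_id, _ in best]


if __name__ == "__main__":
    rows = [
        ("a", [1.0, 0.0]),
        ("b", [0.0, 1.0]),
        ("c", [10.0, 1.0]),
    ]
    # "c" is much longer than the query but points in nearly the same
    # direction, so cosine distance ranks it ahead of "b".
    print(exact_nearest([2.0, 0.0], rows, k=2))
```

Approximate indexes avoid the full scan by visiting only part of the data, which is faster but can miss some true neighbors.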
Similarity features for products
Product engineers building recommendation or similarity features from embeddings use pgvector to rank items directly in Postgres, using single-precision, half-precision, binary, and sparse vectors to fit different model outputs. They tune retrieval with L2 distance and quantization to balance relevance and latency.
Vector retrieval for data platforms
Data platform teams who need vector retrieval alongside relational queries use pgvector to keep embeddings in the same database as their business tables. With exact and approximate nearest neighbor search plus APIs, they can join semantic results to transactional data in one query path.
Self-hosted vector search operations
Infrastructure teams that prefer self-hosted software with database-native operations use pgvector to keep vector search inside Postgres they already run. They rely on exact and approximate nearest neighbor search and cosine distance to deliver retrieval features without introducing a new managed service.
How does pgvector work?
- Install pgvector in Postgres 13+ using your preferred package path, such as Docker, Homebrew, APT, or PGXN, then enable the extension in the target database.
- Create a vector column and choose the right representation from single-precision, half-precision, binary, or sparse vectors to match your embedding model and storage needs.
- Load embeddings, then query them with exact and approximate nearest neighbor search using L2 distance, inner product, cosine distance, L1 distance, Hamming distance, or Jaccard distance.
- Apply quantization where needed to reduce memory and speed retrieval, then expose the search through your application APIs or database queries.
- Operationalize the setup with GitHub Actions, APIs, and Webhooks so indexing, refreshes, and retrieval logic stay in sync as your data changes.
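The first three steps above can be sketched as the SQL an application might send. The helper below only builds statement strings so it runs without a database. The table name `items`, the column name `embedding`, and the helper names are hypothetical, and the `%s` placeholder assumes a driver that uses that parameter style. The extension name `vector`, the `vector(n)` column type, the `hnsw` index method, and the operators and operator classes shown are pgvector's own.

```python
# Distance name -> (query operator, index operator class).
OPS = {
    "l2": ("<->", "vector_l2_ops"),
    "inner_product": ("<#>", "vector_ip_ops"),
    "cosine": ("<=>", "vector_cosine_ops"),
    "l1": ("<+>", "vector_l1_ops"),
}


def setup_statements(table, dims, distance):
    """SQL to enable the extension, create a table, and add an HNSW index.

    Identifiers are interpolated directly, so pass only trusted names.
    """
    _, opclass = OPS[distance]
    return [
        "CREATE EXTENSION IF NOT EXISTS vector;",
        f"CREATE TABLE {table} (id bigserial PRIMARY KEY, "
        f"embedding vector({dims}));",
        f"CREATE INDEX ON {table} USING hnsw (embedding {opclass});",
    ]


def to_vector_literal(values):
    """Format a Python sequence as a pgvector text literal like [1.0,2.0]."""
    return "[" + ",".join(str(float(v)) for v in values) + "]"


def nearest_query(table, distance, limit):
    """Parameterized k-nearest-neighbor query for the chosen distance."""
    operator, _ = OPS[distance]
    return (
        f"SELECT id FROM {table} "
        f"ORDER BY embedding {operator} %s::vector LIMIT {int(limit)};"
    )


if __name__ == "__main__":
    for statement in setup_statements("items", 3, "l2"):
        print(statement)
    print(nearest_query("items", "l2", 5))
    print(to_vector_literal([3, 1, 2]))
```

Matching the index operator class to the query operator matters, because an index built for one distance is not used for queries that order by another.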
How much does pgvector cost?
Free
$0 USD per user/month
- Unlimited public/private repositories
- Dependabot security and version updates
- 2,000 CI/CD minutes/month
- 500MB of Packages storage
- Issues & Projects
- Community support
Team
$4 USD per user/month
- Everything included in Free, plus:
- Access to GitHub Codespaces
- Repository rules
- Draft pull requests
- Code owners
- Required reviewers
- Pages and Wikis
- Environment deployment branches and secrets
- 3,000 CI/CD minutes/month
- 2GB of Packages storage
- Web-based support
Enterprise
$21 USD per user/month
- Everything included in Team, plus:
- Data residency
- Enterprise Managed Users
- User provisioning through SCIM
- Enterprise Account to centrally manage multiple organizations
- Environment protection rules
- Repository rules
- Audit Log API
- SOC1, SOC2, type 2 reports annually
- FedRAMP Tailored Authority to Operate (ATO)
- SAML single sign-on
- Auditing
- GitHub Connect
- 50,000 CI/CD minutes/month
- 50GB of Packages storage
Frequently asked questions
What is pgvector?
Pgvector is an open-source Postgres extension for developers who want vector similarity search inside an existing database. It supports exact and approximate nearest-neighbor search, multiple vector types, and distance functions including L2, inner product, and cosine distance, plus quantization for tighter storage and faster retrieval. Teams use it with GitHub APIs and webhooks, and customers include Shopify and Spotify. Plans run Free at $0 USD per user/month, Team at $4 USD per user/month, and Enterprise at $21 USD per user/month.
How much does pgvector cost? Is it free?
The pgvector extension itself is open source and can be self-hosted. The listed plans include a free plan, with paid tiers of Team at $4 USD per user/month and Enterprise at $21 USD per user/month. A 30-day free trial is available.
What is pgvector used for? Who is it for?
Pgvector is used for exact and approximate nearest-neighbor search over single-precision, half-precision, binary, and sparse vectors, ranked by metrics such as L2 distance. It's built for backend engineers, data platform teams, and product engineers building recommendation or similarity features from embeddings.
Does pgvector have an API and what does it integrate with?
GitHub advertises APIs and webhooks for getting data and events, and for automating workflows within GitHub. It integrates with Docker, Homebrew, PGXN, APT, Yum, and 25 more.
Editor's read
Check whether your embedding workload needs the Enterprise tier's data residency, SCIM provisioning, or SAML single sign-on. Those controls are only listed on Enterprise, so teams with compliance or identity requirements should verify the upgrade path before standardizing on lower tiers.
