Chroma

What is Chroma?

Chroma is AI search infrastructure for developers who need retrieval that can start locally and scale to cloud. It combines sparse vector search, lexical search, full-text search, metadata filtering, and trigram and regex search, with client libraries and a CLI for local development. The same collections move into Chroma Cloud, and customers include Capital One, Mintlify, and UnitedHealthcare. Plans run Starter $0/month, Team $250/month, and Enterprise custom.

At a glance

Best for
Chroma is best for developers who need fast, scalable search infrastructure for AI apps.
Pricing
Starter $0/mo; Team $250/mo; Enterprise Custom
API
Yes. Chroma provides developer docs and client libraries for Python and JavaScript/TypeScript, with search and collection APIs.

What does Chroma do?

Chroma handles AI search by combining sparse vector search, lexical search, and metadata filtering on infrastructure built for object storage. Developers can start locally with the CLI and client libraries, then move the same collections into Chroma Cloud for serverless operation. The system supports full-text, trigram, and regex search, plus forking for dataset versioning, A/B testing, and roll-outs, so teams can iterate on retrieval without rebuilding their stack. At scale, Chroma reports fast queries over billions of multi-tenant indexes, with p50 latency of 20ms, p90 of 27ms, and p99 of 57ms on warm queries. It also cites 30 MB/s write throughput per collection, 2000+ QPS write throughput per collection, 1M collections per database, and 5M records per collection. The cloud is SOC 2 Type II assessed, and the enterprise offering adds single-tenant clusters, BYOC clusters, and SLAs. Customers shown on the site include Capital One, Mintlify, UnitedHealthcare, Propel, Weights & Biases, and Factory.

Why use Chroma?

  • It combines sparse, lexical, vector, and metadata search so teams can keep retrieval in one system instead of stitching tools together.
  • Its object-storage architecture and automatic data tiering reduce infrastructure work while keeping search fast at scale.
  • The same code path works locally and in Chroma Cloud, which shortens experimentation before production deployment.
  • Enterprise options include single-tenant clusters, BYOC, and SLAs for teams with stricter operational requirements.
  • Public performance numbers show warm queries in the tens of milliseconds and high write throughput per collection.

Who is Chroma for?

  • AI application developers who need retrieval infrastructure that can start locally and scale to cloud.
  • Platform teams who want serverless search with low operational overhead.
  • Data and ML engineers who need sparse, vector, and metadata search in one system.
  • Enterprise teams who need deployment and security options for production AI workloads.
  • Product teams who want to test retrieval changes through collection forking and roll-outs.

What are Chroma's key features?

Sparse vector search

Search with sparse vectors for semantic retrieval across large indexes, including billions of multi-tenant indexes, while keeping query latency low.

Lexical search

Run lexical retrieval with BM25 and SPLADE to match exact terms and ranked text relevance, which helps teams tune search quality for production apps.
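
To make the BM25 part concrete, here is a minimal sketch of Okapi BM25 scoring in plain Python. It is not Chroma's implementation or API: the function name `bm25_scores`, the whitespace tokenizer, and the defaults `k1=1.5` and `b=0.75` are assumptions chosen for illustration.

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each document against the query with the Okapi BM25 formula."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized) / n_docs
    # Document frequency: how many documents contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            # Rare terms get a higher inverse document frequency.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term frequency is dampened and normalized by document length.
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores

# Hypothetical corpus, for illustration only.
docs = [
    "chroma stores embeddings for retrieval",
    "regex search matches exact substrings",
    "bm25 ranks documents by term relevance",
]
scores = bm25_scores("bm25 relevance", docs)
best = max(range(len(docs)), key=scores.__getitem__)
print(docs[best])
```

Documents that contain none of the query terms score zero, which is why lexical ranking is useful for exact-term matching.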

Vector search

Store and query embeddings through the collection API for similarity search, with support for 5M records per collection and 90-100% recall.
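
The recall figure is easier to read with a small example. The sketch below, in plain Python and independent of Chroma's client libraries, ranks vectors by cosine similarity and then measures recall as the share of the exact top-k results that an approximate search also returned. The helper names and the tiny two-dimensional vectors are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query, vectors, k):
    """Exact (brute-force) nearest neighbours by cosine similarity."""
    ranked = sorted(vectors.items(),
                    key=lambda item: cosine(query, item[1]),
                    reverse=True)
    return [doc_id for doc_id, _ in ranked[:k]]

def recall_at_k(approx_ids, exact_ids):
    """Fraction of the exact top-k that the approximate result recovered."""
    return len(set(approx_ids) & set(exact_ids)) / len(exact_ids)

# Hypothetical embeddings keyed by record id.
vectors = {"a": [1.0, 0.0], "b": [0.9, 0.1], "c": [0.0, 1.0], "d": [-1.0, 0.0]}
exact = top_k([1.0, 0.0], vectors, k=2)
# Pretend an approximate index returned one wrong neighbour.
approx = ["a", "c"]
print(exact, recall_at_k(approx, exact))
```

A recall of 1.0 means the approximate index returned exactly what a brute-force scan would have.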

Full-text search

Index and query full text alongside metadata, giving teams one search layer for document lookup, filtering, and retrieval workflows.

Trigram and regex search

Use trigram and regex matching for pattern-heavy queries, which is useful when exact substrings or structured text rules matter.
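
As a rough illustration of why trigrams help with substring queries, the sketch below builds a toy trigram index in plain Python, uses it to shortlist candidate documents, and then verifies each candidate. A regex scan is shown alongside. This shows the general technique only; the function names and sample documents are assumptions, and it does not reflect how Chroma stores or queries its indexes.

```python
import re
from collections import defaultdict

def trigrams(text):
    """All overlapping three-character slices of the lowercased text."""
    text = text.lower()
    return {text[i:i + 3] for i in range(len(text) - 2)}

def build_index(docs):
    """Map each trigram to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for gram in trigrams(text):
            index[gram].add(doc_id)
    return index

def substring_search(needle, docs, index):
    """Shortlist via trigram intersection, then confirm the real substring."""
    grams = trigrams(needle)
    if not grams:
        candidates = set(docs)  # needle too short for trigrams: scan all
    else:
        candidates = set.intersection(*(index.get(g, set()) for g in grams))
    return sorted(d for d in candidates if needle.lower() in docs[d].lower())

def regex_search(pattern, docs):
    """Plain regex scan over every document, case-insensitive."""
    compiled = re.compile(pattern, re.IGNORECASE)
    return sorted(d for d, text in docs.items() if compiled.search(text))

# Hypothetical log-like documents.
docs = {
    "d1": "ERROR-4012 timeout in shard",
    "d2": "warning: retrying request",
    "d3": "error-4012 resolved",
}
index = build_index(docs)
print(substring_search("error-4012", docs, index))
print(regex_search(r"error-\d{4}", docs))
```

The index only narrows the candidate set; the final substring check is what guarantees correct matches.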

Metadata search

Filter collections by metadata fields through the search and collection APIs, so applications can narrow results without extra database joins.
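
Metadata filtering can be pictured as evaluating a small filter expression against each record's metadata. The sketch below is a toy evaluator in plain Python. The operator names (`$eq`, `$gte`, `$and`, and so on) follow the MongoDB-style convention common to vector stores and are used here as an assumption, so check Chroma's documentation for the exact filter syntax it accepts.

```python
import operator

# Comparison operators supported by this toy evaluator.
OPS = {
    "$eq": operator.eq,
    "$ne": operator.ne,
    "$gt": operator.gt,
    "$gte": operator.ge,
    "$lt": operator.lt,
    "$lte": operator.le,
    "$in": lambda value, options: value in options,
}

def matches(metadata, where):
    """Return True if the metadata dict satisfies the filter expression."""
    for key, cond in where.items():
        if key == "$and":
            if not all(matches(metadata, clause) for clause in cond):
                return False
        elif key == "$or":
            if not any(matches(metadata, clause) for clause in cond):
                return False
        else:
            if not isinstance(cond, dict):
                cond = {"$eq": cond}  # bare value is shorthand for equality
            if key not in metadata:
                return False
            value = metadata[key]
            if not all(OPS[op](value, arg) for op, arg in cond.items()):
                return False
    return True

# Hypothetical records and filter.
records = [
    {"id": "a", "metadata": {"team": "search", "year": 2024}},
    {"id": "b", "metadata": {"team": "search", "year": 2022}},
    {"id": "c", "metadata": {"team": "billing", "year": 2024}},
]
where = {"$and": [{"team": "search"}, {"year": {"$gte": 2023}}]}
hits = [r["id"] for r in records if matches(r["metadata"], where)]
print(hits)
```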

Low latency search

Serve queries with p50 20ms, p90 27ms, and p99 57ms latency, helping user-facing AI apps stay responsive under load.
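
For readers less familiar with percentile notation, p50, p90, and p99 are the latencies that 50%, 90%, and 99% of queries come in at or under. A minimal sketch using the nearest-rank method, with made-up sample values that are not Chroma measurements:

```python
import math

def percentile(samples, pct):
    """Nearest-rank percentile: smallest value with at least pct% at or below it."""
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]

# Hypothetical query latencies in milliseconds.
latencies_ms = [12, 15, 15, 16, 18, 19, 21, 22, 30, 48]

p50 = percentile(latencies_ms, 50)
p90 = percentile(latencies_ms, 90)
p99 = percentile(latencies_ms, 99)
print(p50, p90, p99)
```

The gap between p50 and p99 is the thing to watch: a low median with a high p99 means a small share of users still see slow responses.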

Zero-ops infra

Run Chroma as managed infrastructure with auto-scaling, no manual tuning, and serverless pricing, reducing operational work for production teams.

What does Chroma integrate with?

  • Slack
  • GitHub
  • AWS PrivateLink
  • S3
  • GCS

What are Chroma's use cases?

AI app retrieval stack

AI application developers use Chroma to power retrieval for chatbots and RAG apps, combining vector search with metadata filtering to pull the right context quickly. They can start locally, then move to production without reworking the retrieval layer, keeping latency low as usage grows.

Serverless search for platforms

Platform teams use Chroma to offer search as a managed building block, relying on its zero-ops infrastructure and usage-based auto-scaling to avoid cluster tuning and capacity planning. That lets them ship search-backed features quickly while keeping operational overhead and maintenance work low.

Hybrid search for data teams

Data and ML engineers use Chroma to combine sparse vector search, lexical search, and full-text search in one system when they need precise retrieval across messy corpora. They can also use trigram and regex search to handle exact-match edge cases without stitching together separate tools.
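
One common way to combine ranked lists from different search modes is reciprocal rank fusion (RRF). The source does not say which fusion method Chroma uses, so the sketch below is a generic illustration of the idea rather than a description of Chroma's behavior. The function name, the constant `k=60`, and the sample rankings are assumptions.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked id lists; each hit contributes 1 / (k + rank)."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))

# Hypothetical results from two search modes for the same query.
lexical_hits = ["d3", "d1", "d2"]
vector_hits = ["d1", "d2", "d4"]

fused = reciprocal_rank_fusion([lexical_hits, vector_hits])
print(fused)
```

Documents that appear in both lists rise to the top even if neither mode ranked them first, which is the main appeal of rank-based fusion over mixing raw scores.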

Retrieval experiments for product teams

Product teams use Chroma to test retrieval changes safely with collection forking, comparing new collection versions before rollout. That makes it easier to validate search quality improvements and ship updates with less risk to production relevance.

How does Chroma work?

  1. Connect your first data source or collection through the API, then load documents and embeddings into Chroma using the Python or JavaScript/TypeScript client libraries.
  2. Choose the search modes you need, such as vector, sparse vector, lexical, or full-text search, and add metadata filters for tighter retrieval.
  3. Run queries against your collection through the search and collection APIs and inspect the results; low query latency keeps responses fast in both development and production.
  4. Use forking to branch a collection, compare retrieval changes, and promote the version that performs best before rollout.
  5. Move to production on Chroma Cloud's zero-ops infrastructure, which auto-scales with usage, and add deployment options and security features such as Private Networking or Customer-Managed Encryption Keys as needed.

How much does Chroma cost?

Starter

$0/month
  • $0 + usage
  • 10 databases
  • 10 team members
  • Community Slack
  • Get up and running quickly. Free credits then usage-based pricing.

Team

$250/month
  • $250 + usage
  • 100 databases
  • 30 team members
  • Slack support
  • SOC 2
  • Volume-based discounts
  • Scale your production use cases. $100 credits then usage-based pricing.

Enterprise

Custom
  • Custom pricing
  • Unlimited databases
  • Dedicated support
  • Single tenant clusters
  • BYOC clusters
  • SLAs

Frequently asked questions

How much does Chroma cost? Is it free?

Yes, Chroma has a free plan: Starter costs $0/month plus usage. Paid tiers are Team at $250/month plus usage and Enterprise with custom pricing.

What is Chroma used for? Who is it for?

Chroma is used for sparse vector search, lexical search, and vector search. It's built for AI application developers, platform teams, and data and ML engineers.

Does Chroma have an API and what does it integrate with?

Chroma provides developer docs and client libraries for Python and JavaScript/TypeScript, with search and collection APIs. It integrates with Slack, GitHub, AWS PrivateLink, S3, and GCS.

Editor's read

Check the database and team-member ceilings on Starter and Team before committing. Starter includes 10 databases and 10 team members, while Team raises that to 100 databases and 30 team members; larger deployments move to Enterprise.
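
The ceiling check above can be written down as a small helper. This is a sketch only: the limits and base prices are the ones listed on this page, while the `PLANS` table and the `smallest_plan` function are hypothetical names, and usage-based charges are not modeled.

```python
# Plan limits as listed on this page; usage charges are extra and not modeled.
PLANS = [
    {"name": "Starter", "base_usd": 0, "max_databases": 10, "max_members": 10},
    {"name": "Team", "base_usd": 250, "max_databases": 100, "max_members": 30},
]

def smallest_plan(databases, members):
    """Return the first listed plan whose ceilings cover the given needs."""
    for plan in PLANS:
        fits_databases = databases <= plan["max_databases"]
        fits_members = members <= plan["max_members"]
        if fits_databases and fits_members:
            return plan["name"]
    # Anything above the Team ceilings falls through to Enterprise.
    return "Enterprise"

print(smallest_plan(8, 10))
print(smallest_plan(12, 5))
print(smallest_plan(50, 31))
```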
