Zep

What is Zep?

Zep is a context infrastructure platform for AI teams that assembles memory, business data, and user behavior into reusable context blocks for agents. It includes Agent Memory, Graph RAG, Intelligent Context Assembly, and Custom Entity and Edge Filtering, and integrates with LangChain, LangGraph, OpenAI, Anthropic, Azure OpenAI, and Google Gemini. There are three plans: Flex at $125/month, Flex Plus at $375/month, and Enterprise with custom pricing.

Last verified: May 17, 2026


At a glance

Best for: AI teams that need persistent, low-latency context for agents.
Pricing: Flex $125/mo; Flex Plus $375/mo; Enterprise custom
API: Yes. The page advertises a Graph Search API and context template creation via the client, with API documentation linked.

What does Zep do?

Zep assembles context from chat history, business data, and user behavior so agents can respond with the right information at the right time. Its context assembly flow pulls from memory and business records, then formats the result into reusable context blocks using Dynamic Relevance Ranking, Memory Context Blocks, and Context Templates. For teams that need more control, the Graph Search API and Custom Entity and Edge Filtering determine what gets surfaced without hand-crafted prompts.

At scale, Zep is built for low-latency retrieval over changing data: the site cites P95 retrieval latency under 200 ms and real-time incremental updates instead of batch recomputation. It maintains a unified context graph for evolving facts. For the open-source Graphiti layer, the site reports accuracy improvements of more than 100%, a 90% latency reduction, and 98% fewer tokens required for processing; the baselines for these figures are not stated here. Customers shown on the site include Twin Health, Praktika.ai, Writer, Samsung, and HoneyBook.

Why use Zep?

It replaces manual prompt crafting with automated context assembly, reducing the work needed to keep agent inputs relevant.
Its temporal graph approach keeps historical context and current facts together, which helps agents handle changing user and business states.
Low-latency retrieval under 200ms supports voice and other latency-sensitive experiences.
Enterprise deployment options include managed, BYOK, BYOM, and BYOC, giving teams control over security and infrastructure.
The platform supports real-time incremental updates, so changing records do not require batch recomputation.
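
The temporal-graph and incremental-update points above can be illustrated with a small sketch. It is not Zep's or Graphiti's implementation; `Fact`, `FactStore`, `valid_from`, and `invalid_at` are hypothetical names chosen for illustration. The idea shown is that a new fact closes out the one it supersedes instead of deleting it, so both the history and the current state stay queryable, and each update touches only the affected facts.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Fact:
    subject: str
    predicate: str
    value: str
    valid_from: datetime
    invalid_at: Optional[datetime] = None  # None means the fact is still current


class FactStore:
    """Hypothetical in-memory store; illustrates the idea only."""

    def __init__(self) -> None:
        self.facts: list[Fact] = []

    def upsert(self, subject: str, predicate: str, value: str, at: datetime) -> None:
        # Incremental update: close the current fact for this key, then append.
        for fact in self.facts:
            same_key = (fact.subject, fact.predicate) == (subject, predicate)
            if same_key and fact.invalid_at is None:
                fact.invalid_at = at
        self.facts.append(Fact(subject, predicate, value, valid_from=at))

    def current(self, subject: str, predicate: str) -> Optional[str]:
        # Return the value of the fact that has not been superseded.
        for fact in self.facts:
            same_key = (fact.subject, fact.predicate) == (subject, predicate)
            if same_key and fact.invalid_at is None:
                return fact.value
        return None

    def as_of(self, subject: str, predicate: str, when: datetime) -> Optional[str]:
        # Return the value that was valid at the given moment, if any.
        for fact in self.facts:
            if (fact.subject, fact.predicate) != (subject, predicate):
                continue
            still_valid = fact.invalid_at is None or when < fact.invalid_at
            if fact.valid_from <= when and still_valid:
                return fact.value
        return None


store = FactStore()
store.upsert("user-1", "plan", "Flex", datetime(2026, 1, 1))
store.upsert("user-1", "plan", "Flex Plus", datetime(2026, 3, 1))
```

After the two updates, `store.current("user-1", "plan")` returns the newer value, while `store.as_of(...)` with a February date still returns the older one.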

Who is Zep for?

AI product teams who need agents to remember users across sessions and interactions.
Developers building context-aware workflows who want graph-based retrieval instead of manual prompt assembly.
Engineering leaders who need low-latency, production-ready context infrastructure for dynamic business data.
Teams handling sensitive data who need flexible enterprise deployment and compliance controls.

What are Zep's key features?

Agent Memory

Stores long-term memory from chat history and user behavior, then retrieves it with a P95 latency under 200 ms so agents can keep context across sessions.

Graph RAG

Uses a unified context graph and Graph Search API to retrieve related facts and relationships, improving answer quality with millisecond query responses.

Intelligent Context Assembly

Assembles context from memory, business data, and user behavior with dynamic relevance ranking, reducing token use by up to 98%.

Knowledge Graph MCP

Connects context graph data through a Knowledge Graph MCP workflow, with context templates created via the client for structured retrieval.

Managed Enterprise

Supports SOC 2 Type II, HIPAA BAA, audit logs, and 30+ day API logs, giving teams the controls needed for regulated deployments.

Bring Your Own Key

Lets teams run with AWS KMS and CloudTrail while keeping deployment options flexible, including managed, BYOK, BYOM, or BYOC.

Framework Integration

Works with LangChain, LangGraph, OpenAI, Anthropic, Azure OpenAI, and Google Gemini so teams can plug memory into existing agent stacks.

Custom Entity and Edge Filtering

Lets teams define custom entity types and edge types, then filter graph retrieval for cleaner results across complex business relationships.

What does Zep integrate with?

LangChain
LangGraph
OpenAI
Anthropic
AWS KMS
CloudTrail
Google
Azure
Claude
Cursor
Neo4j
Azure OpenAI
Google Gemini

What are Zep's use cases?

Agent memory for product teams

AI product teams who need agents to remember users across sessions use Zep to keep prior preferences, goals, and decisions available in later conversations. They rely on Agent Memory and Persistent Context to make follow-up answers feel continuous, reducing repeated questions and improving task completion.

Graph retrieval for developers

Developers building context-aware workflows use Zep to replace manual prompt assembly with Graph RAG and Intelligent Context Assembly. They can pull the most relevant relationships and facts into each request, which helps agents answer with better grounding and fewer missed dependencies.

Production context for engineering leaders

Engineering leaders use Zep to power low-latency context infrastructure for dynamic business data, combining Dynamic Relevance Ranking with Lightning Fast Retrieval. That lets production agents respond quickly while keeping context fresh as records, events, and relationships change.

Enterprise controls for sensitive data

Teams handling sensitive data use Zep to deploy context infrastructure with Managed Enterprise and Bring Your Own Key (BYOK). They can keep governance tighter while still supporting retrieval and memory for regulated workflows and internal assistants.

How does Zep work?

Connect your first data source or chat stream, then define what Zep should remember using Agent Memory and Context Templates. Start with the conversations, records, or events that matter most.
Map entities and relationships with Custom Entity Types and Custom Entity and Edge Filtering, so Zep can build a useful graph instead of storing raw text alone.
Let Intelligent Context Assembly and Dynamic Relevance Ranking select the right facts for each request, reducing manual prompt assembly and keeping token usage focused.
Use the Graph Search API or Framework Integration with LangChain, LangGraph, or OpenAI to wire context into your app's runtime and agent workflows.
Monitor retrieval quality, latency, and updates in the API logs and analytics, then refine templates, filters, and data sources as your product evolves.
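
The middle steps above (rank candidate facts for a request, then render them through a template) can be sketched in plain Python. This is an illustrative assumption, not Zep's ranking algorithm or client API: `tokens`, `score`, `assemble_context`, the keyword-overlap scoring, and the `TEMPLATE` string are all hypothetical stand-ins.

```python
import re
from string import Template

# Hypothetical context template; the placeholder names are assumptions.
TEMPLATE = Template("Known facts about $user:\n$facts")


def tokens(text: str) -> set[str]:
    # Lowercase word set used for the toy overlap score.
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def score(query: str, fact: str) -> int:
    # Toy relevance score: number of words shared by query and fact.
    return len(tokens(query) & tokens(fact))


def assemble_context(user: str, query: str, facts: list[str], limit: int = 2) -> str:
    # Rank facts best first, drop those with no overlap, cap the count.
    ranked = sorted(facts, key=lambda f: score(query, f), reverse=True)
    kept = [f for f in ranked if score(query, f) > 0][:limit]
    lines = "\n".join(f"- {f}" for f in kept)
    return TEMPLATE.substitute(user=user, facts=lines)


facts = [
    "Prefers email over phone calls",
    "Upgraded billing plan in March",
    "Asked about billing invoice export",
]
context = assemble_context("user-1", "billing invoice question", facts)
print(context)
```

Capping the number of facts is what keeps token usage bounded in this sketch; a production system would presumably use a stronger relevance signal than word overlap.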

How much does Zep cost?

Flex

$125/month

50,000 Credits per month
Auto-topup at 20%. 30-day rollover.
600 requests per minute
5 Projects
10 custom entity & edge types
API logs (1 day)
Unlimited memories, retrieval & users

Flex Plus

$375/month

200,000 Credits per month
Auto-topup at 20%. 60-day rollover.
1,000 requests per minute
10 Projects
20 custom entity & edge types
Custom extraction instructions
Webhooks
Analytics
Observations (coming soon)
API logs (7 days)
Unlimited memories, retrieval & users

Enterprise

Custom

Custom credits with negotiated rates
Guaranteed rate limits with SLA
Unlimited projects and entity & edge types
SOC 2 Type II & HIPAA BAA
Audit logs & 30+ day API logs
Teams and Slack support & dedicated account manager
Managed, BYOK, BYOM, or BYOC deployment
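
Dividing each listed price by its monthly credit allowance gives an effective rate per 1,000 credits. The arithmetic below uses only the Flex and Flex Plus figures above; it ignores auto-topup and rollover, whose pricing is not stated here, and `PLANS` and `cost_per_1k_credits` are illustrative names.

```python
# Listed monthly price and credit allowance for the two fixed-price plans.
PLANS = {
    "Flex": {"price_usd": 125, "credits": 50_000},
    "Flex Plus": {"price_usd": 375, "credits": 200_000},
}


def cost_per_1k_credits(plan: str) -> float:
    # Effective USD cost of 1,000 included credits on the given plan.
    p = PLANS[plan]
    return p["price_usd"] * 1000 / p["credits"]


for name in PLANS:
    print(f"{name}: ${cost_per_1k_credits(name):.3f} per 1,000 credits")
```

On these figures Flex works out to $2.50 per 1,000 credits and Flex Plus to $1.875, so the higher tier is cheaper per credit if the allowance is actually used.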

Frequently asked questions

What is Zep?

Zep is a context infrastructure platform for AI teams that assembles memory, business data, and user behavior into reusable context blocks for agents.

How much does Zep cost? Is it free?

Zep has three paid plans: Flex at $125/month, Flex Plus at $375/month, and Enterprise with custom pricing. This page lists no free plan.

What is Zep used for? Who is it for?

Zep is used for Agent Memory, Graph RAG, and Intelligent Context Assembly. It's built for AI product teams, developers building context-aware workflows, and engineering leaders.

Does Zep have an API and what does it integrate with?

The page advertises a Graph Search API and context template creation via the client, with API documentation linked. It integrates with LangChain, LangGraph, OpenAI, Anthropic, AWS KMS, and 8 more.

Editor's read

Check the request-rate and log-retention limits on the tier you plan to buy. Flex includes 600 requests per minute and 1-day API logs, while Flex Plus raises that to 1,000 requests per minute and 7-day logs; Enterprise is the only tier with SLA-backed limits and 30+ day API logs.
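
That check can be written down as a small lookup. The limits below are copied from the pricing tiers on this page; `TIERS` and `cheapest_tier` are hypothetical names, and Enterprise is modeled as having no fixed request cap and 30 days of logs as a lower bound, which simplifies "guaranteed rate limits" and "30+ day".

```python
from typing import Optional

# (name, requests per minute or None if negotiated, API log retention in days)
TIERS = [
    ("Flex", 600, 1),
    ("Flex Plus", 1000, 7),
    ("Enterprise", None, 30),
]


def cheapest_tier(needed_rpm: int, needed_log_days: int) -> Optional[str]:
    # TIERS is ordered cheapest first, so the first match is the cheapest fit.
    for name, rpm, log_days in TIERS:
        rpm_ok = rpm is None or rpm >= needed_rpm
        if rpm_ok and log_days >= needed_log_days:
            return name
    return None  # Nothing in this simplified table fits; ask the vendor.


print(cheapest_tier(800, 1))
print(cheapest_tier(500, 14))
```

A workload needing 800 requests per minute lands on Flex Plus, and one needing two weeks of API logs lands on Enterprise even at a low request rate.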

Filed under: Agent Tools & Integrations (tags: byok, hipaa, open-source, soc2)
