Langfuse

What is Langfuse?

Langfuse is an AI observability platform for AI product teams that combines tracing, prompt management, evaluations, and analytics dashboards in one workflow. It includes LLM Observability, Prompt Management, Evaluation, Playground, and Human Annotation, and integrates with OpenAI, LangChain, OpenTelemetry, and GitHub. Customers include Canva, Twilio, Adobe, and Khan Academy. There are four plans: Hobby (free), Core ($29/month), Pro ($199/month), and Enterprise ($2499/month).

Last verified: May 17, 2026


At a glance

Best for: AI teams that need tracing, prompt control, and evaluation in one workflow.
Pricing: Hobby Free; Core $29/mo; Pro $199/mo; Enterprise $2499/mo
API: Yes. Langfuse is API-first, with a public API and SDK access for tracing, prompts, and evaluation scores.

What does Langfuse do?

Langfuse covers the LLM engineering loop by combining tracing, prompt management, evaluations, and analytics dashboards in one workflow. Teams can ingest traces through the SDKs or supported frameworks, inspect where prompts and model calls behave unexpectedly, and use the Playground and Experiments to compare changes before rollout. The result is a tighter debug-and-improve cycle for AI applications and agents. Langfuse is used by 2,300+ companies, 100,000+ engineers, and 19 of the Fortune 50, and it processes 10+ billion observations per month. It is API-first, with a public API plus SDK access for tracing, prompts, and evaluation scores, and it supports self-hosting through Docker Compose, Kubernetes, and cloud Terraform guides. Customers named on the site include Canva, Twilio, Adobe, Khan Academy, Intuit, and Cisco.

Why use Langfuse?

Open-source and self-hostable, so teams can keep observability data on their own infrastructure.
API-first access lets developers automate tracing, prompts, and evaluation scores instead of working only in a UI.
The platform combines tracing, prompt management, evaluations, and analytics, reducing tool sprawl in the LLM workflow.
Scale signals are strong: 2,300+ companies, 100,000+ engineers, and 10+ billion observations per month.
The Enterprise tier adds audit logs, a SCIM API, an uptime SLA, and dedicated support for larger rollouts.

Who is Langfuse for?

AI product teams who need to debug and improve application behavior across the full development loop.
Platform engineers who want API-first tracing and evaluation data they can wire into existing systems.
ML engineers who compare prompt and model changes before shipping them to users.
Security-conscious teams who need self-hosting and compliance-oriented deployment options.

What are Langfuse's key features?

LLM Observability

Trace prompts, generations, and evaluation scores through the public API and SDKs, helping teams debug production LLM behavior at scale.

Prompt Management

Manage prompts centrally with the API-first platform and SDK access, so teams can version changes and ship updates without code churn.
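To make "version changes and ship updates without code churn" concrete, here is a minimal in-memory sketch of a versioned prompt store. It is not Langfuse's SDK: the `PromptRegistry` class, its method names, and the `production` label are hypothetical and exist only to illustrate the idea that application code asks for a labeled prompt while the label is moved between versions elsewhere.

```python
class PromptRegistry:
    """Hypothetical versioned prompt store (illustration only)."""

    def __init__(self):
        self._versions = {}  # prompt name -> list of texts; version = index + 1
        self._labels = {}    # (prompt name, label) -> version number

    def create(self, name, text):
        """Store a new version of a prompt and return its version number."""
        versions = self._versions.setdefault(name, [])
        versions.append(text)
        return len(versions)

    def promote(self, name, version, label="production"):
        """Point a label at an existing version."""
        if not 1 <= version <= len(self._versions.get(name, [])):
            raise ValueError(f"unknown version {version} for prompt {name!r}")
        self._labels[(name, label)] = version

    def get(self, name, label="production"):
        """Return the prompt text the label currently points at."""
        version = self._labels[(name, label)]
        return self._versions[name][version - 1]


registry = PromptRegistry()
v1 = registry.create("support-reply", "Answer politely: {question}")
v2 = registry.create("support-reply", "Answer politely and cite sources: {question}")

registry.promote("support-reply", v1)
before = registry.get("support-reply")

# Shipping the new prompt is a label move; the calling code does not change.
registry.promote("support-reply", v2)
after = registry.get("support-reply")
```

Rolling back is the same operation in reverse: promote the earlier version number again.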

Evaluation

Run evaluations on traced outputs and prompt changes using evaluation scores from the public API, making quality checks repeatable across releases.

Metrics

Track cost and latency alongside usage signals, giving teams a clear view of model spend and response time across 10+ billion observations/month.

Playground

Test prompts and model outputs in a controlled workspace before release, using the same tracing and SDK-backed data Langfuse captures in production.

Human Annotation

Review and label traces with annotation workflows, then connect feedback to evaluation scores and observability data for better model tuning.

Integrations

Connect Langfuse with OpenAI, LangChain, OpenTelemetry, and GitHub, plus SDKs in Python, TypeScript, Go, Java, and .NET.

Security & Compliance

Support regulated deployments with SOC2, ISO27001, and BAA availability for HIPAA, plus self-hosting for teams that need more control.

What does Langfuse integrate with?

OpenAI
LangChain
Python
TypeScript
Go
Java
.NET
Ruby
PHP
Swift
Vercel AI SDK
LiteLLM
Pydantic AI
Google ADK
CrewAI
LiveKit
Anthropic
Amazon Bedrock
Azure OpenAI
Mistral AI
Google Gemini
xAI
vLLM
Groq
Claude Code
OpenClaw
Claude Agent SDK
OpenWebUI
Ollama
OpenAI Agents SDK

What are Langfuse's use cases?

AI product debugging loop

AI product teams use Langfuse to trace failures across prompts, models, and user inputs, using LLM Observability to spot where an assistant drifts or breaks. They pair it with Evaluation to compare changes before shipping, so they can fix bad answers before customers see them.

Prompt testing for ML engineers

ML engineers use Langfuse to test prompt and model variants in a controlled workflow, using Playground to try ideas and Experiments to compare outcomes. With Metrics, they can choose the version that improves answer quality without increasing latency or cost.
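The decision rule described here (better answer quality without higher latency or cost) can be sketched in a few lines. This is an illustration under assumptions, not how Langfuse Experiments computes results: the `Run` record, the 0 to 1 quality score, and the sample numbers are all made up for the example.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Run:
    variant: str       # which prompt or model variant produced this run
    score: float       # quality score in [0, 1] (assumed scale)
    latency_ms: float
    cost_usd: float


def summarize(runs, variant):
    """Average score, latency, and cost for one variant."""
    rows = [r for r in runs if r.variant == variant]
    return {
        "score": mean(r.score for r in rows),
        "latency_ms": mean(r.latency_ms for r in rows),
        "cost_usd": mean(r.cost_usd for r in rows),
    }


def is_safe_improvement(baseline, candidate):
    """True if quality improves while latency and cost do not increase."""
    return (
        candidate["score"] > baseline["score"]
        and candidate["latency_ms"] <= baseline["latency_ms"]
        and candidate["cost_usd"] <= baseline["cost_usd"]
    )


# Made-up sample data for two variants.
runs = [
    Run("A", 0.5, 800, 0.002),
    Run("A", 0.75, 1000, 0.004),
    Run("B", 0.75, 700, 0.002),
    Run("B", 1.0, 900, 0.003),
]

baseline = summarize(runs, "A")
candidate = summarize(runs, "B")
ship_b = is_safe_improvement(baseline, candidate)
```

In practice a team would also want enough runs per variant for the averages to be meaningful; the sketch skips any significance testing.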

API-first tracing for platform teams

Platform engineers use Langfuse to wire tracing and evaluation data into existing systems, relying on Integrations and the public API to keep observability inside their current stack. That gives them a shared view of production behavior without rebuilding internal tooling.
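As a rough sketch of what wiring the public API into an existing system could look like, the snippet below builds (but does not send) an HTTP request using only the Python standard library. Several details are assumptions to verify against the official API reference: the `/api/public/traces` path, HTTP Basic authentication with the public key as username and the secret key as password, and the `limit` query parameter. The host and keys are placeholders.

```python
import base64
import urllib.parse
import urllib.request


def build_traces_request(base_url, public_key, secret_key, limit=10):
    """Build a GET request for recent traces without sending it."""
    # Assumed endpoint path and query parameter; check the API reference.
    query = urllib.parse.urlencode({"limit": limit})
    url = f"{base_url.rstrip('/')}/api/public/traces?{query}"

    # Assumed auth scheme: HTTP Basic with public key : secret key.
    token = base64.b64encode(f"{public_key}:{secret_key}".encode()).decode()
    headers = {"Authorization": f"Basic {token}"}
    return urllib.request.Request(url, headers=headers, method="GET")


# Placeholder host and keys, for illustration only.
request = build_traces_request("https://langfuse.example.com/", "pk-demo", "sk-demo")
print(request.full_url)
```

Sending it would be a call to `urllib.request.urlopen(request)`; keeping construction separate makes the request easy to inspect and test before any network traffic happens.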

Compliance-ready deployment

Security-conscious teams use Langfuse to keep sensitive AI workflows under their own control, combining Security & Compliance with self-hosting options. They can centralize prompts, traces, and annotations while meeting internal deployment requirements and audit expectations.

How does Langfuse work?

Connect your first app or model through the public API or an SDK, then start sending traces into LLM Observability so Langfuse can capture prompts, outputs, and latency from real requests.
Organize and version prompts in Prompt Management, then use Playground to test edits before they reach users. Keep the best-performing variants ready for deployment.
Run Evaluation and Experiments on live or sampled traces to compare model changes, score outputs, and identify regressions. Use Metrics to track quality, cost, and latency over time.
Add Human Annotation for edge cases and review queues, then feed those labels back into your evaluation workflow. This helps teams turn subjective feedback into repeatable decisions.
Connect Integrations to your stack, share results with teammates, and keep iterating from the same workspace. Security & Compliance and self-hosting support ongoing governance as usage grows.
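The steps above revolve around one data shape: a trace made of timed observations, with scores attached later by evaluations or human annotators. The sketch below models that shape in plain Python. It is a simplified illustration, not Langfuse's actual schema: the `Trace` and `Observation` classes, the field names, and the `helpfulness` score are hypothetical, and summing latencies assumes the observations ran one after another.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Observation:
    name: str            # e.g. a retrieval step or a model call
    latency_ms: float
    cost_usd: float = 0.0


@dataclass
class Trace:
    trace_id: str
    observations: List[Observation] = field(default_factory=list)
    scores: Dict[str, float] = field(default_factory=dict)

    def total_latency_ms(self):
        # Assumes sequential observations; parallel steps would overlap.
        return sum(o.latency_ms for o in self.observations)

    def total_cost_usd(self):
        return sum(o.cost_usd for o in self.observations)


def flag_for_review(traces, score_name, threshold):
    """Return ids of traces whose named score is below the threshold."""
    return [
        t.trace_id
        for t in traces
        if score_name in t.scores and t.scores[score_name] < threshold
    ]


good = Trace("t-1", [Observation("retrieval", 120), Observation("generation", 880, 0.003)])
bad = Trace("t-2", [Observation("generation", 1500, 0.004)])

# Scores could come from an automated evaluator or a human annotator.
good.scores["helpfulness"] = 1.0
bad.scores["helpfulness"] = 0.0

needs_review = flag_for_review([good, bad], "helpfulness", threshold=0.5)
print(good.total_latency_ms(), needs_review)  # prints: 1000 ['t-2']
```

Keeping scores on the trace itself is what lets the same record serve debugging, evaluation, and annotation in this sketch.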

How much does Langfuse cost?

Hobby

Free

All platform features (with limits)
50k units / month included
30 days data access
2 users
Community support via GitHub

Core

$29/month

Everything in Hobby
100k units / month included; additional units cost $8 per 100k, with lower rates at volume (see the pricing calculator)
90 days data access
Unlimited users
In-app support

Pro

$199/month

Everything in Core
100k units / month included; additional units cost $8 per 100k, with lower rates at volume (see the pricing calculator)
3 years data access
Data retention management
Unlimited annotation queues
High rate limits
SOC2 & ISO27001 reports, BAA available (HIPAA)
Prioritized in-app support

Enterprise

$2499/month

Everything in Pro + Teams
100k units / month included; additional units cost $8 per 100k, with lower rates at volume (see the pricing calculator)
Audit Logs
SCIM API
Custom rate limits
Uptime SLA
Dedicated support engineer

Frequently asked questions

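What is Langfuse?

Langfuse is an AI observability platform for AI product teams that combines tracing, prompt management, evaluations, and analytics dashboards in one workflow.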

How much does Langfuse cost? Is it free?

Yes, Langfuse has a free Hobby plan. Paid tiers are Core at $29/month, Pro at $199/month, and Enterprise at $2499/month.

What is Langfuse used for? Who is it for?

Langfuse is used for LLM Observability, Prompt Management, and Evaluation. It's built for AI product teams, platform engineers, and ML engineers.

Does Langfuse have an API and what does it integrate with?

Yes. Langfuse is API-first and offers a public API plus SDK access for tracing, prompts, and evaluation scores. It integrates with OpenAI, LangChain, OpenTelemetry, and GitHub, among others.

Editor's read

Check the usage-based unit allowance before rollout: Hobby includes 50k units/month, Core and Pro include 100k units/month with $8 per additional 100k units, and Enterprise keeps the same base allowance. If your trace volume or retention needs exceed those limits, the monthly bill and data access window change quickly.
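A quick way to sanity-check a bill is to turn the listed numbers into a small estimator. The sketch below uses only the base prices, included units, and the $8 per additional 100k units stated above. It makes three assumptions that the listing does not confirm: overage is billed in whole 100k blocks, the volume discount is ignored (so results are an upper bound), and Hobby has no overage option because no overage price is listed for it.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Plan:
    name: str
    base_usd: int
    included_units: int
    overage_usd_per_100k: Optional[int]  # None: no overage price listed


PLANS = {
    "hobby": Plan("Hobby", 0, 50_000, None),
    "core": Plan("Core", 29, 100_000, 8),
    "pro": Plan("Pro", 199, 100_000, 8),
    "enterprise": Plan("Enterprise", 2499, 100_000, 8),
}


def estimate_monthly_cost(plan_key, units):
    """Estimate a monthly bill in USD for a given unit volume."""
    plan = PLANS[plan_key]
    extra_units = max(0, units - plan.included_units)
    if extra_units == 0:
        return plan.base_usd
    if plan.overage_usd_per_100k is None:
        raise ValueError(f"{plan.name} has no listed overage price")
    # Assumption: overage is rounded up to whole 100k blocks.
    blocks = (extra_units + 99_999) // 100_000
    return plan.base_usd + blocks * plan.overage_usd_per_100k


print(estimate_monthly_cost("core", 1_000_000))  # prints: 101
print(estimate_monthly_cost("pro", 1_000_000))   # prints: 271
```

At one million units per month, the overage ($72) is the same on Core and Pro under these assumptions, so the choice between them comes down to the base fee, the data access window, and the compliance features rather than unit pricing.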

Filed under: Agent Tools & Integrations (freemium, gdpr, hipaa, iso-27001, open-source)
