Langfuse
What is Langfuse?
Langfuse is an AI observability platform for AI product teams that combines tracing, prompt management, evaluations, and analytics dashboards in one workflow. It includes LLM Observability, Prompt Management, Evaluation, Playground, and Human Annotation, and integrates with OpenAI, LangChain, OpenTelemetry, and GitHub. Customers include Canva, Twilio, Adobe, and Khan Academy. Plans run Hobby free, Core $29/month, Pro $199/month, and Enterprise $2499/month.
At a glance
- Langfuse is best for AI teams who need tracing, prompt control, and evaluation in one workflow.
- Pricing: Hobby free; Core $29/mo; Pro $199/mo; Enterprise $2499/mo.
- API: yes. Langfuse is API-first and offers a public API plus SDK access for tracing, prompts, and evaluation scores.
What does Langfuse do?
Langfuse handles the LLM engineering loop by combining tracing, prompt management, evaluations, and analytics dashboards in one workflow. Teams can ingest traces through the SDKs or supported frameworks, inspect where prompts and model calls behave unexpectedly, and use the Playground and Experiments to compare changes before rollout. The result is a tighter debug-and-improve cycle for AI applications and agents. Langfuse is used by 2,300+ companies, 100,000+ engineers, and 19 of the Fortune 50, and processes 10+ billion observations per month. It is API-first, with a public API plus SDK access for tracing, prompts, and evaluation scores, and it also supports self-hosting through Docker Compose, Kubernetes, and cloud Terraform guides. Customers named on the site include Canva, Twilio, Adobe, Khan Academy, Intuit, and Cisco.
Why use Langfuse?
- Open-source and self-hostable, so teams can keep observability data on their own infrastructure.
- API-first access lets developers automate tracing, prompts, and evaluation scores instead of working only in a UI.
- The platform combines tracing, prompt management, evaluations, and analytics, reducing tool sprawl in the LLM workflow.
- Scale signals are strong: 2,300+ companies, 100,000+ engineers, and 10+ billion observations per month.
- Enterprise tiers add audit logs, SCIM API, uptime SLA, and dedicated support for larger rollouts.
Who is Langfuse for?
- AI product teams who need to debug and improve application behavior across the full development loop.
- Platform engineers who want API-first tracing and evaluation data they can wire into existing systems.
- ML engineers who compare prompt and model changes before shipping them to users.
- Security-conscious teams who need self-hosting and compliance-oriented deployment options.
What are Langfuse's key features?
LLM Observability
Trace prompts, generations, and evaluation scores through the public API and SDKs, helping teams debug production LLM behavior at scale.
Prompt Management
Manage prompts centrally with the API-first platform and SDK access, so teams can version changes and ship updates without code churn.
Evaluation
Run evaluations on traced outputs and prompt changes using evaluation scores from the public API, making quality checks repeatable across releases.
Metrics
Track cost and latency alongside usage signals, giving teams a clear view of model spend and response time. Across all customers, the platform processes 10+ billion observations per month.
Playground
Test prompts and model outputs in a controlled workspace before release, using the same tracing and SDK-backed data Langfuse captures in production.
Human Annotation
Review and label traces with annotation workflows, then connect feedback to evaluation scores and observability data for better model tuning.
Integrations
Connect Langfuse with OpenAI, LangChain, OpenTelemetry, and GitHub, plus SDKs in Python, TypeScript, Go, Java, and .NET.
Security & Compliance
Support regulated deployments with SOC2, ISO27001, and BAA availability for HIPAA, plus self-hosting for teams that need more control.
What does Langfuse integrate with?
- OpenAI
- LangChain
- Python
- TypeScript
- Go
- Java
- .NET
- Ruby
- PHP
- Swift
- Vercel AI SDK
- LiteLLM
- Pydantic AI
- Google ADK
- CrewAI
- LiveKit
- Anthropic
- Amazon Bedrock
- Azure OpenAI
- Mistral AI
- Google Gemini
- xAI
- vLLM
- Groq
- Claude Code
- OpenClaw
- Claude Agent SDK
- OpenWebUI
- Ollama
- OpenAI Agents SDK
What are Langfuse's use cases?
AI product debugging loop
AI product teams use Langfuse to trace failures across prompts, models, and user inputs, using LLM Observability to spot where an assistant drifts or breaks. They pair it with Evaluation to compare changes before shipping, so they can fix bad answers before customers see them.
Prompt testing for ML engineers
ML engineers use Langfuse to test prompt and model variants in a controlled workflow, using Playground to try ideas and Experiments to compare outcomes. With Metrics, they can choose the version that improves answer quality without increasing latency or cost.
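The selection rule in this use case (better quality, with no increase in latency or cost) can be sketched in a few lines. This is an illustrative sketch only: `VariantResult`, `pick_variant`, and the numbers below are hypothetical and are not part of Langfuse's API or data. In practice the inputs would come from your own experiment results.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class VariantResult:
    """Aggregated experiment results for one prompt/model variant (hypothetical shape)."""
    name: str
    quality: float     # mean evaluation score, higher is better
    latency_ms: float  # mean response time
    cost_usd: float    # mean cost per request


def pick_variant(baseline: VariantResult,
                 candidates: List[VariantResult]) -> Optional[VariantResult]:
    """Return the highest-quality candidate that beats the baseline on quality
    without exceeding its latency or cost; None if no candidate qualifies."""
    eligible = [
        c for c in candidates
        if c.quality > baseline.quality
        and c.latency_ms <= baseline.latency_ms
        and c.cost_usd <= baseline.cost_usd
    ]
    return max(eligible, key=lambda c: c.quality, default=None)


# Made-up numbers, for illustration only.
baseline = VariantResult("current", quality=0.70, latency_ms=900, cost_usd=0.0040)
candidates = [
    VariantResult("shorter-prompt", quality=0.74, latency_ms=850, cost_usd=0.0035),
    VariantResult("bigger-model", quality=0.82, latency_ms=1400, cost_usd=0.0120),
    VariantResult("few-shot", quality=0.72, latency_ms=900, cost_usd=0.0040),
]
winner = pick_variant(baseline, candidates)
```

Here "bigger-model" scores highest on quality but is rejected because it is slower and more expensive than the baseline. A real team might relax the rule to allow a small latency or cost budget.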
API-first tracing for platform teams
Platform engineers use Langfuse to wire tracing and evaluation data into existing systems, relying on Integrations and the public API to keep observability inside their current stack. That gives them a shared view of production behavior without rebuilding internal tooling.
Compliance-ready deployment
Security-conscious teams use Langfuse to keep sensitive AI workflows under their own control, combining Security & Compliance with self-hosting options. They can centralize prompts, traces, and annotations while meeting internal deployment requirements and audit expectations.
How does Langfuse work?
- Connect your first app or model through the public API or an SDK, then start sending traces into LLM Observability so Langfuse can capture prompts, outputs, and latency from real requests.
- Organize and version prompts in Prompt Management, then use Playground to test edits before they reach users. Keep the best-performing variants ready for deployment.
- Run Evaluation and Experiments on live or sampled traces to compare model changes, score outputs, and identify regressions. Use Metrics to track quality, cost, and latency over time.
- Add Human Annotation for edge cases and review queues, then feed those labels back into your evaluation workflow. This helps teams turn subjective feedback into repeatable decisions.
- Connect Integrations to your stack, share results with teammates, and keep iterating from the same workspace. Security & Compliance and self-hosting support ongoing governance as usage grows.
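To make the first steps concrete, here is a minimal, offline sketch of what a trace conceptually holds: the prompts, outputs, and latency of each model call, plus evaluation scores attached afterwards. It deliberately does not use the Langfuse SDK. The `Trace` and `Observation` classes, their fields, and `fake_model` are hypothetical stand-ins, so consult the official SDK documentation for the real calls and schema.

```python
import time
from dataclasses import dataclass, field


@dataclass
class Observation:
    """One model call inside a trace (hypothetical shape, not Langfuse's schema)."""
    name: str
    prompt: str
    output: str
    latency_ms: float


@dataclass
class Trace:
    """A single request: its observations plus any evaluation scores."""
    name: str
    observations: list = field(default_factory=list)
    scores: dict = field(default_factory=dict)

    def record(self, name, prompt, model_fn):
        """Call model_fn(prompt), time it, and store the result as an observation."""
        start = time.perf_counter()
        output = model_fn(prompt)
        latency_ms = (time.perf_counter() - start) * 1000
        self.observations.append(Observation(name, prompt, output, latency_ms))
        return output

    def score(self, name, value):
        """Attach a named evaluation score (automated or human) to the trace."""
        self.scores[name] = value


def fake_model(prompt):
    # Stand-in for a real LLM call so the sketch runs without network access.
    return prompt.upper()


trace = Trace(name="support-chat")
answer = trace.record("draft-reply", "hello", fake_model)
trace.score("helpfulness", 0.8)  # e.g. a label from a human annotation queue
```

An observability platform adds what this sketch leaves out: sending the records to a backend, grouping them by session or user, and aggregating scores, cost, and latency over time.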
How much does Langfuse cost?
Hobby
Free
- All platform features (with limits)
- 50k units / month included
- 30 days data access
- 2 users
- Community support via GitHub
Core
$29/month
- Everything in Hobby
- 100k units / month included, additional: $8/100k units. Lower with volume (see the pricing calculator)
- 90 days data access
- Unlimited users
- In-app support
Pro
$199/month
- Everything in Core
- 100k units / month included, additional: $8/100k units. Lower with volume (see the pricing calculator)
- 3 years data access
- Data retention management
- Unlimited annotation queues
- High rate limits
- SOC2 & ISO27001 reports, BAA available (HIPAA)
- Prioritized in-app support
Enterprise
$2499/month
- Everything in Pro + Teams
- 100k units / month included, additional: $8/100k units. Lower with volume (see the pricing calculator)
- Audit Logs
- SCIM API
- Custom rate limits
- Uptime SLA
- Dedicated support engineer
Frequently asked questions
How much does Langfuse cost? Is it free?
Yes. Langfuse has a free Hobby plan; paid tiers are Core at $29/month, Pro at $199/month, and Enterprise at $2499/month.
What is Langfuse used for? Who is it for?
Langfuse is used for LLM Observability, Prompt Management, and Evaluation. It's built for AI product teams, platform engineers, and ML engineers.
Does Langfuse have an API and what does it integrate with?
Yes. Langfuse is API-first and offers a public API plus SDK access for tracing, prompts, and evaluation scores. It integrates with OpenAI, LangChain, OpenTelemetry, and GitHub, among others.
Editor's read
Check the usage-based unit allowance before rollout: Hobby includes 50k units/month, Core and Pro include 100k units/month with $8 per additional 100k units, and Enterprise keeps the same base allowance. If your trace volume or retention needs exceed those limits, the monthly bill and data access window change quickly.
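The arithmetic behind that advice can be sketched as a rough list-price estimate. The plan figures come from the pricing section above; the rest is assumption. In particular, this sketch assumes overage is billed in whole 100k-unit blocks (rounded up), ignores the volume discounts the pricing page mentions, and treats Hobby as having no paid overage because none is listed. The function name `estimate_monthly_cost` is hypothetical.

```python
import math
from typing import Optional

# Plan figures copied from the pricing section; all amounts are USD per month.
PLANS = {
    "Hobby": {"base": 0, "included_units": 50_000},
    "Core": {"base": 29, "included_units": 100_000},
    "Pro": {"base": 199, "included_units": 100_000},
    "Enterprise": {"base": 2499, "included_units": 100_000},
}
OVERAGE_PRICE = 8        # USD per additional block on paid plans
OVERAGE_BLOCK = 100_000  # units per block


def estimate_monthly_cost(plan: str, units: int) -> Optional[int]:
    """Estimate the list-price bill for a plan at a given monthly unit volume.

    Assumption: overage is charged per started 100k block, with no volume
    discount. Returns None when Hobby's allowance is exceeded, since no
    overage price is listed for that plan.
    """
    details = PLANS[plan]
    extra_units = max(0, units - details["included_units"])
    if extra_units == 0:
        return details["base"]
    if plan == "Hobby":
        return None
    blocks = math.ceil(extra_units / OVERAGE_BLOCK)
    return details["base"] + blocks * OVERAGE_PRICE


core_bill = estimate_monthly_cost("Core", 450_000)  # 29 + 4 blocks * 8 = 61
```

Under these assumptions, overage stays small next to the base price at moderate volumes, so the choice between Core and Pro is driven more by retention and compliance needs than by unit count. Check the pricing calculator for actual figures.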
