
Langfuse
Open-source LLM engineering platform for observability and evaluation
Langfuse is the open-source LLM engineering platform that helps teams collaboratively develop, monitor, and evaluate AI applications. Track every LLM call, manage prompts with versioning, run evaluations, and debug issues in production. Self-host for complete data control or use the managed cloud service.

Why Langfuse?
Building LLM applications is easy. Running them in production is hard. Without observability, you're flying blind—you can't see which prompts are failing, how much you're spending, or why users are unhappy. Debugging production issues means adding print statements and redeploying. Teams need proper tooling to iterate on prompts, track costs, and ensure quality.
How It Works
Langfuse instruments your LLM application to capture every trace—model calls, retrieval operations, agent actions. View traces in a clean UI, analyze latency and costs, and identify problematic patterns. Manage prompts with versioning, run A/B tests, and evaluate outputs with LLM-as-judge or human labeling. All data stays on your infrastructure when self-hosted.
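The trace model described above can be pictured as a tree: a trace holds nested observations, where spans cover operations like retrieval and generations cover model calls. A minimal stdlib sketch of that structure (the class and field names here are simplified illustrations, not the actual Langfuse SDK types):

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Observation:
    """One step inside a trace: a span (operation) or generation (model call)."""
    name: str
    kind: str                       # "span" or "generation"
    input: Optional[str] = None
    output: Optional[str] = None
    children: list["Observation"] = field(default_factory=list)

@dataclass
class Trace:
    """One end-to-end request, holding a tree of observations."""
    name: str
    observations: list[Observation] = field(default_factory=list)

    def flatten(self) -> list[str]:
        """Return every observation name, depth-first."""
        names: list[str] = []
        def walk(items: list[Observation]) -> None:
            for obs in items:
                names.append(obs.name)
                walk(obs.children)
        walk(self.observations)
        return names

# A RAG request traced as a retrieval span with a nested generation:
trace = Trace(name="answer-question")
retrieval = Observation(name="vector-search", kind="span")
retrieval.children.append(
    Observation(name="llm-call", kind="generation",
                input="What is Langfuse?",
                output="An LLM observability platform.")
)
trace.observations.append(retrieval)
print(trace.flatten())  # ['vector-search', 'llm-call']
```

Nesting is what makes production debugging tractable: when a request fails, the tree shows which step (retrieval, tool call, generation) produced the bad output.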
What Is Langfuse?
Langfuse is an open-source observability and evaluation platform for LLM applications. It provides tracing, prompt management, evaluations, datasets, and a playground for testing. It integrates with LangChain, LlamaIndex, the OpenAI and Anthropic SDKs, and other major frameworks.
Key Benefits
Why teams choose Langfuse
Full Observability
Trace every LLM call, embedding, and retrieval operation. See exactly what happened.
Prompt Management
Version and collaborate on prompts. Roll back changes, run A/B tests.
Cost Tracking
Monitor token usage and costs across models. Set budgets and alerts.
Evaluations
LLM-as-judge, user feedback, and manual labeling. Continuous quality monitoring.
Drop-In Integration
One-line SDK integration for Python and TypeScript. Works with all major frameworks.
Self-Hosted
Run on your infrastructure. Complete data ownership and compliance.
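Cost tracking like the above comes down to multiplying each call's token usage by per-model prices. A minimal sketch, where the model names and per-token prices are illustrative placeholders rather than real pricing:

```python
# Hypothetical USD prices per 1,000 tokens -- placeholders, not real rates.
PRICES_PER_1K = {
    "model-a": {"input": 0.0005, "output": 0.0015},
    "model-b": {"input": 0.0030, "output": 0.0060},
}

def call_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost of one LLM call given its token usage."""
    p = PRICES_PER_1K[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1000

# Aggregate spend across a batch of traced calls:
usage = [("model-a", 1200, 300), ("model-b", 500, 250)]
total = sum(call_cost(m, i, o) for m, i, o in usage)
print(f"total spend: ${total:.5f}")
```

Because the platform records usage per trace, the same arithmetic can be rolled up per user, per feature, or per prompt version, which is what makes budgets and alerts possible.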
Features
Everything you need to build with Langfuse
Tracing
Detailed traces for LLM calls, spans, and nested operations.
Prompt Playground
Test prompts interactively before deploying to production.
Datasets
Create test sets and benchmarks for regression testing.
User Feedback
Collect thumbs up/down and detailed feedback from end users.
API & Webhooks
OpenAPI with typed SDKs. Integrate with your existing tools.
Team Collaboration
Multi-user with project-based access control.
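Dataset-based regression testing pairs a fixed set of inputs with expected outputs and scores the application against them on every change. A minimal sketch with a stubbed exact-match judge (the function names and dataset are illustrative, not Langfuse APIs; a real LLM-as-judge would prompt a model to grade semantic correctness):

```python
# Hypothetical test set: inputs paired with expected answers.
dataset = [
    {"input": "2+2", "expected": "4"},
    {"input": "capital of France", "expected": "Paris"},
]

def app(prompt: str) -> str:
    # Stand-in for the application under test.
    answers = {"2+2": "4", "capital of France": "Paris"}
    return answers[prompt]

def judge(output: str, expected: str) -> float:
    # Stub judge: exact match scores 1.0. A real LLM-as-judge would
    # ask a model to rate the output against the expected answer.
    return 1.0 if output.strip() == expected else 0.0

scores = [judge(app(item["input"]), item["expected"]) for item in dataset]
print(f"mean score: {sum(scores) / len(scores)}")  # mean score: 1.0
```

Running this scoring loop on every prompt or model change turns "did we break anything?" into a number that can gate a deploy.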
Ready to deploy Langfuse?
Get started in minutes. Deploy on your own infrastructure at actual cloud cost. No markup, no vendor lock-in.