Arize AI
Also known as: Arize AX, Phoenix
AI observability and evaluation platform that traces, evaluates, and monitors LLM applications and agents, with an open source core in Phoenix.
Arize AI is an observability and evaluation platform for AI applications and agents, used to trace what a system did, measure its quality, and monitor and improve it in production. The company was founded in January 2020 by Jason Lopatecki and Aparna Dhinakaran in Berkeley, California, and predates most LLM era tools, so it covers both traditional machine learning and generative AI from one platform. It comes in two forms: Phoenix, an open source project, and Arize AX, the managed enterprise platform built on the same standards.
Phoenix is the open source core, with tracing, evaluation, experimentation, and prompt iteration that teams can run locally, in a notebook, or self hosted with no feature gates. It is built on OpenInference and OpenTelemetry, the open standards Arize helped define, which means instrumentation is not locked to a proprietary trace format and data can also flow to backends like Jaeger, Prometheus, or Grafana. Phoenix integrates with OpenAI, Anthropic, LangChain, LangGraph, CrewAI, and LlamaIndex.
Arize AX adds managed infrastructure on top, backed by a purpose built datastore called adb that holds agent trajectories at petabyte scale with sub second queries and syncs to BigQuery, Databricks, or Snowflake. AX provides agent level tracing across prompts, tools, memory, and routing; online and offline evaluation using LLM as a judge for accuracy, tool calling, planning, and goal achievement; a prompt IDE; datasets and experiments; and real time monitoring with alerts. It also includes Alyx, an AI engineering agent that runs evals and debugs issues from the terminal or inside tools like Cursor and Claude Code.
Arize reports heavy production use: more than a trillion spans and fifty million evaluations a month across customers like DoorDash, Uber, Reddit, Instacart, PepsiCo, and Siemens, and it was selected by the U.S. Navy for a mission critical program. Pricing is public and self serve: AX Free and Phoenix are free, AX Pro is $50 a month with higher limits and longer retention, and AX Enterprise is custom and adds SOC2 Type II, HIPAA, dedicated support, an uptime SLA, and self hosting with data residency.
Vendor details
Canonical URL
https://arize.com
Category
Agent infrastructure
Subcategory
Observability and evaluation
Funding status
Founded January 2020 by Jason Lopatecki (CEO) and Aparna Dhinakaran (Chief Product Officer) in Berkeley, California. Has raised about $131M across rounds through Series C, and acquired Velvet in 2025. Remains independent.
Company status
Independent
Use cases & customers
Primary use cases
Target customers
Deployment options
Integrations
OpenTelemetry and OpenInference native instrumentation with framework integrations for OpenAI, Anthropic, LangChain, LangGraph, CrewAI, and LlamaIndex. Python and JavaScript SDKs, trace export to backends like Jaeger, Prometheus, and Grafana, adb Data Fabric sync to BigQuery, Databricks, and Snowflake, and a CLI usable from Cursor and Claude Code.
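To make the tracing model concrete, here is a minimal sketch of what a trace is: a tree of timed spans, each with a kind, a parent, and a status. It uses only the Python standard library and does not call the Arize, Phoenix, OpenInference, or OpenTelemetry SDKs. The `Span` and `Tracer` classes, the span kind labels, and the `run_agent` function are hypothetical names chosen for illustration.

```python
import time
import uuid
from contextlib import contextmanager
from dataclasses import dataclass, field
from typing import Iterator, List, Optional


@dataclass
class Span:
    """One timed step in an agent run (hypothetical, not an SDK type)."""
    name: str
    kind: str                      # illustrative labels such as "AGENT", "LLM", "TOOL"
    trace_id: str
    parent_id: Optional[str] = None
    span_id: str = field(default_factory=lambda: uuid.uuid4().hex[:16])
    status: str = "OK"
    error: Optional[str] = None
    duration_ms: float = 0.0


class Tracer:
    """Collects finished spans in memory and tracks the currently open span."""

    def __init__(self) -> None:
        self.trace_id = uuid.uuid4().hex
        self.spans: List[Span] = []
        self._stack: List[Span] = []

    @contextmanager
    def span(self, name: str, kind: str) -> Iterator[Span]:
        parent = self._stack[-1].span_id if self._stack else None
        current = Span(name=name, kind=kind, trace_id=self.trace_id, parent_id=parent)
        self._stack.append(current)
        start = time.perf_counter()
        try:
            yield current
        except Exception as exc:
            # Mark the span as failed, then let the error reach the parent span.
            current.status = "ERROR"
            current.error = repr(exc)
            raise
        finally:
            current.duration_ms = (time.perf_counter() - start) * 1000
            self._stack.pop()
            self.spans.append(current)

    def first_failure(self) -> Optional[Span]:
        """Inner spans close before their parents, so the first recorded
        error is the deepest failing step."""
        return next((s for s in self.spans if s.status == "ERROR"), None)


def run_agent(tracer: Tracer) -> None:
    with tracer.span("support_agent", "AGENT"):
        with tracer.span("plan", "LLM"):
            pass  # a model call would go here
        with tracer.span("lookup_order", "TOOL"):
            raise TimeoutError("order service did not respond")


tracer = Tracer()
try:
    run_agent(tracer)
except TimeoutError:
    pass

failed = tracer.first_failure()
print(failed.name, failed.kind, failed.error)
```

In this sketch the failing tool call is recorded as an error span whose parent is the agent span, which is the kind of structure that lets a trace viewer point at the step where a run broke.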
In practice
Your agent works in demos but fails unpredictably in production. You instrument it with Arize, trace every prompt, tool call, and route, and find the exact step where it breaks.
You want to know whether a prompt change actually improved quality, not just feels better. You run LLM as a judge evaluations on curated datasets and on live traffic, and compare results before shipping.
You run both classic ML models and LLM agents and are tired of two monitoring stacks. Arize watches drift and performance on the models and traces and evaluates the agents in one place.
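The prompt comparison scenario above can be sketched as a small offline experiment: run two versions of an application over the same curated dataset, score each answer with a judge, and compare the means. This is a standard-library illustration, not Arize's evaluation API. The judge is a deterministic stub standing in for an LLM, and `stub_judge`, `run_experiment`, and the canned answers are all hypothetical.

```python
from statistics import mean
from typing import Callable, Dict, List

Example = Dict[str, str]


def stub_judge(question: str, answer: str, reference: str) -> float:
    """Stand-in for an LLM judge: 1.0 if the reference appears in the answer, else 0.0."""
    return 1.0 if reference.lower() in answer.lower() else 0.0


def run_experiment(
    app: Callable[[str], str],
    dataset: List[Example],
    judge: Callable[[str, str, str], float],
) -> float:
    """Run the app over every example and return the mean judge score."""
    return mean(
        judge(ex["question"], app(ex["question"]), ex["reference"]) for ex in dataset
    )


dataset: List[Example] = [
    {"question": "What is the capital of France?", "reference": "Paris"},
    {"question": "What is 2 + 2?", "reference": "4"},
    {"question": "Who wrote Hamlet?", "reference": "Shakespeare"},
    {"question": "What is the chemical symbol for gold?", "reference": "Au"},
]

# Two hypothetical prompt versions, faked here as canned answer tables.
ANSWERS_V1 = {
    "What is the capital of France?": "The capital is Paris.",
    "What is 2 + 2?": "The answer is 4.",
    "Who wrote Hamlet?": "Christopher Marlowe.",
    "What is the chemical symbol for gold?": "The symbol is Ag.",
}
ANSWERS_V2 = {
    **ANSWERS_V1,
    "Who wrote Hamlet?": "William Shakespeare.",
}


def app_v1(question: str) -> str:
    return ANSWERS_V1[question]


def app_v2(question: str) -> str:
    return ANSWERS_V2[question]


score_v1 = run_experiment(app_v1, dataset, stub_judge)
score_v2 = run_experiment(app_v2, dataset, stub_judge)
ship = score_v2 > score_v1  # a simple gate; real gates usually add a margin

print(f"v1={score_v1:.2f} v2={score_v2:.2f} ship={ship}")
```

Holding the dataset and judge fixed while only the prompt changes is what makes the two scores comparable.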
Sources & related URLs
Related / legacy domains
Research sources
Capability coverage
7.5 / 14 capabilities · 54%
| Capability | Assessment | Coverage |
|---|---|---|
| Integrations & Tool Calling | Broad framework and provider integrations for instrumentation (OpenAI, Anthropic, LangChain, LangGraph, CrewAI, LlamaIndex) and the Alyx agent uses tools to debug, but Arize does not itself provide tool calling for end user agents. | Partial |
| Workflow Orchestration | Offers closed loop improvement workflows and experiments, but it observes and evaluates agents rather than orchestrating their runtime execution, sequencing, or branching. | Unable to verify |
| Knowledge Grounding & RAG | Traces and evaluates RAG pipelines to debug retrieval quality, but does not provide a retrieval or knowledge grounding layer of its own. | Unable to verify |
| Human Oversight & Guardrails | Strong human in the loop quality tooling: human annotation, labeling queues, user feedback tracking, and evaluation gates for CI/CD. Not a runtime guardrail or policy layer that intercepts and blocks agent actions in production. | Partial |
| Security, Identity & Governance | SSO via Google and GitHub on all tiers, enterprise SSO (Okta, AzureAD/EntraID) with enforcement, organization and space level RBAC, service accounts, audit logs, GDPR, SOC2 Type II, and HIPAA on the Enterprise tier, with US, EU, or CA data regions. | Full |
| Observability & Auditability | Core product. Agent level tracing of prompts, tools, memory, and routing on OpenInference and OpenTelemetry, real time monitoring, custom metrics and dashboards, token and cost tracking, and audit logs. | Full |
| Memory & State Persistence | Stores agent trajectories and context as telemetry at scale via adb with configurable retention, but this is observability storage and replay, not a runtime memory layer for agents. | Unable to verify |
| Deployment & Data Residency | SaaS managed cloud, self hosted on the Enterprise path and via open source Phoenix, US, EU, or CA data regions, plus a self hosting add on with data residency and multi region deployment. | Full |
| Prebuilt Agents, Templates & Packs | Ships a library of prebuilt evaluators (Evaluator Hub), an open source evals library, prebuilt dashboards and monitors, and cookbooks that speed time to value, but not prebuilt end user agents. | Partial |
| Triggers & Channel Coverage | Real time monitors and alerts fire on metric, latency, and failure conditions, but Arize provides no agent invocation channels, schedulers, or conversational surfaces of its own. | Partial |
| Model Flexibility & Routing | Agnostic to the models and frameworks it observes and lets teams choose the evaluator or judge model, but it does not route production model traffic. | Partial |
| APIs, SDKs & MCP Extensibility | Python and JavaScript SDKs, OpenTelemetry and OpenInference open standards, UI and SDK exports, file imports and exports, cloud DB sync, and a CLI usable from Cursor and Claude Code. | Full |
| Testing, Debugging & Optimization | Core product. Offline and online evaluation with LLM as a judge and code evals, datasets and experiments, agent path evals, a prompt IDE, prompt learning optimization, and a debugging playground. | Full |
| Browser & Computer Use | Not applicable. Arize is an observability and evaluation platform and does not provide browser automation or computer use. | Unable to verify |
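The headline figure of 7.5 / 14 is consistent with counting each Full rating as 1, each Partial as 0.5, and each Unable to verify as 0. That weighting is inferred from the table rather than stated by the source, so treat the sketch below as a plausibility check under that assumption.

```python
from collections import Counter

# Ratings copied from the table above, in row order.
RATINGS = [
    "Partial",           # Integrations & Tool Calling
    "Unable to verify",  # Workflow Orchestration
    "Unable to verify",  # Knowledge Grounding & RAG
    "Partial",           # Human Oversight & Guardrails
    "Full",              # Security, Identity & Governance
    "Full",              # Observability & Auditability
    "Unable to verify",  # Memory & State Persistence
    "Full",              # Deployment & Data Residency
    "Partial",           # Prebuilt Agents, Templates & Packs
    "Partial",           # Triggers & Channel Coverage
    "Partial",           # Model Flexibility & Routing
    "Full",              # APIs, SDKs & MCP Extensibility
    "Full",              # Testing, Debugging & Optimization
    "Unable to verify",  # Browser & Computer Use
]

# Assumed weighting, inferred from the published total.
WEIGHTS = {"Full": 1.0, "Partial": 0.5, "Unable to verify": 0.0}

counts = Counter(RATINGS)
score = sum(WEIGHTS[r] for r in RATINGS)
percent = round(100 * score / len(RATINGS))

print(f"{score} / {len(RATINGS)} capabilities, {percent}%")
```

Five Full and five Partial ratings give 5 + 2.5 = 7.5, and 7.5 / 14 rounds to 54%.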
Pricing
From $50/mo · free tier + open source
Flat monthly plan with usage overage on trace spans and ingestion (GB)
Included quota
AX Free includes 25,000 trace spans and 1 GB ingestion per month with 15 day retention. AX Pro includes 50,000 spans and 10 GB per month with 30 day retention. Enterprise allowances are custom.
What is public
Arize publishes full self serve pricing: AX Free and Phoenix open source are free, AX Pro is $50 per month, and AX Enterprise is custom. Span and ingestion allowances and overage rates are listed per tier.
Billing mechanics
Tiers meter trace spans and data ingestion per month with fixed retention windows. AX Pro is a flat $50 monthly fee that includes 50,000 spans and 10 GB; beyond that, spans bill at $0.0008 each and ingestion at $3 per GB. Enterprise replaces fixed limits with negotiated allowances.
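As a rough illustration of the AX Pro arithmetic, the sketch below applies the listed rates to a month's usage. It assumes span and ingestion overage are charged independently and that partial gigabytes are billed pro rata; neither detail is confirmed by the source, and the function name `ax_pro_monthly_cost` is hypothetical.

```python
from decimal import Decimal

# Figures from the AX Pro plan as listed above.
BASE_FEE = Decimal("50")          # flat monthly fee in USD
INCLUDED_SPANS = 50_000
INCLUDED_GB = Decimal("10")
SPAN_OVERAGE = Decimal("0.0008")  # USD per additional span
GB_OVERAGE = Decimal("3")         # USD per additional GB


def ax_pro_monthly_cost(spans: int, ingested_gb: Decimal) -> Decimal:
    """Estimate one month's AX Pro bill in USD.

    Assumes overage applies only to usage above the included quota and
    that fractional GB are billed pro rata (an assumption, not documented).
    """
    extra_spans = max(spans - INCLUDED_SPANS, 0)
    extra_gb = max(ingested_gb - INCLUDED_GB, Decimal("0"))
    return BASE_FEE + extra_spans * SPAN_OVERAGE + extra_gb * GB_OVERAGE


within_quota = ax_pro_monthly_cost(50_000, Decimal("10"))
moderate = ax_pro_monthly_cost(150_000, Decimal("12"))
heavy = ax_pro_monthly_cost(1_000_000, Decimal("10"))

print(within_quota, moderate, heavy)
```

Under these assumptions, 150,000 spans and 12 GB in a month come to $50 + 100,000 × $0.0008 + 2 × $3 = $136, and one million spans at 10 GB come to $810, which shows how span volume drives the bill once the quota is exceeded.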
Cost watchouts
Span and ingestion overage on AX Pro can grow with high volume agents. The gap between $50 Pro and custom Enterprise is large with no middle tier.
Variable cost rationale
The $50 Pro plan includes generous span and ingestion allowances, but high traffic agents that exceed 50k spans or 10 GB a month accrue per span and per GB overage, and serious scale pushes teams to a custom Enterprise contract.
Additional watchouts
Offline evaluation on data older than the tier retention window (15 or 30 days) is not available until Enterprise. Compliance certifications and enterprise SSO are Enterprise only.
Overage / add-ons
On AX Pro, additional trace spans are $0.0008 each and additional ingestion is $3 per GB. Enterprise overage is negotiated.
Sales call required
Mixed (some tiers require a call)
Free / trial
AX Free (25k spans/mo, 1 GB, 15 day retention) and Phoenix open source, both free
Lowest paid plan
AX Pro: $50/mo
Commercial notes
Phoenix open source is a genuine free path with no feature gates for teams that self host. The paid path runs from $50 Pro to a custom Enterprise contract, with compliance (SOC2 Type II, HIPAA), audit logs, enterprise SSO, dedicated support, and adb Data Fabric reserved for Enterprise.
Key ambiguities
Total cost at scale depends on monthly span volume and ingestion, and on whether compliance needs force a move to the custom Enterprise tier.
Cancellation / refund
AX Free and AX Pro are self serve and month to month. Enterprise terms are contractual. Public cancellation and refund details are limited.
Support SLA / resale
Community support on Free, email support on Pro, dedicated support with a contractual uptime SLA on Enterprise.
Missing data
Enterprise pricing is custom and not listed; third party reports put it in the tens of thousands of dollars per year. Startup pricing is available by application.
Related vendors
- AgentOps — Agent observability and reliability platform with broad model and…
- Agno — High-performance agent runtime and framework (formerly Phidata) with…
- Apify — Cloud platform for web scraping and automation with 45,000+ prebuilt…
- Arcade — Authenticated tool calling platform and MCP runtime that handles…
- Braintrust — AI evaluation and observability platform with self-serve pricing,…
- CalypsoAI — Enterprise AI security platform that red teams, defends, and…