
Arize AI

Also known as: Arize AX, Phoenix


Agent infrastructure · independent · Verified 2026-06-30

AI observability and evaluation platform that traces, evaluates, and monitors LLM applications and agents, with an open source core in Phoenix.

Arize AI is an observability and evaluation platform for AI applications and agents, used to trace what a system did, measure its quality, and monitor and improve it in production. The company was founded in January 2020 by Jason Lopatecki and Aparna Dhinakaran in Berkeley, California, and predates most LLM era tools, so it covers both traditional machine learning and generative AI from one platform. It comes in two forms: Phoenix, an open source project, and Arize AX, the managed enterprise platform built on the same standards.

Phoenix is the open source core, with tracing, evaluation, experimentation, and prompt iteration that teams can run locally, in a notebook, or self hosted with no feature gates. It is built on OpenInference and OpenTelemetry, the open standards Arize helped define, which means instrumentation is not locked to a proprietary trace format and data can also flow to backends like Jaeger, Prometheus, or Grafana. Phoenix integrates with OpenAI, Anthropic, LangChain, LangGraph, CrewAI, and LlamaIndex.

Arize AX adds managed infrastructure on top, backed by a purpose built datastore called adb that holds agent trajectories at petabyte scale with sub second queries and syncs to BigQuery, Databricks, or Snowflake. AX provides agent level tracing across prompts, tools, memory, and routing; online and offline evaluation using LLM as a judge for accuracy, tool calling, planning, and goal achievement; a prompt IDE; datasets and experiments; real time monitoring with alerts; and Alyx, an AI engineering agent that runs evals and debugs issues from the terminal or inside tools like Cursor and Claude Code.

Arize reports processing more than a trillion spans and fifty million evaluations a month across customers like DoorDash, Uber, Reddit, Instacart, PepsiCo, and Siemens, and it was selected by the U.S. Navy for a mission critical program. Pricing is public and self serve: AX Free and Phoenix are free, AX Pro is $50 a month with higher limits and longer retention, and AX Enterprise is custom and adds SOC2 Type II, HIPAA, dedicated support, an uptime SLA, and self hosting with data residency.

Vendor details

Canonical URL

https://arize.com

Subcategory

Observability and evaluation

Funding status

Founded in January 2020 by Jason Lopatecki (CEO) and Aparna Dhinakaran (Chief Product Officer) in Berkeley, California. The company has raised about $131M across rounds through Series C and acquired Velvet in 2025. It remains independent.

Company status

independent

Use cases & customers

Primary use cases

agent observability · LLM evaluation · agent tracing · prompt optimization · production monitoring

Target customers

developers · enterprise

Deployment options

SaaS · self-hosted · on-prem

Integrations

OpenTelemetry and OpenInference native instrumentation with framework integrations for OpenAI, Anthropic, LangChain, LangGraph, CrewAI, and LlamaIndex. Arize also offers Python and JavaScript SDKs, trace export to backends like Jaeger, Prometheus, and Grafana, adb Data Fabric sync to BigQuery, Databricks, and Snowflake, and a CLI usable from Cursor and Claude Code.

In practice

Your agent works in demos but fails unpredictably in production. You instrument it with Arize, trace every prompt, tool call, and route, and find the exact step where it breaks.

You want to know whether a prompt change actually improved quality rather than just feeling better. You run LLM as a judge evaluations on curated datasets and on live traffic, and compare the results before shipping.

You run both classic ML models and LLM agents and are tired of two monitoring stacks. Arize monitors drift and performance for the models, and traces and evaluates the agents, all in one place.

Sources & related URLs

Related / legacy domains

https://arize.com/phoenix
https://arize.com/docs/ax
https://github.com/Arize-ai/phoenix

Research sources

https://arize.com
https://arize.com/pricing/
https://arize.com/phoenix

Capability coverage

7.5 / 14 capabilities · 54%

Integrations & Tool Calling (Partial): Broad framework and provider integrations for instrumentation (OpenAI, Anthropic, LangChain, LangGraph, CrewAI, LlamaIndex) and the Alyx agent uses tools to debug, but Arize does not itself provide tool calling for end user agents.

Workflow Orchestration (Unable to verify): Offers closed loop improvement workflows and experiments, but it observes and evaluates agents rather than orchestrating their runtime execution, sequencing, or branching.

Knowledge Grounding & RAG (Unable to verify): Traces and evaluates RAG pipelines to debug retrieval quality, but does not provide a retrieval or knowledge grounding layer of its own.

Human Oversight & Guardrails (Partial): Strong human in the loop quality tooling: human annotation, labeling queues, user feedback tracking, and evaluation gates for CI/CD. Not a runtime guardrail or policy layer that intercepts and blocks agent actions in production.

Security, Identity & Governance (Full): SSO via Google and GitHub on all tiers, enterprise SSO (Okta, AzureAD/EntraID) with enforcement, organization and space level RBAC, service accounts, audit logs, GDPR, SOC2 Type II, and HIPAA on the Enterprise tier, with US, EU, or CA data regions.

Observability & Auditability (Full): Core product. Agent level tracing of prompts, tools, memory, and routing on OpenInference and OpenTelemetry, real time monitoring, custom metrics and dashboards, token and cost tracking, and audit logs.

Memory & State Persistence (Unable to verify): Stores agent trajectories and context as telemetry at scale via adb with configurable retention, but this is observability storage and replay, not a runtime memory layer for agents.

Deployment & Data Residency (Full): SaaS managed cloud, self hosted on the Enterprise path and via open source Phoenix, US, EU, or CA data regions, plus a self hosting add on with data residency and multi region deployment.

Prebuilt Agents, Templates & Packs (Partial): Ships a library of prebuilt evaluators (Evaluator Hub), an open source evals library, prebuilt dashboards and monitors, and cookbooks that speed time to value, but not prebuilt end user agents.

Triggers & Channel Coverage (Partial): Real time monitors and alerts fire on metric, latency, and failure conditions, but Arize provides no agent invocation channels, schedulers, or conversational surfaces of its own.

Model Flexibility & Routing (Partial): Agnostic to the models and frameworks it observes and lets teams choose the evaluator or judge model, but it does not route production model traffic.

APIs, SDKs & MCP Extensibility (Full): Python and JavaScript SDKs, OpenTelemetry and OpenInference open standards, UI and SDK exports, file imports and exports, cloud DB sync, and a CLI usable from Cursor and Claude Code.

Testing, Debugging & Optimization (Full): Core product. Offline and online evaluation with LLM as a judge and code evals, datasets and experiments, agent path evals, a prompt IDE, prompt learning optimization, and a debugging playground.

Browser & Computer Use (Unable to verify): Not applicable. Arize is an observability and evaluation platform and does not provide browser automation or computer use.
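
The headline coverage figure is consistent with a simple weighting of the ratings listed here. The sketch below reproduces it under an assumed scheme that this page does not state: Full counts as 1, Partial as 0.5, and Unable to verify as 0. The `WEIGHTS` mapping and the `coverage` helper are illustrative names, not part of any published scoring method.

```python
# Assumed weights (not stated on the page; inferred from the published total).
WEIGHTS = {"Full": 1.0, "Partial": 0.5, "Unable to verify": 0.0}

# Ratings copied from the capability list above.
ratings = {
    "Integrations & Tool Calling": "Partial",
    "Workflow Orchestration": "Unable to verify",
    "Knowledge Grounding & RAG": "Unable to verify",
    "Human Oversight & Guardrails": "Partial",
    "Security, Identity & Governance": "Full",
    "Observability & Auditability": "Full",
    "Memory & State Persistence": "Unable to verify",
    "Deployment & Data Residency": "Full",
    "Prebuilt Agents, Templates & Packs": "Partial",
    "Triggers & Channel Coverage": "Partial",
    "Model Flexibility & Routing": "Partial",
    "APIs, SDKs & MCP Extensibility": "Full",
    "Testing, Debugging & Optimization": "Full",
    "Browser & Computer Use": "Unable to verify",
}


def coverage(ratings):
    """Return (weighted score, capability count, rounded percentage)."""
    score = sum(WEIGHTS[status] for status in ratings.values())
    total = len(ratings)
    return score, total, round(100 * score / total)


score, total, pct = coverage(ratings)
print(f"{score} / {total} capabilities · {pct}%")
```

Five Full and five Partial ratings give 5 + 2.5 = 7.5 of 14, which rounds to 54%, matching the figure shown for this vendor.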

Recent platform changes

No recent material changes tracked yet.


Pricing

From $50/mo · free tier + open source

Flat monthly plan with usage overage on trace spans and ingestion (GB)

Public (exact) · Medium variable cost · Free tier

Included quota

AX Free includes 25,000 trace spans and 1 GB ingestion per month with 15 day retention. AX Pro includes 50,000 spans and 10 GB per month with 30 day retention. Enterprise allowances are custom.

What is public

Arize publishes full self serve pricing: AX Free and Phoenix open source are free, AX Pro is $50 per month, and AX Enterprise is custom. Span and ingestion allowances and overage rates are listed per tier.

Billing mechanics

Tiers meter trace spans and data ingestion per month with fixed retention windows. AX Pro is a flat $50 monthly fee that includes 50,000 spans and 10 GB; beyond that, spans bill at $0.0008 each and ingestion at $3 per GB. Enterprise replaces fixed limits with negotiated allowances.
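
As a rough illustration of how these numbers combine, the sketch below estimates an AX Pro monthly bill from the rates quoted above. It assumes overage is charged linearly per span and per fractional GB with no block rounding, minimums, or taxes, which the page does not confirm; `estimate_pro_monthly_cost` and the constant names are hypothetical helpers, not an Arize API.

```python
# Rates and allowances as listed on this page for AX Pro.
PRO_BASE_USD = 50.0
PRO_INCLUDED_SPANS = 50_000
PRO_INCLUDED_GB = 10.0
SPAN_OVERAGE_USD = 0.0008  # per additional trace span
GB_OVERAGE_USD = 3.0       # per additional GB ingested


def estimate_pro_monthly_cost(spans: int, ingestion_gb: float) -> float:
    """Estimate one month of AX Pro, assuming linear per-unit overage."""
    extra_spans = max(0, spans - PRO_INCLUDED_SPANS)
    extra_gb = max(0.0, ingestion_gb - PRO_INCLUDED_GB)
    total = (
        PRO_BASE_USD
        + extra_spans * SPAN_OVERAGE_USD
        + extra_gb * GB_OVERAGE_USD
    )
    return round(total, 2)


# Within the included quota: only the flat fee applies.
print(estimate_pro_monthly_cost(40_000, 8))      # 50.0
# 1M spans and 25 GB: 950,000 extra spans and 15 extra GB.
print(estimate_pro_monthly_cost(1_000_000, 25))  # 855.0
```

At one million spans a month the span overage alone (950,000 × $0.0008 = $760) far exceeds the flat fee, so the monthly bill at that volume is driven by span count rather than by the $50 base.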

Cost watchouts

Span and ingestion overage on AX Pro can grow with high volume agents. The gap between $50 Pro and custom Enterprise is large with no middle tier.

Variable cost rationale

The $50 Pro plan includes 50,000 spans and 10 GB of ingestion a month. High traffic agents that exceed either allowance accrue per span and per GB overage, and serious scale pushes teams to a custom Enterprise contract.

Additional watchouts

Offline evaluation on data older than the tier retention window (15 or 30 days) is not available until Enterprise. Compliance certifications and enterprise SSO are Enterprise only.

Overage / add-ons

On AX Pro, additional trace spans are $0.0008 each and additional ingestion is $3 per GB. Enterprise overage is negotiated.

Sales call required

Mixed (some tiers require a call)

Free / trial

AX Free (25k spans/mo, 1 GB, 15 day retention) and Phoenix open source, both free

Lowest paid plan

AX Pro: $50/mo

Commercial notes

Phoenix open source is a genuine free path with no feature gates for teams that self host. The paid path runs from $50 Pro to a custom Enterprise contract, with compliance (SOC2 Type II, HIPAA), audit logs, enterprise SSO, dedicated support, and adb Data Fabric reserved for Enterprise.

Key ambiguities

Total cost at scale depends on monthly span volume and ingestion, and on whether compliance needs force a move to the custom Enterprise tier.

Cancellation / refund

AX Free and AX Pro are self serve and month to month. Enterprise terms are contractual. Public cancellation and refund details are limited.

Support SLA / resale

Community support on Free, email support on Pro, dedicated support with a contractual uptime SLA on Enterprise.

Missing data

Enterprise pricing is custom and not listed; third party reports put it in the tens of thousands of dollars per year. Startup pricing is available by application.

Verified 2026-06-30

Official pricing page

Data confidence: high