
Tabby

Also known as: Tabby, TabbyML, tabbyml.com, TabbyML/tabby


Coding agent · independent · Verified 2026-06-30

Open source, self hosted AI coding assistant that runs code completion, a codebase answer engine, and inline chat entirely on your own infrastructure using open models.

Tabby is an open source, self hosted AI coding assistant built by TabbyML for teams that want modern AI help without sending code to a third party. Written in Rust and distributed under a permissive license, it runs as a single self contained server via Docker, a standalone binary, or Homebrew, and needs no external database or cloud service. It loads an open coding model into a graphics processor and serves suggestions over a clean REST interface, running on consumer grade NVIDIA, Apple, and AMD hardware. The project is mature and widely used, with tens of thousands of stars and a steady release cadence.

Its core is fast code completion tuned for the fill-in-the-middle pattern that autocomplete relies on, with adaptive caching that returns suggestions in under a second. Around that, Tabby adds an Answer Engine that lets a developer ask questions about the codebase and get grounded explanations in the editor, an inline chat for real time back and forth, and a code browser. Context Providers pull in extra sources (documentation, configuration files, Git repositories, and external interfaces) so the assistant understands a project rather than a single file, and custom documentation can be added through the interface. A newer agent feature extends Tabby beyond suggestion toward multi step assistance.
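The fill-in-the-middle pattern and completion caching can be sketched roughly as follows. This is an illustration only, not Tabby's implementation: the sentinel tokens follow the StarCoder convention (other model families use different tokens), `fake_model` is a hypothetical stand-in for a real model call, and a plain least-recently-used cache is swapped in for Tabby's adaptive caching, whose details are not described here.

```python
from collections import OrderedDict

# Sentinel tokens in the StarCoder style; other model families use different tokens.
FIM_PREFIX, FIM_SUFFIX, FIM_MIDDLE = "<fim_prefix>", "<fim_suffix>", "<fim_middle>"


def build_fim_prompt(prefix, suffix):
    """Arrange the code before and after the cursor so the model generates the middle."""
    return f"{FIM_PREFIX}{prefix}{FIM_SUFFIX}{suffix}{FIM_MIDDLE}"


class CompletionCache:
    """Tiny LRU cache keyed by the (prefix, suffix) pair around the cursor."""

    def __init__(self, capacity=128):
        self.capacity = capacity
        self._entries = OrderedDict()  # maps (prefix, suffix) -> completion text

    def complete(self, prefix, suffix, model):
        key = (prefix, suffix)
        if key in self._entries:
            self._entries.move_to_end(key)  # mark as most recently used
            return self._entries[key]
        result = model(build_fim_prompt(prefix, suffix))
        self._entries[key] = result
        if len(self._entries) > self.capacity:
            self._entries.popitem(last=False)  # evict the least recently used entry
        return result


calls = []


def fake_model(prompt):
    """Hypothetical model stub that records each prompt it receives."""
    calls.append(prompt)
    return "a + b"


cache = CompletionCache(capacity=2)
first = cache.complete("def add(a, b):\n    return ", "\n", fake_model)
second = cache.complete("def add(a, b):\n    return ", "\n", fake_model)  # cache hit
```

The second call returns the stored completion without invoking the model again, which is the basic reason caching keeps repeated suggestions fast.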

Tabby is deliberately model agnostic. It is compatible with major open coding models such as StarCoder, CodeLlama, and Qwen coder variants, and teams can choose and combine a completion model, a chat model, and an embeddings model, or fine tune on their own code. It integrates as a plugin across VS Code, the JetBrains family, Vim, Neovim, and Emacs, and works with cloud editors. Because everything runs on infrastructure the team controls (a workstation, a dedicated server, or a Kubernetes cluster), code never leaves that boundary, which is the main reason regulated finance, healthcare, and defense teams adopt it.

For administration, Tabby ships an admin dashboard for user accounts, model management, and usage analytics, and supports directory based authentication for team deployments. It is genuinely free: there are no seats, subscriptions, or usage limits, and the only real cost is the hardware or cloud compute that runs the server, which can fall below the total cost of commercial per seat subscriptions once a team is large enough. The tradeoff is operational, since a team needs the capacity to run and maintain model serving. On an agent capability rubric, Tabby scores as an assistant first tool: strong on private deployment and model choice, lighter on autonomous orchestration, testing, and computer use, which remain emerging or out of scope.

Vendor details

Canonical URL

https://tabbyml.com

Subcategory

Open source self hosted coding assistant (completion, answer engine, chat)

Funding status

TabbyML is the company behind the open source Tabby project, which is distributed under a permissive open source license. Specific venture funding is not disclosed in available sources. Tabby is a mature and actively maintained project, reporting more than thirty thousand stars, over a hundred contributors, and a long release history, with recent versions shipping through early 2026.

Company status

independent

Use cases & customers

Primary use cases

private self hosted code completion
codebase question answering in the IDE
AI assistance for regulated and air gapped teams
model agnostic coding assistant on owned hardware

Target customers

privacy first and regulated teams (finance, healthcare, defense)
organizations that run their own infrastructure
teams wanting an open source alternative to cloud coding assistants

Deployment options

Open-source self-hosted (Docker, standalone binary, Homebrew)
On-premises
Private cloud / Kubernetes
Consumer-grade GPUs (NVIDIA CUDA, Apple Metal, AMD ROCm)
Fully offline, no external DBMS or cloud

Integrations

Runs as a self hosted server exposing a REST API for completion, chat, and answers, and connects to IDEs through official plugins for VS Code, the JetBrains family, Vim, Neovim, and Emacs, plus cloud editors. Context Providers ingest documentation, configuration files, Git repositories, and external interfaces to ground the assistant, and custom documentation can be added over the API. An admin dashboard manages users, models, and usage. Tabby focuses on completion, answers, and chat rather than acting on external tools, though a newer agent feature is expanding this.
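As a rough sketch of what calling the REST API from a script might look like, the snippet below builds (but does not send) a completion request. The `/v1/completions` path, the `language` and `segments` fields, the bearer token header, and the `localhost:8080` address are all assumptions here and should be checked against the API reference served by your own Tabby instance.

```python
import json
import urllib.request


def build_completion_request(base_url, token, language, prefix, suffix=""):
    """Build a POST request for a code completion without sending it."""
    body = {
        "language": language,
        # Code before and after the cursor (assumed field names).
        "segments": {"prefix": prefix, "suffix": suffix},
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/completions",  # assumed endpoint path
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",  # assumed auth scheme
        },
        method="POST",
    )


request = build_completion_request(
    "http://localhost:8080",  # hypothetical local server address
    "YOUR_TOKEN",             # placeholder, not a real credential
    "python",
    "def fib(n):\n    ",
)
```

Passing `request` to `urllib.request.urlopen` would send it to a running server; the builder is kept separate so the payload can be inspected without one.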

In practice

You cannot send source code to a cloud service. Tabby runs entirely on your own hardware or private cloud, serving completion, chat, and answers with no external calls, so code never leaves your infrastructure.

You want to pick your own model. Tabby is model agnostic, running open coders like StarCoder, CodeLlama, or Qwen, and you can combine a completion model with a chat model or fine tune on your codebase.

Your team is large enough that per seat subscriptions add up. Tabby is free and open source with no seat or usage fees, so your only cost is the hardware or cloud compute running the server.

Sources & related URLs

Related / legacy domains

https://www.tabbyml.com/pricing
https://github.com/TabbyML/tabby

Research sources

https://www.tabbyml.com/
https://github.com/TabbyML/tabby
https://tabby.tabbyml.com/docs/welcome/
https://similarlabs.com/p/tabby-ai-coding-assistant
https://aidevsetup.com/ide/tabbyml

Capability coverage

5.5 / 14 capabilities · 39%

Integrations & Tool Calling: Partial
Runs a self hosted server with a REST interface and official plugins for VS Code, the JetBrains family, Vim, Neovim, and Emacs, and uses Context Providers to pull in documentation, configuration files, Git repositories, and external interfaces, a real integration surface oriented to ingesting context and serving suggestions rather than autonomously acting on external tools.

Workflow Orchestration: Partial
Ships a newer agent feature that extends Tabby beyond single suggestion completion toward multi step assistance, a first class but still emerging capability, so genuine autonomous end to end orchestration is present in a light and early form rather than as a mature multi agent engine.

Knowledge Grounding & RAG: Partial
Grounds suggestions and its Answer Engine in project context using Tree Sitter parsing, repository indexing, embeddings, and Context Providers that ingest documentation and configuration, a real retrieval and context capability, though grounding is retrieval and provider based rather than a deep codebase knowledge graph presented as the headline.

Human Oversight & Guardrails: Partial
Keeps the developer in control by design, since completions and inline chat edits are proposed for the developer to accept or reject, and team deployments add administrator managed accounts and access, real oversight of a propose and approve kind rather than a dedicated runtime guardrail enforcement engine.

Security, Identity & Governance: Partial
Runs fully self hosted so code never leaves the team's infrastructure, adds directory based authentication for team deployments, and manages users and access through an admin panel, real data residency and access control, though without a public certification or single sign on and fine grained governance matrix.

Observability & Auditability: Partial
Provides an admin dashboard to create user accounts, manage models, and view usage analytics, plus notifications for background jobs, real operational visibility into usage and administration, short of a comprehensive agent execution tracing, audit log, and analytics suite.

Memory & State Persistence: Unable to verify
Can turn Answer Engine results into persistent, shareable pages and caches completions for speed, but these are saved answers and performance caching, not a persistent agent memory or checkpoint and rollback capability, which is not documented as first class.

Deployment & Data Residency: Full
Is written in Rust as a self contained server that needs no external database or cloud service and installs through Docker, a binary, Homebrew, or Kubernetes, running on consumer grade NVIDIA, Apple, or AMD hardware entirely on the team's own infrastructure, a strong self host and data residency capability.

Prebuilt Agents, Templates & Packs: Unable to verify
Ships a single assistant covering completion, answers, and chat with configurable context providers, which is configuration rather than a library of prebuilt agents, templates, or packs to browse and adopt, or multi agent scaffolding, so this is not documented as first class.

Triggers & Channel Coverage: Partial
Reaches developers through plugins for VS Code, the JetBrains family, Vim, Neovim, and Emacs, cloud editors, and a web interface, real multi editor channel coverage that is developer invoked during coding rather than driven by broad external event triggers.

Model Flexibility & Routing: Partial
Is model agnostic and compatible with major open coding models such as StarCoder, CodeLlama, and Qwen coders, letting teams choose and combine a completion model, a chat model, and embeddings or fine tune on their own code, real model flexibility, though it is self hosted model choice rather than a per task routing gateway across providers.

APIs, SDKs & MCP Extensibility: Partial
Exposes a clean REST interface for completion, chat, and answers and lets teams add custom documentation and context sources over the interface, and the whole project is open source for deeper customization, a solid API and extensibility surface, though a dedicated software development kit and Model Context Protocol server are not documented as first class.

Testing, Debugging & Optimization: Unable to verify
Focuses on code completion, answers, and chat and does not ship a dedicated testing, debugging, or code quality engine, and no first class capability for generating tests, detecting bugs, or evaluating and optimizing code is documented.

Browser & Computer Use: Unable to verify
Serves completions, answers, and chat and does not document code execution, a terminal, a sandbox, or browser automation as part of the product, so first class computer use is not present in the core assistant, with any execution limited to the emerging agent feature.
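
The headline coverage figure is consistent with a simple weighting of the statuses listed above. The weights in this sketch (Full = 1, Partial = 0.5, Unable to verify = 0) are an assumption inferred from the published total, not a documented scoring rule.

```python
# Assumed weights, inferred from the published total rather than documented.
WEIGHTS = {"Full": 1.0, "Partial": 0.5, "Unable to verify": 0.0}

# Status counts taken from the capability list above.
status_counts = {"Full": 1, "Partial": 9, "Unable to verify": 4}

total = sum(status_counts.values())
score = sum(WEIGHTS[status] * count for status, count in status_counts.items())
percent = round(100 * score / total)

print(f"{score} / {total} capabilities · {percent}%")  # 5.5 / 14 capabilities · 39%
```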

Recent platform changes

No recent material changes tracked yet.


Pricing

Free and open source. No seats, subscriptions, or usage limits. The only cost is the hardware or cloud graphics compute that runs the self hosted server.

There is no product billing. Tabby is free and open source, so the only cost is the infrastructure that runs the self hosted server, whether a workstation, a dedicated server, or cloud graphics compute.

Public (exact) · Low variable cost · Free tier

Included quota

The full open source product with no usage limits: code completion, the Answer Engine, inline chat, a code browser, Context Providers, an admin dashboard, and directory based authentication, all self hosted. Teams supply and run their own models and hardware. There are no per seat or per call charges.

What is public

Public and clear: the product is free and open source with no seats, subscriptions, or usage limits, and cost is limited to infrastructure. Not detailed: any separate managed or hosted commercial offering.

Billing mechanics

No vendor billing. The open source server is free; teams run it on their own graphics hardware or cloud compute and pay only for that infrastructure.

Cost watchouts

The real cost is infrastructure and operations: a graphics processor to serve models, and the engineering time to deploy, secure, and maintain the server. Larger models need more capable hardware. There is no vendor bill, but self hosting shifts cost and responsibility to the team.

Variable cost rationale

Tabby serves self hosted open models, so there is no per token vendor billing. Cost is the infrastructure that runs the server, which is largely fixed for a given hardware footprint and scales in steps as a team adds capacity, not continuously with each request. Exposure is therefore low and predictable, dominated by hardware rather than usage metered fees.
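
The "scales in steps" behaviour can be illustrated with a toy model in which each server node supports a fixed number of developers. Both parameters are hypothetical placeholders, not Tabby measurements.

```python
import math


def monthly_infra_cost(developers, devs_per_node, node_cost):
    """Cost grows only when another node is needed, not with each request."""
    nodes = max(1, math.ceil(developers / devs_per_node))
    return nodes * node_cost


# Hypothetical figures: one node serves 25 developers and costs 600 per month.
costs = [
    monthly_infra_cost(n, devs_per_node=25, node_cost=600.0)
    for n in (1, 25, 26, 50, 51)
]
print(costs)  # [600.0, 600.0, 1200.0, 1200.0, 1800.0]
```

Cost stays flat between 1 and 25 developers and jumps only when the 26th requires a second node, which is why exposure is dominated by hardware footprint rather than request volume.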

Additional watchouts

Free software does not mean free to operate: budget for graphics hardware or cloud compute and for the engineering time to deploy, secure, and maintain the server and models.

Overage / add-ons

Not applicable. There are no usage meters or per call charges; capacity is bounded only by the hardware the team runs.

Sales call required

No — self-serve available

Free / trial

Tabby is entirely free and open source with no paid tier required. It is not a trial; the full product runs at no license cost on the team's own infrastructure.

Lowest paid plan

None required. The product is free and open source; there is no paid plan needed to use the full assistant.

Commercial notes

Tabby's economic pitch is that a self hosted, free, open source assistant beats per seat cloud subscriptions once a team is large enough, while keeping code private. The cost moves from a software subscription to hardware and operations, which suits teams that already run infrastructure and value control.
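
The break-even point behind this pitch can be estimated with simple arithmetic. The sketch below treats self-hosting as a flat monthly cost, which is a simplification, and every figure in it is hypothetical; real hardware, operations, and subscription prices vary.

```python
import math


def break_even_team_size(fixed_monthly_cost, seat_price):
    """Smallest team for which a flat self-hosting cost undercuts per-seat pricing."""
    if seat_price <= 0:
        raise ValueError("seat_price must be positive")
    # Self-hosting is cheaper once seats * seat_price exceeds the fixed cost.
    return math.floor(fixed_monthly_cost / seat_price) + 1


# Hypothetical: 900 per month for hardware and upkeep versus 20 per seat.
team_size = break_even_team_size(900.0, 20.0)
print(team_size)  # 46
```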

Key ambiguities

Because Tabby is free, cost planning is entirely about infrastructure: which model, how much graphics memory, and how many concurrent developers a node can serve. A pricing page exists, but the tool itself is free, so any managed or hosted offering, if present, is separate from the open source product.

Cancellation / refund

No subscription or contract exists. Tabby is free and open source; teams control their own infrastructure and can stop at any time.

Missing data

Whether TabbyML offers a separate paid managed or enterprise service, and any terms for it, are not detailed in available sources. The open source product itself is unambiguously free.

Verified 2026-06-30


Data confidence: high