Xata - Postgres for agent scale, open source | xata.io

Since launching our private beta in May 2025, Xata has been running in production across companies of all sizes. Over the past months, we’ve pushed the platform hard, refining it based on real-world usage.

Today, we’re taking the next step: we’re open-sourcing the core of Xata.

Companies are hitting a new bottleneck. Generating code is now cheap, but proving it works in production is still hard. As agents become part of the development process, this challenge only intensifies. Teams need to test against production-like data, safely, in isolation, and at scale, without high infrastructure costs or the risk of impacting production.

We’re open-sourcing Xata under the Apache 2.0 license to unlock enterprise adoption, giving teams the ability to run it, modify it, and integrate it within their own infrastructure.

⭐ Star the repository so next time you’re waiting on data to copy, you think of Xata.

Databases were not designed for agents

Agentic workloads come with a completely different set of requirements for databases.

Agents don’t behave like users or traditional services. They run in parallel, they retry, they explore multiple paths, and they often operate without coordination. That changes what the database needs to provide.

Isolation becomes critical.

Agents shouldn’t share the same environment. One agent’s changes shouldn’t affect another’s outcome. Each agent needs its own database, a safe place to run, experiment, and fail.

Databases need to be ephemeral.

Most of these databases don’t need to live for days or weeks. They live for minutes, just long enough for an agent to complete its task, and then scale to zero. The data isn’t deleted: only the compute disappears, while storage persists.

Databases need to be created instantly.

Agents can’t wait minutes for a database to be provisioned or copied. The database needs to be ready in under a second, on demand.

Real data becomes essential.

Testing against empty or synthetic data breaks down quickly. Agents need access to production-like data, safely, without risking sensitive information or production systems.

Cost can’t scale linearly.

If every agent requires its own database, the traditional model of duplicating storage simply doesn’t work. You need a way to create thousands of databases without multiplying storage costs.
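To make the storage argument concrete, here is a rough comparison between giving every database a full copy and storing only each database’s changes on top of one shared copy. The branch count, source size, and per-branch change volume are illustrative assumptions, not figures from Xata.

```python
# Illustrative storage comparison. All three inputs are assumptions
# chosen for the example, not measurements.
branches = 1_000              # databases created for agents
source_size_gb = 50           # size of the source database
changed_per_branch_mb = 100   # data each database actually modifies

# Traditional model: every database is a full physical copy.
full_copy_total_gb = branches * source_size_gb

# Shared model: one copy of the source plus each database's own changes.
shared_total_gb = source_size_gb + branches * changed_per_branch_mb / 1000

print(f"full copies:    {full_copy_total_gb:,.0f} GB")
print(f"shared storage: {shared_total_gb:,.0f} GB")
```

Under these assumptions, full copies grow with the number of databases times the source size, while the shared model grows only with the amount of data that actually changes.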

The scale becomes clear once you do the math.

Imagine an AI platform with 100,000 users.

If each user runs just 10 agents per day, that’s 1 million agent runs per day. And as agents become more autonomous, they won’t run one task at a time: they’ll branch, retry, and explore multiple paths in parallel.

If each of those agents needs multiple databases, even if only for a short period of time, you’re already operating at the scale of millions of databases.
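The arithmetic can be spelled out in a few lines. The user and agent counts come from the example above; the number of databases per agent is an assumption added here only to put a number on “multiple.”

```python
# Back-of-the-envelope scale estimate using the figures from the text.
users = 100_000
agents_per_user_per_day = 10

# Assumption (not from the article): each agent uses 3 short-lived databases.
databases_per_agent = 3

agents_per_day = users * agents_per_user_per_day
databases_per_day = agents_per_day * databases_per_agent

print(f"{agents_per_day:,} agent runs per day")
print(f"{databases_per_day:,} databases per day")
```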

That’s the world we’re building for.

100% Postgres compatible, by design.

From day one, we made a deliberate decision: build for agentic workloads while staying 100% vanilla Postgres, with no forks and no modifications.

We’ve learned this the hard way. Even small changes to Postgres can have unintended consequences: performance can degrade in subtle ways, and compatibility with the broader ecosystem becomes harder to maintain. Over time, maintaining a fork turns into a constant burden that slows everything down.

Instead of changing Postgres, we chose to build the platform around it.

Copy-on-write branching

At the core of the platform is copy-on-write branching.

Traditional cloning doesn’t work at scale. Copying a production database for every environment is slow, expensive, and quickly becomes impractical as data grows.

With copy-on-write, creating a branch takes the same time whether the source database is 50 GB or 5 TB. When a branch is created, it simply points to the same underlying data as the parent; no data is copied upfront. That’s why branching is instant.

As changes are made, only the modified data is written separately. Each branch keeps its own changes, while everything else continues to be shared, keeping storage usage low even with many branches.
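The idea can be illustrated with a toy in-memory model. This is a minimal sketch of copy-on-write in general, not Xata’s implementation, which works at the block storage layer; the `Branch` class and its methods are hypothetical names invented for this example.

```python
from __future__ import annotations
from typing import Optional


class Branch:
    """Toy copy-on-write volume: a branch stores only the blocks it has written."""

    def __init__(self, parent: Optional[Branch] = None) -> None:
        self.parent = parent                    # shared, read-only history
        self.own_blocks: dict[int, bytes] = {}  # block number -> data written here

    def read(self, block_no: int) -> bytes:
        # Walk up the chain until some layer holds its own copy of the block.
        node: Optional[Branch] = self
        while node is not None:
            if block_no in node.own_blocks:
                return node.own_blocks[block_no]
            node = node.parent
        raise KeyError(block_no)

    def write(self, block_no: int, data: bytes) -> None:
        # Writes land in this branch only; shared layers are never modified.
        self.own_blocks[block_no] = data

    def branch(self) -> Branch:
        # Freeze the current state into a shared layer, then hand out a new
        # empty overlay on top of it. No block data is copied, so the cost
        # does not depend on how much data the source holds.
        if self.own_blocks:
            frozen = Branch(parent=self.parent)
            frozen.own_blocks = self.own_blocks
            self.parent, self.own_blocks = frozen, {}
        return Branch(parent=self.parent)


main = Branch()
main.write(0, b"users v1")
main.write(1, b"orders v1")

agent_a = main.branch()
agent_b = main.branch()
agent_a.write(0, b"users v2")   # visible to agent_a only
main.write(1, b"orders v2")     # later parent writes do not leak into branches

print(agent_a.read(0), agent_b.read(0), agent_b.read(1))
print(len(agent_a.own_blocks), len(agent_b.own_blocks))
```

In this sketch, `agent_a` holds one block of its own and `agent_b` holds none; every other read falls through to the layer they share with `main`.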

To keep full compatibility with Postgres, we implemented copy-on-write branches at the storage layer, not inside Postgres itself. Under the hood, we use a distributed block storage system exposed over NVMe over Fabrics, giving each branch a high-performance disk with near-local latency.

What’s next

Open-sourcing Xata is just the beginning.

We’re building a new foundation for how databases work in the agentic era, where thousands of isolated, ephemeral databases are the norm, not the exception.

If you’re building an AI platform or adopting agents inside your organization, we’d love for you to try it, contribute, and help shape what comes next.

For a deeper dive, read the technical blog post.
