Making Postgres scale to zero with CNPG
How we built activity-aware Postgres clusters that hibernate automatically and save resources
Esther Minano Sanz
You’ve probably come across the idea of scaling to zero. It’s been getting more popular in the cloud world, and the idea is simple: when a service or application isn’t being used, you scale it down to zero resources so you’re not paying for idle capacity.
For most workloads, this is straightforward. But with databases, it gets trickier. Databases are the backbone of your infrastructure, and shutting them off completely feels risky. That said, there are plenty of situations where it makes sense, and where the benefits outweigh the concerns.
In this post, we’ll walk you through why scaling a database to zero can actually be a good idea, and share how we made it work for CNPG-managed Postgres using our new open source CNPG-I Scale-to-Zero Plugin.
Background
To scale a database to zero, you need to separate the storage and compute layers. That way your data stays safe on persistent storage while compute resources are paused, and compute can restart quickly and pick up the data right where it left off.
At Xata, we use CNPG as our compute layer in Kubernetes, giving us managed Postgres clusters that can scale up or down and include replicas for reliability. While CNPG lets you hibernate a Postgres cluster, it doesn’t automatically scale it to zero when inactive; that requires manual intervention or a custom automation process.
Since hibernation is already available natively, our main challenge when we set out to implement scale to zero was monitoring Postgres activity across many managed clusters. We considered a few approaches:
1. Track activity in a separate database and trigger hibernation after inactivity.
2. Integrate monitoring into CNPG metrics and use a background task to hibernate inactive clusters.
3. Track activity within CNPG, letting each cluster manage its own hibernation and remain the single source of truth.
We chose option 3. It keeps everything clean, centralized and simple. During research, we discovered CNPG-I, a new CNPG plugin interface designed to make it easier to integrate custom behavior into clusters.
With the interface in hand, it was time to see if our plan could actually work. Off to the lab! 🧪
Implementing the plugin
The CNPG-I interface is still fairly new and marked as experimental, so there weren’t many examples to learn from or to get a hands-on understanding of what it could do. It provides a protocol that needs to be implemented, with a set of capabilities to choose from.
For our purposes, the lifecycle capability was all we needed since we weren’t dealing with backups or restores. This capability lets you intercept cluster lifecycle events and adjust them to fit your requirements.
To let each cluster manage its own activity tracking and hibernation, we designed the plugin to inject a sidecar when a cluster is created, using the `CREATE` capability (see the sketch after this list):
- The sidecar runs as a separate container within the cluster pod, simplifying RBAC permissions.
- It is injected into all cluster instances, but only the primary actively tracks activity and triggers hibernation.
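To make the injection step concrete, here is a schematic Go sketch of what adding the sidecar to an instance pod amounts to. It uses plain Kubernetes types rather than the CNPG-I lifecycle protocol the real plugin implements, and the container name, image, and resource requests are made up for illustration.

```go
package lifecycle

import (
	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/resource"
)

// injectSidecar appends the activity-tracking container to an instance pod
// spec if it is not already there. The real plugin expresses this change
// through the CNPG-I lifecycle (CREATE) hook instead of mutating pods directly.
func injectSidecar(pod *corev1.Pod) {
	for _, c := range pod.Spec.Containers {
		if c.Name == "scale-to-zero" { // hypothetical container name
			return
		}
	}
	pod.Spec.Containers = append(pod.Spec.Containers, corev1.Container{
		Name:  "scale-to-zero",
		Image: "ghcr.io/example/scale-to-zero-sidecar:latest", // hypothetical image
		Resources: corev1.ResourceRequirements{
			Requests: corev1.ResourceList{
				corev1.ResourceCPU:    resource.MustParse("50m"),
				corev1.ResourceMemory: resource.MustParse("16Mi"),
			},
		},
	})
}
```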
By implementing the `EVALUATE` capability, the plugin takes advantage of CNPG rolling updates. This ensures that if a switchover or restart occurs and the primary changes, the sidecar on the new primary automatically becomes active and resumes activity tracking. ✨
We initially tried to simplify things by running the sidecar only on the primary instance. This helped keep resource usage low, especially for clusters with multiple replicas, but it meant that switchovers weren’t supported. The spec of the current and the target primary had to match, and without the sidecar on all instances, a switchover sent the cluster into a crash loop. We found that using the `restart` update method could avoid this, but it results in downtime, which we couldn’t accept.
⚡ Resource usage
The sidecar is very lightweight. The active container uses less than 15 MiB of memory and 0.05 CPU, while the passive containers on replicas use even less.
What the sidecar does
The sidecar runs a background process that monitors cluster activity every minute.
Here’s how it works (a sketch in Go follows the list):
- It checks if scale to zero is enabled for the cluster.
- If it is, it checks current open connections, both idle and active.
- If no connections are open, it compares the last active timestamp against the configured inactivity period.
- Once the period has passed, it checks for ongoing backups and temporarily pauses scheduled backups.
- If no backups are running, it triggers cluster hibernation.
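Below is a minimal Go sketch of that loop, assuming the sidecar reaches Postgres over the local socket with the lib/pq driver. The helper functions for the Kubernetes-facing steps (checking the annotation, detecting backups, and triggering hibernation) are hypothetical stubs, not the plugin’s actual code.

```go
package main

import (
	"context"
	"database/sql"
	"log"
	"time"

	_ "github.com/lib/pq" // Postgres driver (an assumption; the real sidecar may use another)
)

const (
	checkInterval    = time.Minute
	inactivityPeriod = 30 * time.Minute // default when no annotation is set
)

func main() {
	// Connection details are illustrative; inside the pod the sidecar can
	// reach Postgres over the local unix socket.
	db, err := sql.Open("postgres", "host=/var/run/postgresql dbname=postgres sslmode=disable")
	if err != nil {
		log.Fatal(err)
	}
	lastActive := time.Now()

	for range time.Tick(checkInterval) {
		if !scaleToZeroEnabled() {
			continue
		}
		ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
		var open int
		// Count client connections other than our own, idle or active.
		err := db.QueryRowContext(ctx, `
			SELECT count(*) FROM pg_stat_activity
			WHERE backend_type = 'client backend' AND pid <> pg_backend_pid()`).Scan(&open)
		cancel()
		if err != nil {
			log.Printf("activity check failed: %v", err)
			continue
		}
		if open > 0 {
			lastActive = time.Now()
			continue
		}
		if time.Since(lastActive) < inactivityPeriod {
			continue
		}
		if backupsRunning() {
			continue // try again on the next tick
		}
		pauseScheduledBackups()
		hibernateCluster() // CNPG supports declarative hibernation via a cluster annotation
		return
	}
}

// Hypothetical stubs; the real sidecar reads cluster annotations and talks
// to the Kubernetes API for these steps.
func scaleToZeroEnabled() bool { return true }
func backupsRunning() bool     { return false }
func pauseScheduledBackups()   {}
func hibernateCluster()        {}
```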
💾 Why pause backups?
Backups require the cluster pod to be active. Pausing them prevents errors and avoids corrupted backup files, ensuring the latest backup is reliable for recovery. Ideally, CNPG would handle this automatically, but until that integration exists, the sidecar ensures smooth operation.
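For the pausing step specifically, CNPG’s ScheduledBackup resources can be suspended through their `spec.suspend` field. Here is a sketch of how a sidecar could do that with the Kubernetes dynamic client; the plugin’s real code may differ, and for brevity this version suspends every ScheduledBackup in the namespace rather than only the ones belonging to the cluster.

```go
package backups

import (
	"context"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/runtime/schema"
	"k8s.io/apimachinery/pkg/types"
	"k8s.io/client-go/dynamic"
)

var scheduledBackupGVR = schema.GroupVersionResource{
	Group:    "postgresql.cnpg.io",
	Version:  "v1",
	Resource: "scheduledbackups",
}

// suspendScheduledBackups sets spec.suspend=true on every ScheduledBackup in
// the namespace so no new backup starts while the cluster is hibernated.
func suspendScheduledBackups(ctx context.Context, client dynamic.Interface, namespace string) error {
	list, err := client.Resource(scheduledBackupGVR).Namespace(namespace).List(ctx, metav1.ListOptions{})
	if err != nil {
		return err
	}
	patch := []byte(`{"spec":{"suspend":true}}`)
	for _, item := range list.Items {
		if _, err := client.Resource(scheduledBackupGVR).Namespace(namespace).Patch(
			ctx, item.GetName(), types.MergePatchType, patch, metav1.PatchOptions{},
		); err != nil {
			return err
		}
	}
	return nil
}
```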
The plugin is configured via cluster annotations:
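For example, a cluster manifest could carry annotations along these lines. The annotation keys shown here are placeholders for illustration; check the plugin’s README for the exact names.

```yaml
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: example-cluster
  annotations:
    # Placeholder keys, for illustration only.
    example.xata.io/scale-to-zero: "enabled"
    example.xata.io/inactivity-period: "30m"
spec:
  instances: 3
  storage:
    size: 1Gi
```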
If the annotations are missing, scale to zero is disabled by default and the inactivity period defaults to 30 minutes.
What the sidecar doesn’t do
The sidecar takes care of automatic hibernation, but it can’t handle reactivation because the cluster is no longer running once it’s hibernated. Reactivating a hibernated cluster requires either manually updating its annotations to disable hibernation or setting up a custom process that triggers it when a new connection is made.
Although the plugin itself doesn’t cover reactivation, at Xata we provide automatic reactivation on connection requests, so you don’t need to build and maintain that process yourself.
Use cases
Scaling to zero can spark a lot of debate, with strong arguments both for and against it. There isn’t a single right answer; it ultimately depends on your specific needs. Some situations where scaling your database to zero can be particularly beneficial include:
Ephemeral branches 💻
Think of feature branches and ephemeral environments used during active development or AI-native workflows. They do not need to run all the time, but they should be ready as soon as you return to your work. With scale to zero, you can hibernate these branches while they are idle and bring them back instantly when you start coding, testing prompts, or experimenting with AI models.
Staging environments 🏗️
Staging is supposed to mirror production, but most of the time it just sits there unused. With scale to zero, staging can sleep when you don’t need it and wake up for deployment tests. That way you get production-like data and behavior without the constant cost.
Testing environments 🧪
Automated test databases are another perfect match. Instead of running all the time, they can pause between test runs and wake up only for your CI/CD pipelines. You get the benefit of comprehensive testing without paying for idle clusters.
Conclusion
For us at Xata, scale to zero felt like a natural fit. The use cases we walked through are exactly where our users (and our own teams) benefit the most. That’s why we built it as an opt-in feature: it helps cut costs, adds flexibility to the development pipeline and makes it easier to spin up production-like environments without keeping everything running 24/7.
If you’d like to give it a try, check out the CNPG-I scale to zero plugin or request access to our private beta. And if you’re already experimenting with scale to zero, we’d love to hear how it’s working for you. 🚀
Come chat with us on Discord or follow along on Twitter/X and Bluesky! 💜