pgroll v0.13.0: Start using pgroll on existing databases with the new baseline feature

The new baseline feature in pgroll v0.13 makes it easy to start using pgroll on databases that already have a schema, and to keep your schema history clean.

Author

Andrew Farries

We recently released v0.13.0 of pgroll, our open-source schema migration tool for Postgres. As we've done for previous releases, we'll take a look at some of the new features that made it into the release and give some examples of their use. Past entries in the release blog series can be read here:

As always, the release is available on the GitHub v0.13.0 release page.

What is pgroll?

pgroll is a schema migration tool for Postgres. Designed for application developers working on applications that require frequent schema changes but also need to maintain zero downtime around those schema changes, pgroll takes a different approach compared to most other migration tools on the market. There are two aspects that characterize pgroll's approach to migrations:

  • Multi-version migrations: Making a schema change with pgroll results in two versions of the schema: the one before the change and the one after it. Applications can select which version of the schema they work with, so applications that require the new schema can be rolled out side by side with older applications that may be incompatible with it.
  • Lock-safe migrations: Migrations using pgroll are expressed declaratively, rather than using SQL directly. This allows pgroll to implement the steps required to perform the schema change in a safe manner, ensuring that any locks required on the affected objects are held for the shortest possible time.

Full documentation for pgroll is in the documentation section of this site.

The main new feature in the pgroll v0.13 release is the pgroll baseline command; we'll take a look at how it works below and then describe the other changes that are part of this release.

Onboarding using baselines

A common question we've received from the community has been: "How do I start using pgroll with my existing database?" Until now, adopting pgroll for established projects with complex schemas required manually recreating your database structure through a series of pgroll migrations. With the most recent version of pgroll we're excited to introduce the new baseline command that makes this process much smoother.

Getting started with baseline

The pgroll baseline command captures your current database schema and uses it as a foundation for future pgroll migrations.

Suppose we have a database with an existing schema, and we want to start using pgroll for our migrations. The first step is to create a baseline migration:
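A minimal sketch of that command, with its two arguments taken from the description that follows. It assumes pgroll is installed and that database connection settings are configured separately; the guard simply skips the call on machines where pgroll is not on the PATH.

```shell
# Assumption: connection settings for pgroll are configured elsewhere (not shown).
MIGRATION_NAME="0001_initial_schema"   # name of the baseline migration
MIGRATIONS_DIR="./migrations"          # directory that holds migration files

if command -v pgroll >/dev/null 2>&1; then
  # Record the current schema as the baseline
  pgroll baseline "$MIGRATION_NAME" "$MIGRATIONS_DIR"
else
  echo "pgroll not found on PATH; skipping baseline" >&2
fi
```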

Here, 0001_initial_schema is the name of the baseline migration, and ./migrations is the directory where your migration files will be stored.

When you run this command, pgroll will:

  1. Take a snapshot of your current database schema
  2. Create a placeholder migration file under ./migrations/0001_initial_schema.yml
  3. Record this as your migration starting point in pgroll's internal state

Let's walk through an example.

Imagine you have an established application with dozens of tables that you've been managing with traditional migrations. To adopt pgroll:
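The adoption steps might look roughly like the sketch below. Whether an explicit pgroll init is needed before the baseline depends on your setup, so treat that step as an assumption; connection settings are again assumed to be configured elsewhere.

```shell
# Sketch only: assumes pgroll is installed and can reach the existing database.
MIGRATIONS_DIR="./migrations"
BASELINE_NAME="0001_initial_schema"

if command -v pgroll >/dev/null 2>&1; then
  # 1. Create pgroll's internal state in the database (assumed to be required first)
  pgroll init
  # 2. Record the existing schema as the baseline and write the placeholder file
  pgroll baseline "$BASELINE_NAME" "$MIGRATIONS_DIR"
else
  echo "pgroll not found on PATH; skipping adoption steps" >&2
fi

# 3. New migration files are then added to $MIGRATIONS_DIR and applied as usual,
#    for example with: pgroll migrate "$MIGRATIONS_DIR"
```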

From this point forward, all new schema changes can be managed through pgroll's zero-downtime migrations. In particular, the baseline migration in migrations/0001_initial_schema.yml will not be re-applied to the current database; only subsequent migrations will be. When bootstrapping a new database with pgroll migrate, all migrations, including the baseline, will be applied.

Re-baselining: compressing migration history

As your project evolves over time, you may find yourself with dozens or even hundreds of migrations. The baseline command can also be used to compress an existing migration history into a new baseline, giving your schema a new starting point and allowing old migration files to be removed from source control.

Consider a scenario where your team has been using pgroll for a year, accumulating hundreds of migrations. To consolidate these into a clean baseline:
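A sketch of what that could look like, reusing the same baseline command. The migration name 0101_rebaseline is purely hypothetical; pick whatever fits your naming scheme.

```shell
# Sketch only: assumes pgroll is installed and connection settings are configured.
NEW_BASELINE="0101_rebaseline"   # hypothetical name for the new baseline
MIGRATIONS_DIR="./migrations"

if command -v pgroll >/dev/null 2>&1; then
  # Capture the schema as it stands today as the new starting point
  pgroll baseline "$NEW_BASELINE" "$MIGRATIONS_DIR"
else
  echo "pgroll not found on PATH; skipping re-baseline" >&2
fi

# Migration files that precede the new baseline can then be deleted
# from source control by hand (deliberately not automated here).
```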

This re-baselining process creates a "reset point" in your migration history. All migrations before the new baseline are effectively captured within it, while migrations created after it remain individually trackable and applicable. Older migrations can be removed from source control.

The key benefits of re-baselining include:

  • Simplified history: Reduce dozens of historical migrations to a single baseline
  • Faster bootstrapping: New environments apply one migration instead of many
  • Cleaner development workflow: Start from a known-good state rather than replaying complex historical changes
  • Improved performance: Reduce the time and resources needed to apply migrations


Integration with existing commands

Other pgroll commands are now baseline-aware. For instance:

  • pgroll pull will now only pull migrations applied after the most recent baseline.
  • pgroll migrate will not consider migrations that precede the most recent baseline.
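To make the idea concrete, here is a toy model of that filtering in plain shell. It is not pgroll's implementation, and the migration names are hypothetical; it only shows that everything before the most recent baseline drops out of consideration.

```shell
# Toy model, not pgroll's implementation: list the migrations that come
# after the most recent baseline. All names below are hypothetical.
migrations="0001_initial 0002_add_users 0003_rebaseline 0004_add_reviews 0005_add_index"
baseline="0003_rebaseline"

after_baseline=""
seen_baseline=0
for m in $migrations; do
  # Collect only entries that appear once the baseline has been passed
  if [ "$seen_baseline" -eq 1 ]; then
    after_baseline="$after_baseline $m"
  fi
  if [ "$m" = "$baseline" ]; then
    seen_baseline=1
  fi
done
after_baseline="${after_baseline# }"   # trim the leading space

echo "$after_baseline"
```

Whether the baseline itself is applied depends on the target database: it is skipped on the database it was taken from and applied when bootstrapping a new one.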

We've designed the baseline feature based on direct feedback from our community and we'd love to hear your views on how well it performs for your workflows.

Breaking change to migration naming

The latest version of pgroll makes one breaking change to the migration file format: the name field in a migration file is no longer supported. Migration files that specify a name field should be edited to remove the name:
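As an illustration, the snippet below writes a hypothetical migration file that still carries the old top-level name field into a scratch directory, then strips that line. The file name and table are made up for the example; only the unindented name line is removed, since indented name keys belong to the operations themselves.

```shell
# Work in a scratch directory so no real migration files are touched
workdir="$(mktemp -d)"
migration_file="$workdir/0002_create_reviews.yml"

# Hypothetical migration that still has the old top-level name field
cat > "$migration_file" <<'EOF'
name: 0002_create_reviews
operations:
  - create_table:
      name: reviews
      columns:
        - name: id
          type: serial
          pk: true
EOF

# Delete only the unindented name line; a .bak copy of the original is kept
sed -i.bak '/^name:/d' "$migration_file"

# Show the edited file
cat "$migration_file"
```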

The migration file name is now the source of truth for migration names.

As an alternative to editing migration files to remove the name, the pgroll pull command can be used to serialize all migrations in a schema history into the new format without the migration name field:
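A sketch of that approach, assuming pgroll is installed and connected to a database that holds the schema history. The target directory name is an assumption; pulling into a fresh directory sidesteps any question of how existing files are handled.

```shell
# Sketch only: assumes pgroll is installed and connection settings are configured.
PULL_DIR="./migrations_new"   # hypothetical fresh directory for the re-serialized files

if command -v pgroll >/dev/null 2>&1; then
  # Write the migrations from the schema history into PULL_DIR
  pgroll pull "$PULL_DIR"
else
  echo "pgroll not found on PATH; skipping pull" >&2
fi
```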

Contributing

pgroll is open source software; the code and discussions are open to everyone on our GitHub repository.

Features are driven in large part by input from the community so please get involved via issues, discussions and pull requests!
