pgroll 0.7.0 update

Learn about the latest changes in pgroll in the 0.7.0 release as we continue developing it into a leading open-source schema migration tool for Postgres.

Written by

Andrew Farries

Published on

October 23, 2024

pgroll is Xata's open-source schema migration tool for Postgres, built to enable zero downtime, reversible schema migrations using the expand/contract pattern. pgroll is open-source under the Apache-2.0 license and developed in the open on GitHub.

Since the last pgroll 0.6.0 release in May 2024, we've introduced key updates aimed at solving common database migration challenges. Support for Postgres 17 ensures compatibility with the latest version, while improvements to backfill performance—now up to 80% faster—means less time waiting for migrations to start. Configurable batch sizes and delays offer developers more control over the migration process, optimizing performance and minimizing impact on their Postgres clusters. These enhancements make migrations faster, safer, and more efficient, especially for complex or large datasets.

In this post we'll have a look at the changes that went into the most recent release, v0.7.0.

#

What's new in pgroll v0.7.0

The previous version of pgroll, v0.6.0, was released in May 2024. It's seen steady development since May with 31 pull requests merged in that time. Happily, this release also includes work from six first time contributors! We're building pgroll in the open and we're always happy to see people getting involved on the issue tracker and through pull requests.

The full changelog for v0.7.0 can be found here:

The full changelog since May
The full changelog since May

In this post we'll focus on some of the highlights from the changelog.

Postgres 17 was released in September 2024. pgroll now supports Postgres 17, alongside our existing support for Postgres versions 14, 15, and 16.

pgroll v0.7.0 performs much better than previous versions when backfilling large tables. Backfills are central to the expand/contract pattern that pgroll uses to perform schema migrations. When a column is modified, for example to add a new constraint, pgroll creates a new column with the new constraint and backfills data from the old column to the new one on migration start.

Backfills are, unavoidably, long-running operations for tables with many records but pgroll v0.7.0 includes performance improvements that can reduce the duration of a backfill by as much as 80%.

The performance improvement was achieved by reducing the work done by the 'up' trigger that constitutes the 'inner loop' during a backfill. #389 has all the details if you're curious.

Another important backfill-related change that went into v0.7.0 is the ability to configure the batch size for backfills. In previous versions of pgroll backfills were batched, processing each batch of 1000 records in a separate transaction in order to minimize the duration for which locks were held on the rows in the table.

In pgroll v0.7.0 the same batched behaviour still applies but the batch size can be tuned through the --backfill-batch-size command line flag or the PGROLL_BACKFILL_BATCH_SIZE environment variable. For tables where row locks during migration start are less of a concern, for example append-only tables with few row UPDATEs, increasing the batch size can improve performance of the backfill by requiring fewer transactions and fewer round trips to the database.

Again with the backfills, pgroll v0.7.0 adds the ability to configure a delay between backfill batches. Previous versions of pgroll would always attempt to backfill batches as fast as possible, with no delay between each batch. pgroll v0.7.0 adds the ability to introduce a delay between batches by using the --backfill-batch-delay command line flag or the PGROLL_BACKFILL_BATCH_DELAY environment variable.

Controlling the rate of the backfill can be useful in scenarios where the database is under heavy load and you want to reduce the impact of the backfill so as not to disrupt other database clients. Downstream replicas can also become overloaded by a backfill and slowing the rate at which backfills are performed prevents replicas falling too far behind the primary.

As of v0.7.0, pgroll supports partial indexes as part of the create_index operation by specifying a predicate field:

{
  "name": "37_create_partial_index",
  "operations": [
    {
      "create_index": {
        "name": "idx_fruits_id_gt_10",
        "table": "fruits",
        "columns": ["id"],
        "predicate": "id > 10"
      }
    }
  ]
}

Partial indexes allow specifying a predicate as part of the operation to restrict the set of records included in the index. This is helpful to reduce the size of an index, given that the query planner will likely not use the index to find common values anyway.

pgroll is in active development as we work towards a v1.0 release of the project. The features added in v0.7.0 represent another step in this direction, improving performance, fixing bugs and adding new features.

October is Hacktoberfest and we've tagged some issues with the Hacktoberfest label. We've had a couple of contributions from new contributors via Hacktoberfest already this month and we're always open to more.

We'd love to hear from you if you're using pgroll in your projects. We welcome external contributions to the project, either in the form of issues on the issue tracker or by getting involved with development directly by opening pull requests.

Start free,
pay as you grow

Xata provides the best free plan in the industry. It is production ready by default and doesn't pause or cool-down. Take your time to build your business and upgrade when you're ready to scale.

Free plan includes
  • Single team member
  • 10 database branches
  • High availability
  • 15 GB data storage
Start freeExplore all plans
Free plan includes
  • Single team member
  • 10 database branches
  • High availability
  • 15 GB data storage

Sign up to our newsletter

By subscribing, you agree with Xata’s Terms of Service and Privacy Policy.

Copyright © 2024 Xatabase Inc.
All rights reserved.

Product

Feature requestsPricingStatusAI solutions