pgroll v0.10 includes one breaking change, a new subcommand to convert SQL statements into pgroll operations, and improvements to how columns with default values are added to existing tables.
Written by Andrew Farries, Noémi Ványi
Published on February 28, 2025
We've just released v0.10 of pgroll, our open-source schema migration tool for Postgres. As we've done for previous releases, we'll take a look at some of the headline features that made it into the release and look ahead to what's coming next. Past entries in the release blog series can be read here.
For the complete notes for this release, see the GitHub v0.10.0 release page.
pgroll is a schema migration tool for Postgres. Designed for application developers working on applications that require frequent schema changes but also need to maintain zero downtime around those schema changes, pgroll takes a different approach compared to most other migration tools on the market. There are two aspects that characterize pgroll's approach to migrations:
- Every pgroll migration results in two versions of the schema: the one before the change and the one after the change. This allows applications to select which version of the schema they want to work with, and allows side-by-side rollout of applications that require the new schema changes with old applications that may be incompatible with it (see the example below).
- pgroll migrations are expressed declaratively, rather than using SQL directly. This allows pgroll to implement the steps required to perform the schema change in a safe manner, ensuring that any locks required on the affected objects are held for the shortest possible time.
Full documentation for pgroll is in the documentation section of this site.
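To illustrate the first point: while a migration is in progress, an application can opt into either schema version by setting its Postgres search_path to the corresponding versioned schema that pgroll creates. The schema name below is illustrative; the actual name is derived from your schema and migration names.
-- Illustrative only: select the post-migration version of the schema
SET search_path TO public_02_add_column;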
pgroll v0.10 includes a new subcommand, several improvements, and one breaking change. Let's take a look at what went into the v0.10 release in more detail:
One of the first community requests was to use pgroll with Object Relational Mappers (ORMs). We looked at several ways of integrating pgroll into existing ORM workflows. As most ORMs can generate SQL migration scripts, we decided to translate SQL statements into pgroll migrations. The generated pgroll migration can be saved to a file and run using the start command as usual.
We added a new subcommand called convert. Now pgroll can translate SQL migration scripts generated by ORMs into pgroll migrations. It can read SQL statements from stdin or from a specified file. It has one flag, --name, to configure the name of the migration. If the flag is unset, the name is set to the current timestamp.
$ pgroll convert --help
Convert SQL statements to a pgroll migration. The command can read SQL statements from stdin or a file
Usage:
pgroll convert <path to file with migrations> [flags]
Flags:
-h, --help help for convert
-n, --name string Name of the migration (default "{current_timestamp}")
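For example, assuming the ORM wrote its generated statements to a file called migrations.sql (a placeholder name, as is the migration name below), you can either pass the file path directly or pipe the statements in via stdin:
# migrations.sql and 01_initial are placeholder names
pgroll convert migrations.sql --name 01_initial
cat migrations.sql | pgroll convert --name 01_initial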
You can use pgroll with any ORM that can generate raw SQL statements for its migrations. Let's look at a few examples.
Alembic is a database migration tool used in SQLAlchemy projects. You can generate SQL statements from its migrations in its offline mode. If you add the --sql flag to the upgrade command of alembic, it prints the SQL statements to stdout. You can pipe this output into pgroll convert:
$ alembic upgrade {revision} --sql | pgroll convert --name {revision}
{
"name": "{revision}",
"operations": [
{
"create_table": {
"name": "employees",
"colunms:" [
{
"name": "name",
"type": "varchar(100)"
},
{
"name": "joined",
"type": "timestamp with time zone"
},
{
"name": "email",
"type": "varchar(254)"
}
]
}
}
]
}
Django is the go-to tool for Python web development. It can connect to several databases, including PostgreSQL. If your web application uses PostgreSQL storage, you can leverage the convert subcommand.
After you've defined your migration in Python, you can use manage.py to extract the SQL script from Django. The subcommand sqlmigrate prints the SQL statements to stdout.
Create a new Employee model with the following simplified definition:
from django.db import models

class Employee(models.Model):
name = models.CharField(max_length=100)
joined = models.DateTimeField()
email = models.EmailField()
Then run sqlmigrate to generate the SQL statements and pipe the output into pgroll convert:
manage.py sqlmigrate my_app 0000 | pgroll convert --name 0000_init
{
"name": "0000_init",
"operations": [
{
"create_table": {
"name": "employees",
"colunms:" [
{
"name": "name",
"type": "varchar(100)"
},
{
"name": "joined",
"type": "timestamp with time zone"
},
{
"name": "email",
"type": "varchar(254)"
}
]
}
}
]
}
Drizzle is a popular ORM for TypeScript projects. You can extract SQL statements using its generate command.
The following example schema in Drizzle can be translated using pgroll:
import { pgTable, timestamp, varchar } from 'drizzle-orm/pg-core';
export const employees = pgTable('employees', {
name: varchar({ length: 100 }),
joined: timestamp({ withTimezone: true }),
email: varchar({ length: 254 })
});
Run pgroll convert to get the appropriate pgroll migrations.
drizzle-kit generate --dialect postgresql --schema=./src/schema.ts --name=init
pgroll convert 0000_init.sql --name 0000_init
{
"name": "0000_init",
"operations": [
"create_table": {
"name": "employees",
"colunms:" [
{
"name": "name",
"type": "varchar(100)"
},
{
"name": "joined",
"type": "timestamp with time zone"
},
{
"name": "email",
"type": "varchar(254)"
}
]
}
}
]
}
We are releasing this feature so we can get your feedback early. The functionality still has some limitations. The generated pgroll migrations must be edited to provide up and down data migrations manually.
Furthermore, the SQL statements are not aggregated into single pgroll operations. Some ORMs add unique constraints in a separate statement when they create a table with a unique column. This leads to more pgroll operations than necessary. You can resolve this manually by removing the unique constraint operation from the pgroll migration and adding it to the list of constraints of create_table.
{
"create_table": {
"name": "employees",
"columns": [
{
"name": "email",
"varchar(254)"
}
]
}
},
{
"create_constraint": {
"name": "my_unique_email",
"type": "unique",
"columns": ["email"]
"up": {
"email": "TODO"
}
}
}
can be simplified to
{
"create_table": {
"name": "employees",
"columns": [
{
"name": "email",
"varchar(254)"
}
],
"constraints": [
{
"name": "my_unique_email",
"type": "unique",
"columns": ["email"]
}
]
}
}
The same applies in some cases when the ORM does the backfilling for new columns itself. For example, when you add a new column with a default value, an ORM produces these SQL statements:
ALTER TABLE "adults" ADD COLUMN "age" smallint DEFAULT 18 NULL CHECK ("age" >= 18);
ALTER TABLE "adults" ALTER COLUMN "level" DROP DEFAULT;
UPDATE adults SET age = 18;
ALTER TABLE adults ALTER COLUMN age SET NOT NULL;
UPDATE "adults" SET "age" = 18 WHERE "age" IS NULL;
SET CONSTRAINTS ALL IMMEDIATE;
ALTER TABLE "adults" ALTER COLUMN "age" SET NOT NULL;
Running the convert subcommand on these statements results in the following pgroll migration:
{
"add_column": {
"column": {
"check": {
"constraint": "age >= 18",
"name": "age_check"
},
"default": "18",
"name": "age",
"nullable": true,
"type": "smallint"
},
"table": "adults",
"up": "TODO: Implement SQL data migration"
}
},
{
"alter_column": {
"column": "age",
"default": null,
"table": "adults",
"down": "TODO: Implement SQL data migration",
"up": "TODO: Implement SQL data migration"
}
},
{
"sql": {
"up": "update adults set age = 18"
}
},
{
"alter_column": {
"column": "age",
"nullable": false,
"table": "adults",
"up": "TODO: Implement SQL data migration"
"down": "TODO: Implement SQL data migration",
}
},
{
"sql": {
"up": "UPDATE \"adults\" SET \"age\" = 18 WHERE \"age\" IS NULL"
}
},
{
"sql": {
"up": "SET CONSTRAINTS ALL IMMEDIATE"
}
},
{
"alter_column": {
"column": "age",
"nullable": false,
"table": "adults",
"up": "TODO: Implement SQL data migration"
"down": "TODO: Implement SQL data migration",
}
}
This can be written as a single pgroll operation:
{
"add_column": {
"column": {
"check": {
"constraint": "age \u003e= 0",
"name": "age_check"
},
"default": "18",
"name": "age",
"nullable": true,
"type": "smallint"
},
"table": "adults",
"up": "18"
}
}
Also, SQL migration scripts usually start with BEGIN and end with COMMIT because the ORM runs the DDL statements in a single transaction. These statements show up in the list of operations in the generated migration and are safe to delete from the list.
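As a rough sketch (the exact output depends on the ORM), these wrapper statements end up as plain sql operations like the following, which can simply be removed from the generated migration:
{
"sql": {
"up": "BEGIN"
}
},
{
"sql": {
"up": "COMMIT"
}
}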
pgroll v0.10 includes one breaking change to how columns are configured when creating new indexes. Previously, the columns list of create_index operations expected a list of strings:
{
"name": "10_create_index",
"operations": [
{
"create_index": {
"table": "employees",
"columns": ["salary"]
}
}
]
}
Now you need to specify columns as map keys:
{
"name": "10_create_index",
"operations": [
{
"create_index": {
"table": "employees",
"columns": {
"salary": {}
}
}
}
]
}
pgroll is now using maps so create_index can support several new options to configure the index elements per column.
Using the sort attribute, you can change the sort order from ascending, which is the default, to descending.
{
"name": "10_create_index",
"operations": [
{
"create_index": {
"table": "employees",
"columns": [
"salary": {
"sort": "desc"
}
]
}
}
]
}
Another option to change the sort ordering of the index is nulls. You can specify whether null values come before (first) or after (last) non-null values.
{
"name": "10_create_index",
"operations": [
{
"create_index": {
"table": "employees",
"columns": [
"position": {
"nulls": "last"
}
]
}
}
]
}
The collate attribute changes the collation used for the index.
{
"name": "10_create_index",
"operations": [
{
"create_index": {
"table": "employees",
"columns": [
"bio": {
"collate": "fr_FR"
}
]
}
}
]
}
You can also change the operator class for the indexed column, and pass configuration parameters for the operator class.
{
"name": "10_create_index",
"operations": [
{
"create_index": {
"table": "employees",
"columns": [
"bio": {
"nulls": "first"
"opclass": {
"name": "opc_name",
"params": {
"param1": "val1"
}
}
}
]
}
}
]
}
pgroll v0.10 makes improvements to how columns with DEFAULT values are added to existing tables, to avoid long-lived ACCESS_EXCLUSIVE locks when adding columns with volatile DEFAULT values.
In Postgres, adding a new column with a DEFAULT value is an operation that comes with a pitfall: Postgres has different behaviours depending on the volatility of the DEFAULT. Consider this ALTER TABLE statement:
ALTER TABLE items ADD COLUMN value INTEGER DEFAULT 100;
The DEFAULT expression here is non-volatile, meaning that Postgres can apply an optimization when adding the column: rather than rewriting the table to populate every row with the new DEFAULT, Postgres applies the DEFAULT lazily. The entry in the system catalog pg_attribute for the new column is updated to record that there is a default value for the column that should be populated on the fly whenever the column is read.
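You can see this for yourself in the catalog. Assuming the items table from the statement above, a query like the following (illustrative only, not something pgroll runs) shows the atthasmissing and attmissingval entries that Postgres uses to supply the DEFAULT lazily:
-- Illustrative: inspect the lazily-applied default for the new column
SELECT attname, atthasmissing, attmissingval
FROM pg_attribute
WHERE attrelid = 'items'::regclass AND attname = 'value';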
By contrast, consider this ALTER TABLE operation:
ALTER TABLE items ADD COLUMN value INTEGER DEFAULT (random()*100)::integer;
This time, the DEFAULT expression is volatile, meaning that Postgres cannot apply the same optimization as before. Instead, Postgres must rewrite the table to populate the new field with its DEFAULT in every row. The table rewrite requires an ACCESS_EXCLUSIVE lock for the duration of the rewrite.
Previously, pgroll did not differentiate between these two cases when adding a new column with a DEFAULT value, meaning that adding a new column with a volatile DEFAULT would take an ACCESS_EXCLUSIVE lock, locking out reads and writes from the affected table for the duration of the table rewrite.
Starting with pgroll v0.10, pgroll now detects whether the DEFAULT value for a new column is volatile or non-volatile, and applies the DEFAULT differently in each case:
- For non-volatile DEFAULT values, pgroll applies the DEFAULT to the new column directly as before. The operation is a catalog-only update and requires only a very brief ACCESS_EXCLUSIVE lock.
- For volatile DEFAULT values, pgroll creates the column without a DEFAULT, avoiding the table rewrite. The new column is then backfilled with the DEFAULT value using pgroll's backfill mechanism. The DEFAULT is added to the column on migration completion.
A migration to add a new column with a volatile DEFAULT is written:
{
"name": "29_add_column_with_default",
"operations": [
{
"add_column": {
"table": "items",
"up": "(random()*100)::integer",
"column": {
"name": "value",
"type": "integer",
"default": "(random()*100)::integer"
}
}
}
]
}
The up expression used as the backfill must be the same as the default expression for the new column. pgroll will backfill the new column with the new DEFAULT value, and then apply the DEFAULT to the column on completion of the migration.
This means that adding columns with volatile DEFAULT values no longer requires a potentially long-lived ACCESS_EXCLUSIVE lock, making such operations safe to perform on tables with large row counts.
pgroll is open-source; check out the repository to get involved.
Many of the features that have gone into this and previous releases have been directly influenced by how users tell us they are using the tool and where it's currently lacking. We welcome any feedback in issues, or contributions via pull requests!