On the performance impact of REPLICA IDENTITY FULL in Postgres

Analyzing the performance impact of REPLICA IDENTITY FULL by looking at the PostgreSQL source code and performing benchmarks.

Written by Tudor Golubenco

Published on June 7, 2023

At Xata, we've come to rely quite a bit on Postgres logical replication events. Why? Because they provide us with a dependable stream of events about every single change that occurs within the database. We use them for syncing data to Elasticsearch, for webhooks, for deleting attachments after the parent row is deleted, and more. One of the configuration parameters for logical replication is the so-called “replica identity”, which determines how much information is written to the WAL to identify the rows that are updated or deleted.

You’d typically use the primary key or a unique index as the “replica identity” and avoid setting it to “full”. The reason is that the Postgres docs describe replica identity “full” as “very inefficient” and recommend it only as a fallback if no other option is possible.

That sounds quite discouraging. However, there are some good reasons why you might still want to enable it (see the next section), so it’s worth digging a bit deeper into what exactly happens when you enable it and how it affects performance. That is what this blog post is about.


Okay, but why enable REPLICA IDENTITY FULL?

In our case, there were two things that convinced us to consider REPLICA IDENTITY FULL:

  • Before and after values: On UPDATEs and DELETEs, Postgres only includes the old values of the “identity columns”. So if you want the values after the change as well as the values before the change, setting the replica identity to “full” makes Postgres include the old values of all columns as well.
  • TOAST values: UPDATE WAL events don’t include TOASTed columns, unless the column was changed by the current operation. In practical terms, this means that large column values (anything over the ~2 kB TOAST threshold may be stored out of line, and anything over the 8 kB page size certainly is) are not included in the update events. By setting the replica identity to FULL, we make sure they are always included in the WAL events.

To illustrate the above, let’s look at an example using wal2json for convenience. An update event looks something like this:

{
  "kind": "update",
  "schema": "xata",
  "table": "foo",
  "columnnames": ["id", "name", "age"],
  "columntypes": ["text", "text", "integer"],
  "columnvalues": ["michael", "Michael Scott", 42],
  "oldkeys": {
    "keynames": ["id"],
    "keytypes": ["text"],
    "keyvalues": ["michael"]
  }
}

The above was generated after executing a command like the following:

UPDATE foo SET age=age+1 WHERE id='michael';
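For reference, here is a hypothetical schema that matches the events in this post; the actual table at Xata may differ:

CREATE SCHEMA IF NOT EXISTS xata;

CREATE TABLE xata.foo (
  id          text PRIMARY KEY,
  name        text,
  age         integer,
  description text  -- large values in this column end up TOASTed
);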

The table has a primary key (the id column) and that is used as the replica identity.

Things to note:

  • Under columnvalues, the new values for the id, name, and age columns are included.
  • Under oldkeys only the id column is included, because that’s what the REPLICA IDENTITY is set to.

The table, however, has another column named description, which is not included in the event at all. Why? Because its value is large and it is TOASTed. If the UPDATE command had changed description, it would have been included in columnvalues, but because it was not touched, it is skipped.
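If you’d like to reproduce these events yourself, you don’t need a full publication/subscription setup. One way (a minimal sketch; the slot name is arbitrary and wal_level must be set to logical) is to create a logical replication slot that uses the wal2json output plugin and peek at its changes:

-- Create a logical replication slot decoding changes with wal2json.
SELECT pg_create_logical_replication_slot('wal2json_demo', 'wal2json');

-- Run some writes, for example:
UPDATE xata.foo SET age = age + 1 WHERE id = 'michael';

-- Peek at the decoded changes without consuming them.
SELECT data FROM pg_logical_slot_peek_changes('wal2json_demo', NULL, NULL);

-- Drop the slot when done, so it doesn't hold back WAL.
SELECT pg_drop_replication_slot('wal2json_demo');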

Now let’s compare it with the same event generated with REPLICA IDENTITY FULL:

ALTER TABLE foo REPLICA IDENTITY FULL;
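To verify which replica identity a table is currently using, you can check the relreplident column in the pg_class catalog (a quick sketch):

-- 'd' = default (primary key), 'n' = nothing, 'f' = full, 'i' = index
SELECT relname, relreplident
FROM pg_class
WHERE relname = 'foo';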

The event for the same UPDATE command above looks like this:

{
  "kind": "update",
  "schema": "xata",
  "table": "foo",
  "columnnames": ["id", "name", "age"],
  "columntypes": ["text", "text", "integer"],
  "columnvalues": ["michael", "Michael Scott", 43],
  "oldkeys": {
    "keynames": ["id", "name", "age", "description"],
    "keytypes": ["text", "text", "integer", "text"],
    "keyvalues": ["michael", "Michael Scott", 42, "<redacted very large description>"]
  }
}

Things to note:

  • The columnvalues looks the same. It contains all the columns and their new values, but not the TOASTed column/value pairs.
  • The oldkeys now contains all the columns with their values before the update, including the TOAST value for the description column.

It should be said that even with the replica identity set to the primary key or another unique index, you get enough data out of the replication stream to keep a Postgres replica fully in sync. This means that you can overcome the limitations above by keeping state outside of Postgres. However, the amount of work that you need to do outside of Postgres might become significant, both in terms of resources needed and added code (compared to a single ALTER TABLE statement).
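For completeness, pointing the replica identity at a specific unique index (rather than the primary key or “full”) looks like this. This is only a sketch: the index name is made up, and since id is already the primary key the extra index is redundant and purely illustrative; a replica identity index must be unique, non-partial, non-deferrable, and cover only NOT NULL columns:

CREATE UNIQUE INDEX foo_id_idx ON xata.foo (id);
ALTER TABLE xata.foo REPLICA IDENTITY USING INDEX foo_id_idx;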

Impact on the replica and changes in Postgres 16

The “very inefficient” wording mentioned at the beginning of this blog post comes from the (currently stable) Postgres 15 docs, but it was changed in the Postgres 16 docs, which are currently under development.

The updated wording hints that the “very inefficient” part refers to the subscriber, which needs to identify the rows that were updated or deleted, and that this is improved in version 16 so that indexed columns can be used even if they are not explicitly set as the replica identity.

What is not clear from the docs is that, in previous versions of Postgres as well, if the table has a primary key, it is used by the subscriber to find the affected row. You can see the code here: it looks for a replica identity index and, if one is not found, falls back to the primary key.

/*
 * Get replica identity index or if it is not defined a primary key.
 *
 * If neither is defined, returns InvalidOid
 */
static Oid
GetRelationIdentityOrPK(Relation rel)
{
	Oid			idxoid;

	idxoid = RelationGetReplicaIndex(rel);

	if (!OidIsValid(idxoid))
		idxoid = RelationGetPrimaryKeyIndex(rel);

	return idxoid;
}

This is good news because it means that the “very inefficient” case is only relevant to tables that have no primary key.
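If you want to check whether this applies to your database, a quick way to list ordinary tables without a primary key is to query the catalogs (a sketch; adjust the schema filter as needed):

SELECT n.nspname, c.relname
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE c.relkind = 'r'
  AND n.nspname NOT IN ('pg_catalog', 'information_schema')
  AND NOT EXISTS (
    SELECT 1 FROM pg_index i
    WHERE i.indrelid = c.oid AND i.indisprimary
  );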

With Postgres 16, the “very inefficient” case is reduced even further, because now Postgres can use other indexes beyond the PK, even if they are not explicitly set in the REPLICA IDENTITY. The patch for this change can be found here. After the patch, the GetRelationIdentityOrPK function stays the same, but the caller routine makes another effort to find a suitable index to use:

/*
 * Simple case, we already have a primary key or a replica identity index.
 */
idxoid = GetRelationIdentityOrPK(localrel);
if (OidIsValid(idxoid))
	return idxoid;

if (remoterel->replident == REPLICA_IDENTITY_FULL)
{
	/*
	 * We are looking for one more opportunity for using an index. If
	 * there are any indexes defined on the local relation, try to pick a
	 * suitable index.
	 *
	 * The index selection safely assumes that all the columns are going
	 * to be available for the index scan given that remote relation has
	 * replica identity full.
	 *
	 * Note that we are not using the planner to find the cheapest method
	 * to scan the relation as that would require us to either use lower
	 * level planner functions which would be a maintenance burden in the
	 * long run or use the full-fledged planner which could cause
	 * overhead.
	 */
	return FindUsableIndexForReplicaIdentityFull(localrel, attrMap);
}

While the worst potential impact of REPLICA IDENTITY FULL is on the replica, as we’ve seen above, it also impacts the primary, because more data needs to be stored in the WAL and more data is sent over the network to the subscribers. This leads to increased I/O bandwidth consumption, higher CPU usage, and more RAM usage.

How much more? It depends on the type of write traffic you have, how many large values you store, and how frequently those values are updated.
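One rough way to gauge this on your own workload is to compare WAL positions before and after a representative transaction, with and without REPLICA IDENTITY FULL (a sketch; the LSN shown is made up, use the value returned by the first query):

-- Note the current WAL position.
SELECT pg_current_wal_lsn();                                  -- e.g. 0/3000F50

-- Run a representative write.
UPDATE xata.foo SET age = age + 1 WHERE id = 'michael';

-- Measure how many bytes of WAL were generated since.
SELECT pg_wal_lsn_diff(pg_current_wal_lsn(), '0/3000F50');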

Let’s look at it by event type:

Inserts are not impacted at all, because there are no “old values” and the large values are included regardless of the REPLICA IDENTITY setting.

For updates, if REPLICA IDENTITY is set to FULL, Postgres needs to keep the old values of all columns while executing the transaction and then write them to the WAL.

If there are TOAST values in the row, they need to be loaded into memory, decompressed, and then logged to the WAL. The relevant code that loads them into memory is this, with most of the hard work happening in toast_flatten_tuple:

if (replident == REPLICA_IDENTITY_FULL)
{
  /*
   * When logging the entire old tuple, it very well could contain
   * toasted columns. If so, force them to be inlined.
   */
  if (HeapTupleHasExternal(tp))
  {
    *copy = true;
    tp = toast_flatten_tuple(tp, desc);
  }
  return tp;
}

In conclusion, the size of the WAL events for updates is approximately doubled if there are no TOASTed values, and potentially way more than doubled if there are TOASTed values.

For deletes, normally only the key is logged in the WAL, and by enabling REPLICA IDENTITY FULL, all columns become part of the key, meaning that the whole row is logged and sent over the wire on delete.

Based on the analysis above, we expected the cost of REPLICA IDENTITY FULL to be worth it in our case, so we decided to put it to the test! We wanted to simulate something close to a worst-case scenario (but still realistic), so we extended our benchmarking suite with a test that uses a lot of large values and performs a lot of updates. The test didn’t push the resources to the limit, but it was heavy enough for us to measure the impact.

Here are the results of running the benchmark test before and after enabling REPLICA IDENTITY FULL.

CPU time before and after
Replication slot lag before and after

Peak CPU time showed a modest increase, from 30% to about 35%. There was a similarly modest increase in the peak replication slot lag.

If you’re using logical replication between instances, the impact from REPLICA IDENTITY FULL will likely be manageable as long as the replicated tables have a primary key. If you are on Postgres 16 already, having no primary key but another unique index might also be okay.

In any case, there's an impact on the amount of data written to the WAL and sent over the network. The more UPDATE and DELETE traffic you have, the more significant this impact will be.

If you find the above interesting and want to work on similar challenges, check out the Xata careers page.

Comment on Hacker News.
