Analyzing the performance impact of REPLICA IDENTITY FULL by looking at the PostgreSQL source code and performing benchmarks.
Written by
Tudor Golubenco
Published on
June 7, 2023
Hey there, you are on the Xata engineering blog. Xata is a serverless data platform that makes building on top of PostgreSQL. Sign up today!
At Xata, we've come to rely on the Postgres logical replication events quite a bit. Why? Because they provide us with a dependable stream of events about every single change that occurs within the database. We use them for syncing the data to Elasticsearch, web-hooks, deleting attachments after the parent row was deleted, and more. One of the configuration parameters for logical replication is the so called “replica identity”:
A published table must have a “replica identity” configured in order to be able to replicate
UPDATE
andDELETE
operations, so that appropriate rows to update or delete can be identified on the subscriber side. By default, this is the primary key, if there is one.
You’d typically use the primary key or an index as the “replica identity” and avoid setting the “replica identity” to full. The reason is this quote from the Postgres docs (emphasis mine):
If the table does not have any suitable key, then it can be set to replica identity “full”, which means the entire row becomes the key. This, however, is very inefficient and should only be used as a fallback if no other solution is possible.
That sounds quite discouraging, however, there are some good reasons for which you might still want to consider enabling it (see next section), so it’s worth digging a bit deeper and analyzing what exactly happens when you enable it and how it affects performance. This is what this blog post is about.
In our case there were two things that convinced us that we might want to consider REPLICA IDENTITY FULL:
To illustrate the above, let’s look at an example using wal2json for convenience. An update event looks something like this:
{
"kind": "update",
"schema": "xata",
"table": "foo",
"columnnames": ["id", "name", "age"],
"columntypes": ["text", "text", "integer"],
"columnvalues": ["michael", "Michael Scott", 42],
"oldkeys": {
"keynames": ["id"],
"keytypes": ["text"],
"keyvalues": ["michael"]
}
}
The above was generated after executing a command like the following:
UPDATE foo SET age=age+1 WHERE id='michael';
The table has a primary key (the id
column) and that is used as the replica identity.
Things to note:
columnvalues
the new values for the id
, name
, and age
columns are included.oldkeys
only the id
column is included, because that’s what the REPLICA IDENTITY
is set to.The table, however, has another column, named description
which is not included at all in the event. Why? Because the value of it is large and it is TOASTed. If the UPDATE command would have changed the description, then it would have been in the columnvalues
but because it was not touched, it is skipped.
Now let’s compare it with the same event generated with REPLICA IDENTITY FULL
:
ALTER TABLE foo REPLICA IDENTITY FULL;
The event for the same UPDATE
command above looks like this:
{
"kind": "update",
"schema": "xata",
"table": "foo",
"columnnames": ["id", "name", "age"],
"columntypes": ["text", "text", "integer"],
"columnvalues": ["michael", "Michael Scott", 43],
"oldkeys": {
"keynames": ["id", "name", "age", "description"],
"keytypes": ["text", "text", "integer", "text"],
"keyvalues": ["michael", "Michael Scott", 42, "<redacted very large description>"]
}
}
Things to note:
columnvalues
looks the same. It contains all the columns and their new values, but not the TOASTed column/value pairs.oldkeys
now contains all the columns with their values before the update, including the TOAST value for the description
column.It should be said that with the replica identity set to the primary key or another unique index, you get enough data out in the replication stream to completely sync a Postgres replica in sync. This means that you can overcome the limitations by keeping state outside of Postgres. However, the amount of work that you need to do outside of Postgres might become significant both in terms of resources needed and added code (compared to a single ALTER TABLE statement).
The quote mentioned at the beginning of the blog post, with the “very inefficient” wording is actually from the — currently stable — Postgres 15 docs , but it was changed in the Postgres 16 docs — currently under development:
If the table does not have any suitable key, then it can be set to replica identity
FULL
, which means the entire row becomes the key. When replica identityFULL
is specified, indexes can be used on the subscriber side for searching the rows. […] If there are no such suitable indexes, the search on the subscriber side can be very inefficient, therefore replica identityFULL
should only be used as a fallback if no other solution is possible.
The above hints that the “very inefficient” part refers to the subscriber that needs to identify the rows that were updated or deleted, and that this is being improved in version 16 so that it’s able to use indexed columns even if they are not set explicitly in the replica identity.
What is not clear from the docs above is that also in previous versions of Postgres, if the table has a primary key, it is used by the subscriber to find the affected row. You can see the code here, it looks for a replica index, and if it’s not found, it uses the PK instead.
/*
* Get replica identity index or if it is not defined a primary key.
*
* If neither is defined, returns InvalidOid
*/
static Oid
GetRelationIdentityOrPK(Relation rel)
{
Oid idxoid;
idxoid = RelationGetReplicaIndex(rel);
if (!OidIsValid(idxoid))
idxoid = RelationGetPrimaryKeyIndex(rel);
return idxoid;
}
This is good news because it means that the “very inefficient” case is only relevant to tables that have no primary key.
With Postgres 16, the “very inefficient” case is reduced even further, because now Postgres can use other indexes beyond the PK, even if they are not explicitly set in the REPLICA IDENTITY
. The patch for this change can be found here. After the patch, the GetRelationIdentityOrPK
function stays the same, but the caller routine makes another effort to find a suitable index to use:
/*
* Simple case, we already have a primary key or a replica identity index.
*/
idxoid = GetRelationIdentityOrPK(localrel);
if (OidIsValid(idxoid))
return idxoid;
if (remoterel->replident == REPLICA_IDENTITY_FULL)
{
/*
* We are looking for one more opportunity for using an index. If
* there are any indexes defined on the local relation, try to pick a
* suitable index.
*
* The index selection safely assumes that all the columns are going
* to be available for the index scan given that remote relation has
* replica identity full.
*
* Note that we are not using the planner to find the cheapest method
* to scan the relation as that would require us to either use lower
* level planner functions which would be a maintenance burden in the
* long run or use the full-fledged planner which could cause
* overhead.
*/
return FindUsableIndexForReplicaIdentityFull(localrel, attrMap);
}
While the worst potential impact of REPLICA IDENTITY FULL
is on the replica, as we’ve seen above, it does also impact the primary because it requires more data to be stored in the WAL and it causes it to send more data over the network to the subscribers. This leads to increased IO bandwidth consumption, higher CPU usage, and more RAM usage.
How much more? It depends on the type of write traffic you have and how many large values you have, as well as the frequency of updates to those values.
Let’s look at it by the event type:
Inserts are not impacted at all, because there are no “old values” and the large values are included regardless of the REPLICA IDENTITY
setting.
If REPLICA IDENTITY
is set to FULL
, Postgres needs to keep the old values for all columns while executing the transaction and then write them to the WAL.
If there are TOAST values in the row, they need to be loaded in memory, decompressed, and then logged to the WAL. The relevant code that loads them to the memory is this, with most of the hard work happening in toast_flatten_tuple
.
if (replident == REPLICA_IDENTITY_FULL)
{
/*
* When logging the entire old tuple, it very well could contain
* toasted columns. If so, force them to be inlined.
*/
if (HeapTupleHasExternal(tp))
{
*copy = true;
tp = toast_flatten_tuple(tp, desc);
}
return tp;
}
In conclusion, the size of the WAL events for updates is approximately doubled if there are no TOASTed values, and potentially way more than doubled if there are TOASTed values.
For deletes, normally only the key is logged in the WAL, and by enabling REPLICA IDENTITY FULL
, all columns become part of the key, meaning that the whole row is logged and sent over the wire on delete.
Based on the analysis above, we were guessing that the cost of REPLICA IDENTITY FULL
would be worth it in our case, so we decided to put it to test! We wanted to simulate something close to the worst case scenario (but still realistic), so we have extended our benchmarking test suite to have a test with a lot of large values and performing a lot of updates. The test didn’t push the resources to the limit but was heavy enough for us to measure the impact.
Here are the results of running the benchmark test before and after enabling REPLICA IDENTITY FULL
.
The CPU time showed a modest increase in peak times from 30% to about 35%. There was a similar modest increase in the peak replication slot lag.
If you’re using logical replication between instances, the impact from REPLICA IDENTITY FULL
will likely be manageable as long as the replicated tables have a primary key. If you are on Postgres 16 already, having no primary key but another unique index might also be okay.
In any case, there's an impact in the amount of data written to the WAL and sent over the network. The more UPDATE
and DELETE
traffic that you have, the more significant this impact will be.
If you find the above interesting and want to work on similar challenges, check out the Xata careers page.
Comment on Hacker News.
Xata provides the best free plan in the industry. It is production ready by default and doesn't pause or cool-down. Take your time to build your business and upgrade when you're ready to scale.
Copyright © 2024 Xatabase Inc.
All rights reserved.