
1. Prerequisites
- A Xata account (sign up here)
- The Xata CLI installed:
- A PostgreSQL database with:
- Logical replication enabled
- Role with permissions to create a replication slow (
xata clone stream
command does that automatically) - Network connectivity from Xata to your database
2. Enable logical replication on source database
First, ensure your source PostgreSQL database has logical replication enabled. You’ll need to set these parameters:3. Create a Xata project and branch
In the Console, create a new project and then click the Create main branch button to create the PostgreSQL instance. For streaming replication, consider using at least 1 replica to ensure high availability during continuous synchronization. Select an instance size that can handle your expected write throughput.Note: Streaming replication maintains a persistent connection to your source database. Ensure your network allows stable, long-lived connections between Xata and your PostgreSQL instance.
4. Configure the Xata CLI
Authenticate the CLI by running:5. Configure streaming replication
Generate a configuration for the streaming process:CONN_STRING
is your PostgreSQL connection string with replication permissions.
The configuration prompt will ask you to:
- Select tables to replicate
- Set up transformation pipelines i.e. anonymization rules
.xata/clone.yaml
that you can further customize.
6. Initialize and start streaming
- Create an initial snapshot of your specified tables
- Set up the streaming pipeline
- Begin continuous replication
7. Advanced configuration
Filtering specific tables
To stream only specific tables, use the--filter-tables
flag:
*.*
Custom transformations
Edit your.xata/clone.yaml
file to add custom transformations:
Running with Docker
For production deployments, consider running the streaming process in a containerized environment:10. Handling failures and recovery
If the streaming connection is interrupted, the replication slot ensures no data is lost. Simply restart the streaming command:xata clone stream
process and do not wish to run streaming replication again, clean up the replication slot and
other pgstream
objects using xata stream destroy
command.
Not cleaning up the replication slot will cause the WAL to be aggregated continuously and that would lead to full disk space. Use options like max_slot_wal_keep_size
to keep the max WAL size in check.
Summary
- You now have real-time streaming replication (Postgres’s logical replication) from your PostgreSQL database to Xata
- Changes in your source database are automatically synchronized
- Your data can be anonymized in transit using configurable transformers
- The replication slot ensures no data loss during network interruptions