Introduction
This tutorial will showcase the use of pgstream to replicate data from a PostgreSQL database to a webhook server. You can also check out this blogpost explaining how to use pgstream with webhooks.Requirements
- A source PostgreSQL database
- A target webhook server
- pgstream (see installation instructions for more details)
Demo
Environment setup
The first step is to start the PostgreSQL database that will be used as source for replication. Thepgstream repository provides a docker installation that will be used for the purposes of this tutorial, but can be replaced by any available PostgreSQL server, as long as it has wal2json installed.
To start the docker provided PostgreSQL server, run the following command:
5432.
Database initialisation
Once the PostgreSQL server is up and running, the next step is to initialise pgstream. This will create thepgstream schema in the configured Postgres database, along with the tables/functions/triggers required to keep track of the schema changes. See Tracking schema changes section for more details. This step will also create a replication slot on the source database which will be used by the pgstream service.
The initialisation step allows to provide both the URL of the PostgreSQL database and the name of the replication slot to be created. The PostgreSQL URL is required, but the replication slot name is optional. If not provided, it will default to pgstream_<dbname>_slot, where <dbname> is the name of the PostgreSQL database. The configuration can be provided either by using the CLI supported parameters, or using the environment variables.
For this tutorial, we’ll create a replication slot with the name pgstream_tutorial_slot.
-
Using the
--initflag in theruncommand - Using CLI parameters:
- Using environment variables:
destroy CLI command.
Prepare pgstream configuration
Listener
In order to run pgstream, we need to provide the configuration required to run the PostgreSQL to webhook server replication. First, we configure the listener module that will be listening to the WAL on the source PostgreSQL database. This requires the PostgreSQL database URL, which will be the one from the docker PostgreSQL server we started and setup in the previous steps.Processor
With the listener side ready, the next step is to configure the processor. Since we want to replicate to a webhook server, we need to configure the webhooks processor. The webhooks processor runs a server that accepts webhook subscriptions, to keep track of which webhooks need to be called when an event is received. We need to configure where we want that subscriptions table to be stored, usually the source database will be a good option.pg2webhook_tutorial.env. An equivalent pg2webhook_tutorial.yaml configuration can be found below the environment one, and can be used interchangeably.
- Without initial snapshot
- With initial snapshot
Validate pgstream status
We can validate that the initialisation and the configuration are valid by running the status command before starting pgstream.
Run pgstream
With the configuration ready, we can now run pgstream. In this case we set the log level as trace to provide more context for debugging and have more visibility into what pgstream is doing under the hood.
/tools/webhook directory, which will listen on localhost:9910. This dummy webhook server just prints the events received to output in JSON format for validation.
http://localhost:9900/webhooks/subscribe, so that’s where we should send the requests. In this example we’ll subscribe to insert events for all tables.
I(insert), U(update), D(delete) and T(truncate). If no events are provided, all will be included. We can also subscribe to a specific schema or table by providing them in the request body. If they are not provided, all will be included.
For example, if we wanted to subscribe to all events for the test table of the public schema, the request would look like this:
webhook_subscriptions table to be created. In our case, this is the source PostgreSQL database.
Verify webhook events
Now we can start populating the source PostgreSQL database and receiving events in our webhook server.pgstream.schema_log table that we’ll be notified about, since it matches the subscription parameters.
REPLICA IDENTITY to FULL.
Populate event metadata
In this tutorial we haven’t used the injector to populate the metadata event information, which is why it appears empty in the events. If the webhook notifier requires that metadata information, the processor configuration can be udpated by setting the injector store URL. This is the database that contains thepgstream.schema_log table, which the injector uses to retrieve schema information to populate the metadata (more details can be found in the architecture section). In this case, it’s the source PostgreSQL database.
Troubleshooting
Here are some common issues you might encounter while following this tutorial and how to resolve them:1. Error: Connection refused
- Cause: The PostgreSQL database or webhook server is not running.
- Solution:
- Ensure the Docker containers for the PostgreSQL database and webhook server are running.
- Verify the database and webhook server URLs in the configuration.
2. Error: Replication slot not found
- Cause: The replication slot was not created during initialization.
- Solution:
- Reinitialize
pgstreamor manually create the replication slot. - Run the
pgstream statuscommand to validate the initialisation was successful. - Verify the replication slot exists by running:
- Reinitialize
3. Error: Webhook events not received
- Cause: The webhook subscription was not registered correctly.
- Solution:
- Verify the subscription by querying the
pgstream.webhook_subscriptionstable: - Ensure the webhook server is running and reachable.
- Verify the subscription by querying the
4. Error: Permission denied
- Cause: The database user does not have sufficient privileges.
- Solution:
- Grant the required privileges to the database user:
- Grant the required privileges to the database user:
5. Error: Invalid configuration
- Cause: The configuration file contains invalid or missing values.
- Solution:
- Double-check the
pg2webhook_tutorial.envfile for typos or missing variables. - Refer to the pgstream configuration documentation for details on required variables.
- Run the
pgstream statuscommand to validate the configuration is correct.
- Double-check the
6. Error: Event metadata not populated
- Cause: The injector store URL is not configured, or the schema log is not acknowledged.
- Solution:
- Add the injector store URL to the configuration:
- Trigger a schema change (e.g., rename a table) to acknowledge the schema log:
- Add the injector store URL to the configuration:
7. Error: Stale webhook subscription cache
- Cause: The subscription cache is not refreshing frequently enough.
- Solution:
- Adjust the cache refresh interval in the configuration:
- Adjust the cache refresh interval in the configuration:
Summary
In this tutorial, we successfully configuredpgstream to replicate data from a PostgreSQL database to a webhook server. We:
- Set up the source PostgreSQL database and initialized
pgstream. - Configured the listener to capture changes from the PostgreSQL WAL.
- Configured the processor to send events to a webhook server.
- Verified that database changes triggered webhook events.
- Explored how to update webhook subscriptions and use the injector for metadata enrichment.
pgstream can be used to integrate PostgreSQL with webhook-based systems, enabling real-time event-driven architectures. For more advanced use cases, refer to the other pgstream tutorials.