Global Flags
These flags are available for all commands:

--config, -c
Description: .env or .yaml config file to use with pgstream if any
Default: -

--log-level
Description: Log level for the application. One of trace, debug, info, warn, error, fatal, panic
Default: debug

--help, -h
Description: Show help information
Default: -
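For example, the global flags can be combined with any of the commands described below (the config file path is a placeholder, and the file is assumed to contain the remaining settings):

```sh
# Run a command with an explicit config file and a quieter log level
pgstream status --config /path/to/pgstream.yaml --log-level warn
```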
Commands
init
Initialises pgstream, creating the replication slot and the relevant tables/functions/triggers under the configured internal pgstream schema. It performs the same operations as the --init flag on the run command.

The init command prepares your PostgreSQL database for streaming by:
- Creating a logical replication slot with the specified name
- Creating the internal pgstream schema for tracking changes
- Setting up necessary functions and triggers for change data capture
- Configuring the database objects required for logical replication
Prerequisites:
- PostgreSQL must have wal_level = logical
- User must have replication privileges (REPLICATION role)
- max_replication_slots must allow for additional slots
- User must have privileges to create schemas, functions, and triggers
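These settings can be checked with standard PostgreSQL commands before running init (the connection string is a placeholder):

```sh
# Verify that logical replication is enabled and that slots are available on the source
psql "postgres://postgres:postgres@localhost:5432/mydb" -c "SHOW wal_level;"
psql "postgres://postgres:postgres@localhost:5432/mydb" -c "SHOW max_replication_slots;"
# Changing wal_level requires a server restart, e.g.:
#   ALTER SYSTEM SET wal_level = logical;
```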
Flags:
- --postgres-url - Source postgres URL where pgstream setup will be run
- --replication-slot - Name of the postgres replication slot to be created by pgstream on the source url
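A typical invocation looks like the following (connection string and slot name are placeholders):

```sh
# Create the replication slot and the internal pgstream schema on the source database
pgstream init \
  --postgres-url "postgres://postgres:postgres@localhost:5432/mydb" \
  --replication-slot pgstream_slot
```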
run
Run starts a continuous data stream from the configured source to the configured target.

The run command is the main operation mode for pgstream. It:
- Establishes a connection to the source database
- Connects to the existing replication slot (created by init)
- Continuously reads WAL events from the replication stream
- Processes and transforms data according to configuration
- Streams changes to configured targets (Kafka, PostgreSQL, Elasticsearch, OpenSearch)
- Runs continuously until interrupted (Ctrl+C) or receives a termination signal
- Gracefully shuts down on SIGTERM/SIGINT
- Resumes from the last confirmed WAL position
Prerequisites:
- Database must be initialized with pgstream init
- Replication slot must exist and be available
- Target systems must be accessible and properly configured
- Source database must have logical replication enabled
Flags:
- --source - Source type. One of postgres, kafka
- --source-url - Source URL
- --target - Target type. One of postgres, opensearch, elasticsearch, kafka
- --target-url - Target URL
- --replication-slot - Name of the postgres replication slot for pgstream to connect to
- --snapshot-tables - List of tables to snapshot if initial snapshot is required, in the format <schema>.<table>. If not specified, the schema public will be assumed. Wildcards are supported
- --reset - Whether to reset the target before snapshotting (only for postgres target)
- --profile - Whether to expose a /debug/pprof endpoint on localhost:6060
- --init - Whether to initialize pgstream before starting replication
- --dump-file - File where the pg_dump output will be written if initial snapshot is enabled when using pgdump/restore
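For example, to replicate from one Postgres database to another, initialising pgstream and snapshotting the public schema first (URLs and slot name are placeholders):

```sh
# Initialise, snapshot the public schema, then stream changes until interrupted
pgstream run \
  --source postgres --source-url "postgres://postgres:postgres@localhost:5432/source_db" \
  --target postgres --target-url "postgres://postgres:postgres@localhost:7654/target_db" \
  --replication-slot pgstream_slot \
  --snapshot-tables "public.*" \
  --init
```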
Profiling output files (when --profile is enabled):
- cpu.prof - CPU profiling data for performance analysis
- mem.prof - Memory allocation profiling data
snapshot
Snapshot performs a snapshot of the configured source Postgres database into the configured target.

The snapshot command creates a point-in-time copy of database tables. It:
- Connects to the source PostgreSQL database
- Reads all existing data from specified tables/schemas
- Transforms and streams the data to configured targets
- Exits after completing the snapshot operation
Prerequisites:
- Source PostgreSQL database must be accessible
- Target system must be accessible and properly configured
- User must have SELECT privileges on tables to be snapshotted
- For PostgreSQL targets: user must have write privileges
Flags:
- --postgres-url - Source postgres database to perform the snapshot from
- --target - Target type. One of postgres, opensearch, elasticsearch, kafka
- --target-url - Target URL
- --tables - List of tables to snapshot, in the format <schema>.<table>. If not specified, the schema public will be assumed. Wildcards are supported
- --reset - Whether to reset the target before snapshotting (only for postgres target)
- --profile - Whether to produce CPU and memory profile files, as well as exposing a /debug/pprof endpoint on localhost:6060
- --dump-file - File where the pg_dump output will be written
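For example, a one-off snapshot of the public schema into OpenSearch (URLs are placeholders):

```sh
# Copy the current contents of the public schema into the target, then exit
pgstream snapshot \
  --postgres-url "postgres://postgres:postgres@localhost:5432/source_db" \
  --target opensearch \
  --target-url "http://admin:admin@localhost:9200" \
  --tables "public.*"
```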
Common use cases:
- Bulk data export for analytics
- Creating test datasets
- Backfilling data after system setup
Profiling output files (when --profile is enabled):
- cpu.prof - CPU profiling data for performance analysis
- mem.prof - Memory allocation profiling data
status
Checks the status of pgstream initialisation and provided configuration.

The status command provides information about:
- Replication slot status
- Internal pgstream schema and objects status
- Overall streaming infrastructure health
- Configuration validation results
Prerequisites:
- Access to the source PostgreSQL database
- Replication slot should exist
Flags:
- --postgres-url - Source postgres URL where pgstream has been initialised
- --replication-slot - Name of the postgres replication slot created by pgstream on the source url
- --json - Output the status in JSON format
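For example, checking the setup and printing the result as JSON (the jq pipe is optional and assumes jq is installed; the URL and slot name are placeholders):

```sh
# Inspect the pgstream setup on the source database
pgstream status \
  --postgres-url "postgres://postgres:postgres@localhost:5432/mydb" \
  --replication-slot pgstream_slot \
  --json | jq .
```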
destroy
It destroys any pgstream setup, removing the replication slot and all the relevant tables/functions/triggers, along with the internal pgstream schema.

The destroy command cleans up all resources created by pgstream init:
- Drops the replication slot
- Removes the internal pgstream schema and all its objects
- Removes all pgstream-related functions and triggers
- ⚠️ Warning: This is destructive and will lose replication position
Prerequisites:
- Access to the source PostgreSQL database
- User must have privileges to drop schemas, functions, and replication slots
- pgstream should be initialized (objects should exist to be destroyed)
Flags:
- --postgres-url - Source postgres URL where pgstream destroy will be run
- --replication-slot - Name of the postgres replication slot to be deleted by pgstream from the source url
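For example (placeholders again; see the notes below before running this against a live setup):

```sh
# Remove the replication slot and the internal pgstream schema from the source database
pgstream destroy \
  --postgres-url "postgres://postgres:postgres@localhost:5432/mydb" \
  --replication-slot pgstream_slot
```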
Important considerations:
- This will stop any running pgstream instances using these resources
- You will lose the current replication position
- All pgstream tracking data will be permanently removed
- You can recreate resources later with pgstream init
version
Displays version information for pgstream.

Configuration
pgstream uses YAML or .env configuration files. The configuration can be specified via:
- Command-line flag: --config /path/to/config.yaml or -c /path/to/config.env
- Environment variable: PGSTREAM_CONFIG=/path/to/config.yaml
Examples
Complete Setup Workflow
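A sketch of the full lifecycle using the commands and flags documented above; the connection strings, slot name and table pattern are placeholders (see the per-command examples for details):

```sh
SOURCE_URL="postgres://postgres:postgres@localhost:5432/source_db"
TARGET_URL="postgres://postgres:postgres@localhost:7654/target_db"

# 1. Prepare the source database
pgstream init --postgres-url "$SOURCE_URL" --replication-slot pgstream_slot

# 2. Verify the setup
pgstream status --postgres-url "$SOURCE_URL" --replication-slot pgstream_slot

# 3. Snapshot the public schema and stream changes until interrupted
pgstream run \
  --source postgres --source-url "$SOURCE_URL" \
  --target postgres --target-url "$TARGET_URL" \
  --replication-slot pgstream_slot \
  --snapshot-tables "public.*"

# 4. Tear everything down once it is no longer needed
pgstream destroy --postgres-url "$SOURCE_URL" --replication-slot pgstream_slot
```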
Configuration File Workflow
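Assuming the source, target and replication settings are defined in a YAML or .env file (the file name here is a placeholder), the commands only need the config flag:

```sh
# All connection and pipeline settings come from the config file
pgstream init -c pgstream.yaml
pgstream run -c pgstream.yaml --log-level info
```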
Development with Profiling
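When --profile is enabled, the generated cpu.prof and mem.prof files and the /debug/pprof endpoint on localhost:6060 can be inspected with Go's standard pprof tooling (this assumes the Go toolchain is installed; URLs are placeholders):

```sh
# Run a snapshot with profiling enabled
pgstream snapshot \
  --postgres-url "postgres://postgres:postgres@localhost:5432/source_db" \
  --target postgres --target-url "postgres://postgres:postgres@localhost:7654/target_db" \
  --tables "public.*" \
  --profile

# Inspect the generated profiles once the command has finished
go tool pprof cpu.prof
go tool pprof mem.prof

# Or sample the live endpoint while run/snapshot is executing with --profile
go tool pprof "http://localhost:6060/debug/pprof/profile?seconds=30"
```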
Multi-target Streaming
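Since Kafka is supported both as a target and as a source, one way to fan out changes to several targets is to replicate into Kafka once and run a separate pgstream instance per downstream target. This is only a rough sketch: the broker address and URLs are placeholders, and in practice each instance would carry its Kafka topic and consumer settings in its own config file:

```sh
# Producer: stream WAL events from Postgres into Kafka
pgstream run --source postgres --source-url "postgres://postgres:postgres@localhost:5432/source_db" \
  --target kafka --target-url "localhost:9092" \
  --replication-slot pgstream_slot

# Consumers: one pgstream instance per downstream target, reading from Kafka
pgstream run --source kafka --source-url "localhost:9092" \
  --target opensearch --target-url "http://admin:admin@localhost:9200"

pgstream run --source kafka --source-url "localhost:9092" \
  --target postgres --target-url "postgres://postgres:postgres@localhost:7654/target_db"
```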
Environment Variable Configuration
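The config file location can also be supplied through the PGSTREAM_CONFIG environment variable described in the Configuration section, which is convenient in containerised or CI environments (the path is a placeholder):

```sh
# Point pgstream at the config file via the environment instead of the -c flag
export PGSTREAM_CONFIG=/path/to/pgstream.env
pgstream run --log-level info
```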
Troubleshooting
Common Issues
- Initialization failures: usually caused by a missing init prerequisite, such as wal_level not set to logical, missing replication privileges, or no free replication slots. Re-check the prerequisites listed under the init command and use pgstream status to verify the setup.

Best Practices
- Use run with the --init flag to ensure pgstream can properly replicate schema changes
- snapshot only requires read access on your Postgres source, so it's a good alternative for non-invasive syncs
- If you need a snapshot and replication, use run with initial snapshot to prevent data loss
- Use the status command to validate your transformer configuration and your pgstream replication setup
- Use destroy carefully, since it will remove everything used by pgstream, including the replication slot
Getting Help
For immediate help with any command, use the --help flag. Additional resources:
- GitHub Issues: https://github.com/xataio/pgstream/issues
- Documentation: https://github.com/xataio/pgstream/docs