Installation
Global Flags
These flags are available for all commands:
--config, -c
Description: .env or .yaml config file to use with pgstream if any
Default: -
--log-level
Description: Log level for the application. One of trace, debug, info, warn, error, fatal, panic
Default: debug
--help, -h
Description: Show help information
Default: -
Commands
init
Initialises pgstream, creating the replication slot and the relevant tables/functions/triggers under the configured internal pgstream schema. It performs the same operations as the --init flag on the run command.
The init command prepares your PostgreSQL database for streaming by:
- Creating a logical replication slot with the specified name
- Creating the internal pgstream schema for tracking changes
- Setting up necessary functions and triggers for change data capture
- Configuring the database objects required for logical replication
Prerequisites:
- PostgreSQL must have wal_level = logical
- User must have replication privileges (REPLICATION role)
- max_replication_slots must allow for additional slots
- User must have privileges to create schemas, functions, and triggers
Flags:
- --postgres-url - Source postgres URL where pgstream setup will be run
- --replication-slot - Name of the postgres replication slot to be created by pgstream on the source url
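As a minimal sketch of the flags above, the init step can be wrapped in a small shell helper. The connection URL, the slot name, and the helper name init_pgstream are hypothetical placeholders chosen for illustration; only the two flags come from the list above.

```shell
# Hypothetical values: replace them with your own connection details.
POSTGRES_URL="postgres://postgres:postgres@localhost:5432/postgres?sslmode=disable"
REPLICATION_SLOT="pgstream_example_slot"

# init_pgstream is an illustrative helper, not part of pgstream itself.
# It passes only the two flags documented for the init command.
init_pgstream() {
  pgstream init \
    --postgres-url "$POSTGRES_URL" \
    --replication-slot "$REPLICATION_SLOT"
}

# Call init_pgstream once the prerequisites above are met.
```

Quoting the URL keeps characters such as ? and & from being interpreted by the shell.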
run
Run starts a continuous data stream from the configured source to the configured target.
The run command is the main operation mode for pgstream. It:
- Establishes a connection to the source database
- Connects to the existing replication slot (created by init)
- Continuously reads WAL events from the replication stream
- Processes and transforms data according to configuration
- Streams changes to configured targets (Kafka, PostgreSQL, Elasticsearch, OpenSearch)
- Runs continuously until interrupted (Ctrl+C) or receives a termination signal
- Gracefully shuts down on SIGTERM/SIGINT
- Resumes from the last confirmed WAL position
Prerequisites:
- Database must be initialized with pgstream init
- Replication slot must exist and be available
- Target systems must be accessible and properly configured
- Source database must have logical replication enabled
Flags:
- --source - Source type. One of postgres, kafka
- --source-url - Source URL
- --target - Target type. One of postgres, opensearch, elasticsearch, kafka
- --target-url - Target URL
- --replication-slot - Name of the postgres replication slot for pgstream to connect to
- --snapshot-tables - List of tables to snapshot if initial snapshot is required, in the format <schema>.<table>. If not specified, the schema public will be assumed. Wildcards are supported
- --reset - Whether to reset the target before snapshotting (only for postgres target)
- --profile - Whether to expose a /debug/pprof endpoint on localhost:6060
- --init - Whether to initialize pgstream before starting replication
- --dump-file - File where the pg_dump output will be written if initial snapshot is enabled when using pgdump/restore
Output files (when --profile is enabled):
- cpu.prof - CPU profiling data for performance analysis
- mem.prof - Memory allocation profiling data
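A hedged sketch of a run invocation built from the flags above, streaming from Postgres to OpenSearch. Both URLs, the slot name, and the helper name start_replication are assumptions made for the example. The --init flag is included so that initialization happens as part of the same call.

```shell
# Hypothetical endpoints: replace them with your own source and target.
SOURCE_URL="postgres://postgres:postgres@localhost:5432/postgres?sslmode=disable"
TARGET_URL="http://localhost:9200"
REPLICATION_SLOT="pgstream_example_slot"

# start_replication is an illustrative helper, not part of pgstream itself.
# Any extra arguments (for example --profile) are appended through "$@".
start_replication() {
  pgstream run \
    --source postgres --source-url "$SOURCE_URL" \
    --target opensearch --target-url "$TARGET_URL" \
    --replication-slot "$REPLICATION_SLOT" \
    --init "$@"
}

# Example: start_replication --profile
```

Because run keeps going until it is interrupted, a helper like this would normally be started under a process supervisor or in a dedicated terminal.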
snapshot
Snapshot performs a snapshot of the configured source Postgres database into the configured target.
The snapshot command creates a point-in-time copy of database tables. It:
- Connects to the source PostgreSQL database
- Reads all existing data from specified tables/schemas
- Transforms and streams the data to configured targets
- Exits after completing the snapshot operation
Prerequisites:
- Source PostgreSQL database must be accessible
- Target system must be accessible and properly configured
- User must have SELECT privileges on tables to be snapshotted
- For PostgreSQL targets: user must have write privileges
Flags:
- --postgres-url - Source postgres database to perform the snapshot from
- --target - Target type. One of postgres, opensearch, elasticsearch, kafka
- --target-url - Target URL
- --tables - List of tables to snapshot, in the format <schema>.<table>. If not specified, the schema public will be assumed. Wildcards are supported
- --reset - Whether to reset the target before snapshotting (only for postgres target)
- --profile - Whether to produce CPU and memory profile files, as well as exposing a /debug/pprof endpoint on localhost:6060
- --dump-file - File where the pg_dump output will be written
Use cases:
- Bulk data export for analytics
- Creating test datasets
- Backfilling data after system setup
Output files (when --profile is enabled):
- cpu.prof - CPU profiling data for performance analysis
- mem.prof - Memory allocation profiling data
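The snapshot flags above could be combined as in the following sketch, which copies a single table into a Postgres target. The URLs, the table name public.users, and the helper name snapshot_table are hypothetical; the flags themselves are the ones listed above.

```shell
# Hypothetical endpoints: replace them with your own source and target.
SOURCE_URL="postgres://postgres:postgres@localhost:5432/postgres?sslmode=disable"
TARGET_URL="postgres://postgres:postgres@localhost:7654/postgres?sslmode=disable"

# snapshot_table is an illustrative helper, not part of pgstream itself.
# $1 is a table in the <schema>.<table> format described above.
# --reset only applies to postgres targets, which is what this sketch uses.
snapshot_table() {
  pgstream snapshot \
    --postgres-url "$SOURCE_URL" \
    --target postgres --target-url "$TARGET_URL" \
    --tables "$1" \
    --reset
}

# Example: snapshot_table "public.users"
```

Quoting the table argument matters when a wildcard is used, since an unquoted pattern may be expanded by the shell before pgstream sees it.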
status
Checks the status of pgstream initialisation and provided configuration.
The status command provides information about:
- Replication slot status
- Internal pgstream schema and objects status
- Overall streaming infrastructure health
- Configuration validation results
Prerequisites:
- Access to the source PostgreSQL database
- Replication slot should exist
Flags:
- --postgres-url - Source postgres URL where pgstream has been initialised
- --replication-slot - Name of the postgres replication slot created by pgstream on the source url
- --json - Output the status in JSON format
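A small sketch of a status check that keeps the JSON output for later inspection. The URL, slot name, output file name, and helper name check_status are assumptions. The structure of the JSON document is not described here, so the sketch stores it as-is rather than parsing it.

```shell
# Hypothetical values: replace them with your own connection details.
POSTGRES_URL="postgres://postgres:postgres@localhost:5432/postgres?sslmode=disable"
REPLICATION_SLOT="pgstream_example_slot"

# check_status is an illustrative helper, not part of pgstream itself.
# It writes whatever the status command prints to standard output.
check_status() {
  pgstream status \
    --postgres-url "$POSTGRES_URL" \
    --replication-slot "$REPLICATION_SLOT" \
    --json
}

# Example: check_status > pgstream-status.json
```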
validate
Validates different parts of the pgstream configuration.
The validate command allows you to validate specific aspects of your pgstream configuration before running it. Currently supports validating transformation rules.
validate rules
Validates transformation rules against the provided Postgres database schema.
The validate rules command checks your transformation rules for:
- Column existence and type compatibility
- Table and schema references
- Rule syntax and structure
- Compatibility with the source database schema
- Overall validity before applying them in production
Prerequisites:
- Access to the source PostgreSQL database
- Transformation rules defined in configuration or separate rules file
Flags:
- --postgres-url - Source postgres URL to validate the rules against
- --rules-file, -f - Path to a YAML file containing the transformation rules to validate
- --json - Output the validation status in JSON format
Use cases:
- Pre-deployment validation of transformation rules
- Testing rule changes against production schema
- CI/CD pipeline integration for rule validation
- Debugging transformation rule issues
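For the CI/CD use case, a gate along these lines is one possible shape. It rests on an assumption that is not confirmed by the text above: that pgstream validate rules exits with a non-zero status when validation fails. The URL, the rules file name, and the helper names are also hypothetical.

```shell
# Hypothetical values: replace them with your own.
POSTGRES_URL="postgres://postgres:postgres@localhost:5432/postgres?sslmode=disable"
RULES_FILE="transformation_rules.yaml"

# validate_rules is an illustrative helper, not part of pgstream itself.
validate_rules() {
  pgstream validate rules \
    --postgres-url "$POSTGRES_URL" \
    --rules-file "$RULES_FILE" \
    --json
}

# ci_gate ASSUMES a non-zero exit status on validation failure.
# Verify that behaviour for your pgstream version before relying on it.
ci_gate() {
  if validate_rules; then
    echo "rules OK"
  else
    echo "rules rejected" >&2
    return 1
  fi
}
```

If the exit status turns out not to reflect the validation result, the JSON output would have to be inspected instead.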
destroy
Destroys any pgstream setup, removing the replication slot and all the relevant tables/functions/triggers, along with the internal pgstream schema.
The destroy command cleans up all resources created by pgstream init:
- Drops the replication slot
- Removes the internal pgstream schema and all its objects
- Removes all pgstream-related functions and triggers
- ⚠️ Warning: This is destructive and will lose replication position
Prerequisites:
- Access to the source PostgreSQL database
- User must have privileges to drop schemas, functions, and replication slots
- pgstream should be initialized (objects should exist to be destroyed)
Flags:
- --postgres-url - Source postgres URL where pgstream destroy will be run
- --replication-slot - Name of the postgres replication slot to be deleted by pgstream from the source url
Notes:
- This will stop any running pgstream instances using these resources
- You will lose the current replication position
- All pgstream tracking data will be permanently removed
- You can recreate resources later with pgstream init
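Since destroy is irreversible, one option is to put a confirmation step in front of it. The sketch below is an illustrative wrapper and not a pgstream feature: it asks the operator to type the slot name before calling destroy. The URL, slot name, and helper name are hypothetical.

```shell
# Hypothetical values: replace them with your own connection details.
POSTGRES_URL="postgres://postgres:postgres@localhost:5432/postgres?sslmode=disable"
REPLICATION_SLOT="pgstream_example_slot"

# destroy_pgstream is an illustrative helper, not part of pgstream itself.
# It reads one line from standard input and aborts unless the line
# matches the slot name exactly.
destroy_pgstream() {
  printf 'Type the slot name (%s) to confirm destroy: ' "$REPLICATION_SLOT" >&2
  read -r answer
  if [ "$answer" != "$REPLICATION_SLOT" ]; then
    echo "aborted" >&2
    return 1
  fi
  pgstream destroy \
    --postgres-url "$POSTGRES_URL" \
    --replication-slot "$REPLICATION_SLOT"
}
```

The prompt goes to standard error so that the helper's standard output contains only what pgstream itself prints.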
version
Displays version information for pgstream.
Configuration
pgstream uses YAML or .env configuration files. The configuration can be specified via:
- Command-line flag: --config /path/to/config.yaml or -c /path/to/config.env
- Environment variable: PGSTREAM_CONFIG=/path/to/config.yaml
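The two ways of pointing pgstream at a configuration file can be sketched as follows. The file path is the placeholder used above, and the helper names are hypothetical. The sketch assumes the configuration file supplies the source and target settings that would otherwise be passed as flags to run.

```shell
# Placeholder path taken from the text above: replace it with a real file.
CONFIG_FILE="/path/to/config.yaml"

# Option 1: pass the file explicitly with --config (or -c).
run_with_flag() {
  pgstream run --config "$CONFIG_FILE"
}

# Option 2: set PGSTREAM_CONFIG for a single invocation.
run_with_env() {
  PGSTREAM_CONFIG="$CONFIG_FILE" pgstream run
}
```

The flag form is easier to read in scripts, while the environment variable can be convenient in container or CI settings where the command line is fixed.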
Examples
Complete Setup Workflow
Configuration File Workflow
Development with Profiling
Multi-target Streaming
Environment Variable Configuration
Troubleshooting
Common Issues
1. Initialization Failures
Command-specific Troubleshooting
Init Command:
Best Practices
- Use run with the --init flag to ensure pgstream can properly replicate schema changes
- snapshot only requires read access on your Postgres source, so it's a good alternative for non-invasive syncs
- If you need a snapshot and replication, use run with initial snapshot to prevent data loss
- Use the status command to validate your transformer configuration and your pgstream replication setup
- Use destroy carefully, since it will remove everything used by pgstream, including the replication slot
Getting Help
For immediate help with any command, pass the --help (-h) flag. Other resources:
- GitHub Issues: https://github.com/xataio/pgstream/issues
- Documentation: https://github.com/xataio/pgstream/docs