pgstream v1.1.0: Steps towards turning it into a service

What's new in pgstream v1.1.0: env vars in YAML config, JSON logs, Kubernetes health probes, custom TLS for OpenSearch, and fixes for composite primary keys.

By:

Noémi Ványi

pgstream v1.0.0 flipped a switch on how we replicate schema changes: schema logs are gone, and DDL flows directly through the WAL as logical messages. That release was about the model.

v1.1.0 is about polishing the rough edges. We focused on the things operators run into on the way to production:

  • secrets in config files
  • JSON format logging
  • readiness and liveness probes
  • data types pgstream couldn't yet replicate
  • TLS settings for search stores (OpenSearch, Elasticsearch)

We also fixed a handful of correctness bugs that only show up under realistic workloads (composite primary keys, snapshot-then-update flows, concurrent transformer use). We are happy to report that significant features and bug fixes are coming from you, the community. We are grateful for all your contributions, whether that is opening an issue or submitting a pull request.

This post walks through the highlights since v1.0.3.

Environment variables in YAML config

In the past you had two options for passing secrets to pgstream: set them through pgstream's environment variables, or add them to your configuration file. If you wanted to keep every setting in a configuration file, you had to hardcode secrets into it.

Unfortunately, the underlying YAML library doesn't support environment-variable expansion natively. The only workarounds were Helm templating or switching to env-var-based config completely.

v1.1.0 closes the gap. pgstream now expands ${VAR} references inside YAML configuration files before parsing, so secrets can live wherever you already manage them, be it Kubernetes secrets, Vault, or your shell.
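To make the idea concrete, here is a minimal Go sketch of expanding ${VAR} references in raw config text before it reaches a YAML parser. It is not pgstream's implementation: the function name expandEnv, the variable EXAMPLE_DB_PASSWORD, the url key, and the choice to expand unset variables to an empty string are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"os"
	"regexp"
)

// bracedVar matches ${NAME} references only; a bare $NAME is left untouched.
var bracedVar = regexp.MustCompile(`\$\{([A-Za-z_][A-Za-z0-9_]*)\}`)

// expandEnv replaces every ${NAME} in raw with the value returned by lookup.
// Unset variables expand to an empty string here; that choice is an assumption.
func expandEnv(raw string, lookup func(string) string) string {
	return bracedVar.ReplaceAllStringFunc(raw, func(ref string) string {
		name := bracedVar.FindStringSubmatch(ref)[1]
		return lookup(name)
	})
}

func main() {
	// Hypothetical variable name and YAML key, for illustration only.
	os.Setenv("EXAMPLE_DB_PASSWORD", "s3cret")
	raw := "url: postgres://app:${EXAMPLE_DB_PASSWORD}@db:5432/app\n"

	// The expanded text is what a YAML parser would then receive.
	fmt.Print(expandEnv(raw, os.Getenv))
}
```

Expanding the text before parsing keeps the parser unaware of secrets entirely, which is why the approach works even when the YAML library has no expansion support of its own.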

Logging: JSON output and color control

Previously pgstream emitted color-coded console logs unconditionally, as it is a CLI tool. When pgstream runs as a service, this format is unhelpful, for example when you're forwarding logs to Loki, Datadog, or any other aggregator that wants structured input. Two new options let you configure logging:

  • --log-format console|json (env: PGSTREAM_LOG_FORMAT)
  • --no-color (env: PGSTREAM_LOG_NO_COLOR)

These apply to the run and snapshot commands.
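As a rough illustration of what a console|json switch means for a Go service, the sketch below picks a log handler from a format string using the standard library's log/slog package. pgstream's own logger may be built differently; newLogger and the log fields are hypothetical names.

```go
package main

import (
	"fmt"
	"io"
	"log/slog"
	"os"
)

// newLogger returns a structured JSON logger for "json" and a plain text
// logger for "console". Any other value is rejected.
func newLogger(format string, w io.Writer) (*slog.Logger, error) {
	switch format {
	case "json":
		return slog.New(slog.NewJSONHandler(w, nil)), nil
	case "console":
		return slog.New(slog.NewTextHandler(w, nil)), nil
	default:
		return nil, fmt.Errorf("unknown log format %q", format)
	}
}

func main() {
	logger, err := newLogger("json", os.Stdout)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	// With the JSON handler, each record is a single JSON object per line,
	// which is the shape log aggregators expect.
	logger.Info("replication started", "slot", "example_slot")
}
```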

Health endpoints for orchestrators

Running pgstream under Kubernetes had an observability gap because it was missing readiness and liveness probes. v1.1.0 adds an opt-in HTTP server exposing two endpoints:

  • /health For liveness. Returns 200 as long as the process is up.
  • /ready For readiness. When a Postgres source is configured, it runs a SELECT 1 against it with a 2-second timeout; failure returns 503 with the underlying error. For Kafka sources or pipelines without a wired source, it behaves like the liveness check.

Both respond with JSON that includes the running version.

Enable it in the config file or via env: PGSTREAM_HEALTH_CHECK_ENABLED=true, PGSTREAM_HEALTH_CHECK_ADDRESS=0.0.0.0:9910. Both run and snapshot serve the endpoints. The default bind address is localhost:9910, so you'll need to change it to 0.0.0.0:9910 (or front it with a sidecar) for the kubelet to reach it from outside the pod.
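The behavior of the two endpoints can be sketched in Go roughly as follows. This is an illustration under assumptions, not pgstream's code: the JSON field names (status, version, error), the checkFunc type, and the helper names are hypothetical, and the readiness check is injected as a function where the real service would run SELECT 1 against Postgres.

```go
package main

import (
	"context"
	"encoding/json"
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

// version is a placeholder; a real binary would inject its build version.
const version = "v1.1.0"

// checkFunc reports whether the source is reachable, e.g. by running SELECT 1.
type checkFunc func(ctx context.Context) error

// alwaysDown is a stand-in check that simulates an unreachable source.
func alwaysDown(ctx context.Context) error { return errors.New("connection refused") }

func writeJSON(w http.ResponseWriter, code int, body map[string]string) {
	w.Header().Set("Content-Type", "application/json")
	w.WriteHeader(code)
	json.NewEncoder(w).Encode(body)
}

// newHealthMux wires /health (liveness) and /ready (readiness).
// A nil check makes /ready behave like liveness.
func newHealthMux(check checkFunc) *http.ServeMux {
	mux := http.NewServeMux()
	mux.HandleFunc("/health", func(w http.ResponseWriter, r *http.Request) {
		writeJSON(w, http.StatusOK, map[string]string{"status": "ok", "version": version})
	})
	mux.HandleFunc("/ready", func(w http.ResponseWriter, r *http.Request) {
		body := map[string]string{"status": "ok", "version": version}
		if check != nil {
			// Bound the check so a hung source cannot stall the probe.
			ctx, cancel := context.WithTimeout(r.Context(), 2*time.Second)
			defer cancel()
			if err := check(ctx); err != nil {
				body["status"] = "unavailable"
				body["error"] = err.Error()
				writeJSON(w, http.StatusServiceUnavailable, body)
				return
			}
		}
		writeJSON(w, http.StatusOK, body)
	})
	return mux
}

// probe issues an in-memory GET and returns the status code.
func probe(h http.Handler, path string) int {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, path, nil))
	return rec.Code
}

func main() {
	fmt.Println(probe(newHealthMux(nil), "/ready"))         // no source wired
	fmt.Println(probe(newHealthMux(alwaysDown), "/ready"))  // source unreachable
	fmt.Println(probe(newHealthMux(alwaysDown), "/health")) // process is still up
}
```

Keeping liveness independent of the source means an unreachable database marks the pod as not ready without causing the orchestrator to restart it.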

A stdout target for debugging pipelines

Issue #165 had been open since the early days: it asked for a way to see what's actually flowing through the pipeline without standing up a downstream target. v1.1.0 adds a stdout target writer that emits one NDJSON object per WAL event.

Enable it in the config file or with PGSTREAM_STDOUT_WRITER_ENABLED=true. Events are flushed immediately and each one advances the checkpoint, so the source's LSN/offset bookkeeping behaves identically to a real target. It's mutually exclusive with the other target writers. You can use it for validating filter rules, transformer output, or schema mappings before pointing pgstream at a real database or search cluster.
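For readers unfamiliar with NDJSON, the following Go sketch shows the general shape of such a writer: one JSON object per line, with a checkpoint recorded after each event. The walEvent struct, its field names, and the checkpoint field are hypothetical and do not reflect pgstream's actual event schema.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"strings"
)

// walEvent is a hypothetical, trimmed-down event shape for illustration.
type walEvent struct {
	LSN    string         `json:"lsn"`
	Action string         `json:"action"`
	Table  string         `json:"table"`
	Data   map[string]any `json:"data,omitempty"`
}

// ndjsonWriter emits one JSON object per line and records the last LSN it
// wrote, standing in for the checkpoint call a real writer would make.
type ndjsonWriter struct {
	enc        *json.Encoder
	checkpoint string
}

func newNDJSONWriter(w io.Writer) *ndjsonWriter {
	return &ndjsonWriter{enc: json.NewEncoder(w)}
}

// Write encodes the event (Encode appends the newline) and then advances
// the checkpoint, so every event is acknowledged individually.
func (n *ndjsonWriter) Write(ev walEvent) error {
	if err := n.enc.Encode(ev); err != nil {
		return err
	}
	n.checkpoint = ev.LSN
	return nil
}

// render runs events through the writer into a string and returns the
// output together with the final checkpoint.
func render(events []walEvent) (out string, checkpoint string, err error) {
	var sb strings.Builder
	w := newNDJSONWriter(&sb)
	for _, ev := range events {
		if err := w.Write(ev); err != nil {
			return "", "", err
		}
	}
	return sb.String(), w.checkpoint, nil
}

func main() {
	out, checkpoint, err := render([]walEvent{
		{LSN: "0/16B3748", Action: "I", Table: "users", Data: map[string]any{"id": 1}},
		{LSN: "0/16B3810", Action: "D", Table: "users"},
	})
	if err != nil {
		panic(err)
	}
	fmt.Print(out)
	fmt.Println("checkpoint:", checkpoint)
}
```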

New type support: ltree, cube, pgvector and large integers

Four type-coverage improvements, each with a different shape of fix:

ltree

pgstream now replicates ltree columns end-to-end, including under bulk ingest. The source registers a text codec for ltree, and the bulk-ingest path routes affected batches through the text COPY protocol so values land correctly on the target.

cube

pgx's binary COPY ... FROM STDIN BINARY doesn't know how to encode cube, so snapshots of tables with cube columns used to fail with "copy from: program limit exceeded: cube dimension is too large".

pgvector

Vector columns are now registered on every connection in the pool, so snapshots and replication of vector columns work consistently.

Large integers and bigint

Previously this was silent data corruption: no error, no warning, no log entry. JSON integers in wal2json output were decoded into float64 by default, and float64 has only 53 bits of mantissa, so any integer past 2⁵³ (≈ 9 × 10¹⁵) could be rounded somewhere between source and target. For a column like id bigint PRIMARY KEY, that meant UPDATEs and DELETEs could target the wrong row on the destination. Snowflake-style IDs, epoch-millisecond timestamps stored as bigint, and large accumulating counters were all in the blast radius. The same root cause also affected large integers nested inside jsonb payloads during snapshots and streaming (#686).
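The rounding is easy to reproduce with Go's standard encoding/json package, which decodes JSON numbers into float64 when the target is an untyped value. The sketch below contrasts that default with Decoder.UseNumber, which keeps the digits as text. It illustrates the class of bug only; the post does not say how pgstream fixed it, and the function names are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// decodeLossy uses the default decoding into any, where numbers become float64.
func decodeLossy(raw string) (int64, error) {
	var v map[string]any
	if err := json.Unmarshal([]byte(raw), &v); err != nil {
		return 0, err
	}
	f, ok := v["id"].(float64)
	if !ok {
		return 0, fmt.Errorf("id is not a number")
	}
	return int64(f), nil
}

// decodeExact asks the decoder for json.Number, which preserves every digit.
func decodeExact(raw string) (int64, error) {
	dec := json.NewDecoder(strings.NewReader(raw))
	dec.UseNumber()
	var v map[string]any
	if err := dec.Decode(&v); err != nil {
		return 0, err
	}
	n, ok := v["id"].(json.Number)
	if !ok {
		return 0, fmt.Errorf("id is not a number")
	}
	return n.Int64()
}

func main() {
	raw := `{"id": 9007199254740993}` // 2^53 + 1

	lossy, _ := decodeLossy(raw)
	exact, _ := decodeExact(raw)

	fmt.Println(lossy) // 9007199254740992: off by one, with no error
	fmt.Println(exact) // 9007199254740993
}
```

An off-by-one on a primary key is exactly the failure described above: the statement succeeds, but against a different row.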

Replication correctness fixes

Composite primary keys in WAL metadata

If your target was Postgres with on_conflict_action: update, and your table had a composite primary key whose first column was sensor_id, pgstream would build ON CONFLICT ("sensor_id") DO UPDATE SET ... using only the first PK column. Postgres would then reject it with "there is no unique or exclusion constraint matching the ON CONFLICT specification".

v1.1.0 stores the full primary-key column set, so the writer builds the correct ON CONFLICT (...) target and replication just works for tables with composite PKs.
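A simplified Go sketch of building the conflict target from every primary-key column looks like this. The helper names and the second column name (recorded_at) are made up for illustration; pgstream's writer is more involved.

```go
package main

import (
	"fmt"
	"strings"
)

// quoteIdent double-quotes an identifier, escaping embedded quotes.
func quoteIdent(name string) string {
	return `"` + strings.ReplaceAll(name, `"`, `""`) + `"`
}

// conflictTarget builds the ON CONFLICT target from all primary-key columns,
// not just the first one.
func conflictTarget(pkCols []string) string {
	quoted := make([]string, len(pkCols))
	for i, c := range pkCols {
		quoted[i] = quoteIdent(c)
	}
	return "ON CONFLICT (" + strings.Join(quoted, ", ") + ")"
}

func main() {
	// recorded_at is an illustrative second key column.
	fmt.Println(conflictTarget([]string{"sensor_id", "recorded_at"}))
	// ON CONFLICT ("sensor_id", "recorded_at")
}
```

Postgres only accepts a conflict target that matches an existing unique constraint or index, so the column list has to cover the whole key.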

Restore constraints before snapshot updates

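The schema snapshot deliberately restores indexes and constraints after data, because that's the right ordering for bulk loading. But the non-bulk path with on_conflict_action: update writes rows with INSERT ... ON CONFLICT DO UPDATE, and that statement needs the conflict target, a primary key or unique constraint or index, to exist on the target table when the row is written. Without it, Postgres rejects the statement.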

v1.1.0 splits the restore: when the snapshot is using the regular batch writer with conflict-update mode, primary keys, unique constraints, and unique indexes restore before the data snapshot runs. Foreign keys, regular indexes, and triggers still come after data, so bulk-load performance and FK semantics are unchanged.
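The ordering rule can be summarized in a few lines of Go. This is only a sketch of the decision described above: restorePlan, its parameters, and the object-kind labels are hypothetical and not taken from pgstream's code.

```go
package main

import "fmt"

// restorePlan returns which kinds of schema objects are restored before and
// after the data snapshot. The labels are illustrative.
func restorePlan(bulkIngest, onConflictUpdate bool) (before, after []string) {
	conflictTargets := []string{"primary keys", "unique constraints", "unique indexes"}
	deferred := []string{"foreign keys", "regular indexes", "triggers"}

	// Only the regular batch writer in conflict-update mode needs the
	// conflict targets to exist before rows are written.
	if !bulkIngest && onConflictUpdate {
		return conflictTargets, deferred
	}
	// Otherwise everything is restored after the data, as before.
	return nil, append(conflictTargets, deferred...)
}

func main() {
	before, after := restorePlan(false, true)
	fmt.Println("before data:", before)
	fmt.Println("after data: ", after)
}
```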

Custom TLS for search stores

OpenSearch 3 made HTTPS mandatory. Self-hosted clusters with self-signed certificates used to be a non-starter for pgstream. The search clients went through http.DefaultTransport with no TLS configuration hook, so even a correctly configured cluster would fail with "failed to verify certificate: x509: certificate signed by unknown authority". v1.1.0 adds a full TLS block with five fields, which are also available via PGSTREAM_SEARCH_TLS_* environment variables.
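To show why a TLS hook matters, here is a minimal Go sketch that builds an HTTP client trusting a custom CA and compares it with the default client against a local server using a self-signed certificate. The function names are hypothetical, the sketch covers only the CA-trust part, and it does not reproduce pgstream's five configuration fields.

```go
package main

import (
	"crypto/tls"
	"crypto/x509"
	"encoding/pem"
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// newTLSClient returns an HTTP client that trusts only the given PEM-encoded
// CA bundle. It builds its own transport instead of http.DefaultTransport.
func newTLSClient(caPEM []byte) (*http.Client, error) {
	pool := x509.NewCertPool()
	if !pool.AppendCertsFromPEM(caPEM) {
		return nil, errors.New("no valid certificates found in CA bundle")
	}
	transport := &http.Transport{
		TLSClientConfig: &tls.Config{RootCAs: pool, MinVersion: tls.VersionTLS12},
	}
	return &http.Client{Transport: transport}, nil
}

// demo starts a local HTTPS server with a self-signed certificate and reports
// whether the default client and the custom client can reach it.
func demo() (defaultOK, customOK bool, err error) {
	srv := httptest.NewTLSServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK)
	}))
	defer srv.Close()

	// Export the server's certificate as PEM, as an operator would hand it over.
	caPEM := pem.EncodeToMemory(&pem.Block{Type: "CERTIFICATE", Bytes: srv.Certificate().Raw})
	client, err := newTLSClient(caPEM)
	if err != nil {
		return false, false, err
	}

	// The default client has no knowledge of the self-signed certificate.
	if resp, err := http.Get(srv.URL); err == nil {
		resp.Body.Close()
		defaultOK = true
	}
	if resp, err := client.Get(srv.URL); err == nil {
		resp.Body.Close()
		customOK = true
	}
	return defaultOK, customOK, nil
}

func main() {
	defaultOK, customOK, err := demo()
	if err != nil {
		panic(err)
	}
	fmt.Println("default client succeeded:", defaultOK)
	fmt.Println("custom CA client succeeded:", customOK)
}
```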

Upgrading

v1.1.0 is a drop-in upgrade from v1.0.x. The only behavioral change you might trip on is the stricter byte-size parsing: if you had a malformed *_BYTES setting that was being silently ignored, the upgraded pgstream will now refuse to start until you fix it. Skim your config for _BYTES keys before you roll forward.

For installations still on v0.9.x, v1.1.0 adds an --upgrade flag to both init and run (#788). It cleans up the legacy v0.9 metadata objects and reinitializes the v1.x schema in a single step, replacing the manual pgstream destroy + pgstream init dance that earlier v1.x releases required when coming from v0.9.

Thanks

As mentioned in the introduction, this release pulls in fixes and features from across the community. We would like to thank everyone who reported issues and submitted pull requests.

What's next

We're working on a few larger pieces for upcoming releases to further support running pgstream as a service: preflight checks, resumable snapshots, broader replication coverage. Watch the issue tracker for the work in flight, and the docs for configuration details on everything covered above.

If you hit a problem or have a request, open an issue or file a PR. We read everything.
