Kubernetes Cron Task Xata clone (nightly)

Sample Kubernetes cron job to periodically copy the data from production to your staging system.

This contains a sample Kubernetes manifest file for setting up a periodic task to run Xata Clone.

Kubernetes spec

apiVersion: batch/v1
kind: CronJob
metadata:
  name: xata-clone
  namespace: xata-clone          # change if needed
spec:
  schedule: "0 0 * * *"   # change if needed
  jobTemplate:
    spec:
      template:
        spec:
          containers:
            - name: xata-clone
              image: ubuntu:latest
              env:
                - name: XATA_CLI_SOURCE_POSTGRES_URL
                  valueFrom:
                    secretKeyRef:
                      name: xata-secrets
                      key: source-postgres-url
                - name: XATA_API_REFRESH_TOKEN
                  valueFrom:
                    secretKeyRef:
                      name: xata-secrets
                      key: XATA_API_REFRESH_TOKEN
                - name: XATA_ORGANIZATIONID
                  valueFrom:
                    secretKeyRef:
                      name: xata-secrets
                      key: XATA_ORGANIZATIONID
                - name: XATA_PROJECTID
                  valueFrom:
                    secretKeyRef:
                      name: xata-secrets
                      key: XATA_PROJECTID
                - name: XATA_BRANCHNAME
                  valueFrom:
                    secretKeyRef:
                      name: xata-secrets
                      key: XATA_BRANCHNAME
                - name: XATA_DATABASENAME
                  valueFrom:
                    secretKeyRef:
                      name: xata-secrets
                      key: XATA_DATABASENAME
              command:
                - /bin/bash
                - -c
                - |
                  set -e

                  echo "Installing dependencies"
                  apt-get update && \
                    apt-get install -y curl jq perl gnupg lsb-release && \
                    echo "deb http://apt.postgresql.org/pub/repos/apt $(lsb_release -cs)-pgdg main" > /etc/apt/sources.list.d/pgdg.list && \
                    curl -fsSL https://www.postgresql.org/media/keys/ACCC4CF8.asc | gpg --dearmor -o /usr/share/keyrings/postgresql.gpg && \
                    echo "deb [signed-by=/usr/share/keyrings/postgresql.gpg] http://apt.postgresql.org/pub/repos/apt $(lsb_release -cs)-pgdg main" > /etc/apt/sources.list.d/pgdg.list && \
                    apt-get update && \
                    apt-get install -y postgresql-client-17

                  echo "Installing Xata CLI"
                  curl -fsSL https://xata.io/install.sh | bash
                  export PATH="$HOME/.config/xata/bin/:$PATH"

                  xata init --organization $XATA_ORGANIZATIONID --project $XATA_PROJECTID --database $XATA_DATABASENAME

                  xata status --json

                  echo "Checking out branch: $XATA_BRANCHNAME"
                  xata branch checkout "$XATA_BRANCHNAME"

                  echo "Updated status:"
                  xata status
                  xata branch view

                  echo "Starting clone..."
                  xata clone start --source-url "$XATA_CLI_SOURCE_POSTGRES_URL" --validation-mode=relaxed
          restartPolicy: OnFailure

Setting up the tasks

Create a dedicated namespace for the task (if you'd like, you can also use an existing one):

kubectl create namespace xata-clone

Step 1: Create secret

The task requires several env variables. Assuming you have the xata CLI already configured on your computer, you can create a secret with all the relevant information like this:

kubectl create secret generic xata-secrets \
  --namespace xata-clone \
  --from-literal=source-postgres-url="$RDS_URI" \
  --from-literal=XATA_API_REFRESH_TOKEN="$(xata auth refresh-token)" \
  --from-literal=XATA_ORGANIZATIONID="$(xata organization get id)" \
  --from-literal=XATA_PROJECTID="$(xata project get id)" \
  --from-literal=XATA_BRANCHNAME="main" \
  --from-literal=XATA_DATABASENAME="postgres"

The above assumes branch name is main and the database name is postgres, but you should adjust to your case.

Step 2: Create task

Assuming you saved the spec file above as xata-clone-crontask.yaml, you can create it with:

kubectl apply -f xata-clone-crontask.yaml

Step 3: Test the task

Create a test job manually based on the task:

kubectl create job -n xata-clone --from=cronjob/xata-clone xata-clone-manual

And watch the logs:

kubectl logs -l job-name=xata-clone-manual -f -n xata-clone

If there are any errors, you can delete the whole task (kubectl delete -f xata-clone-crontask.yaml), fix the issue, and repeat Step 2.