GKE Autopilot + Elastic Cloud

Hi everyone!
I was able to deploy the Temporal server to GKE Autopilot, and it mostly works, but the Elasticsearch integration behaves the opposite of what I expect: with ENABLE_ES="true" I cannot see any workflows, neither in the Temporal UI nor in the ES index itself, but with ENABLE_ES="false" it works as expected. I can see my workflow info in the Temporal UI, and I can see data in the index too.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: lm-temporal
  labels:
    app: lm-temporal
    tier: backend
spec:
  selector:
    matchLabels:
      app: lm-temporal
      tier: backend
  replicas: 1
  template:
    metadata:
      labels:
        app: lm-temporal
        tier: backend
    spec:
      serviceAccountName: k8s-cloud-sql-proxy-sa
      containers:
        - name: cloud-sql-proxy
          securityContext:
            # The default Cloud SQL proxy image runs as the
            # "nonroot" user and group (uid: 65532) by default.
            runAsNonRoot: true
          resources:
            limits:
              memory: "200Mi"
              cpu: "250m"
          lifecycle:
            preStop:
              exec:
                command: [ "sleep", "20" ]
          image: gcr.io/cloudsql-docker/gce-proxy:1.33.1
          command:
            - "/cloud_sql_proxy"
            - "-instances=hidden_secret=tcp:5432"
        - name: lm-temporal-app
          resources:
            limits:
              memory: "1024Mi"
              cpu: "800m"
          image: temporalio/auto-setup:1.18.0
          ports:
            - name: rpc
              containerPort: 7233
              protocol: TCP
          livenessProbe:
            initialDelaySeconds: 150
            tcpSocket:
              port: rpc
          env:
            - name: DB
              value: postgresql
            - name: DB_PORT
              value: "5432"
            - name: POSTGRES_USER
              valueFrom:
                secretKeyRef:
                  name: lm-temporal-config-secret
                  key: postgres.user
            - name: POSTGRES_PWD
              valueFrom:
                secretKeyRef:
                  name: lm-temporal-config-secret
                  key: postgres.password
            - name: POSTGRES_SEEDS
              value: localhost
            - name: ENABLE_ES
              value: "false"
            - name: ES_SCHEME
              value: https
            - name: ES_SEEDS
              value: hidden_secret.cloud.es.io
            - name: ES_PORT
              value: "443"
            - name: ES_VERSION
              value: "v8"
            - name: ES_VIS_INDEX
              value: temporal_visibility_v1_dev
            - name: ES_USER
              valueFrom:
                secretKeyRef:
                  name: lm-temporal-es-secret
                  key: elastic.user
            - name: ES_PWD
              valueFrom:
                secretKeyRef:
                  name: lm-temporal-es-secret
                  key: elastic.password

That is my full config above.
Unfortunately, I wasn't able to use the Helm chart, but I believe I set everything up correctly.

The question is: why does the ES integration only work with ENABLE_ES="false"? Am I missing something?

Here is the default Docker template from which your server config is created (for all service roles, since you are using the auto-setup image). You can also bash into your pod and look at the parsed config at /etc/temporal/config/docker.yaml.
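
For example, a quick way to inspect the rendered config, assuming the Deployment and container names from your manifest above:

# Open a shell in the Temporal container
kubectl exec -it deploy/lm-temporal -c lm-temporal-app -- sh
# Inside the pod, print the parsed server config
cat /etc/temporal/config/docker.yaml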

Where is ES deployed? My guess is that there is some issue with setting up or connecting to ES (check ES_SEEDS and whether you need to set a username/password).
Here is the auto-setup script that creates the ES indexes, if that helps.
If you disable ENABLE_ES, Temporal still creates the temporal_visibility database, which seems to work fine in your case (so you are using standard rather than advanced visibility).
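
A rough sketch of how I would check that (the ES hostname below is the placeholder from your manifest; ES_USER and ES_PWD are already set in the container environment, so you can run the curl commands from a shell inside the pod):

# Look for ES setup errors in the auto-setup output
kubectl logs deploy/lm-temporal -c lm-temporal-app
# Verify connectivity and credentials against the cluster
curl -s -u "$ES_USER:$ES_PWD" "https://hidden_secret.cloud.es.io:443/_cluster/health?pretty"
# Check whether the visibility index was actually created
curl -s -u "$ES_USER:$ES_PWD" "https://hidden_secret.cloud.es.io:443/temporal_visibility_v1_dev?pretty"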

Thank you, man! You helped a lot. The problem was with this parameter:

- name: ES_VERSION
  value: "v8"

Apparently, there is no separate index_template for Elasticsearch version 8. I changed it to "v7" and it works now!
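
For anyone hitting the same issue, a quick way to confirm that visibility data is actually flowing after the change (same placeholder host and credentials as in the manifest above):

# Document count in the visibility index; it should grow as workflows run
curl -s -u "$ES_USER:$ES_PWD" "https://hidden_secret.cloud.es.io:443/temporal_visibility_v1_dev/_count?pretty"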