HA of Temporal Server (Cluster)

Hi experts,
I am trying to set up a Temporal environment on k8s with high availability.
In the official documentation, I see that a cluster deployment through the Helm chart is recommended, but that seems a little complicated for me. We don't need to run different numbers of frontends or workers for now.
Put simply, can I achieve HA just by running multiple replicas of the Temporal server?
Is there any risk to Temporal synchronization?
Here is my k8s manifest for the Temporal server:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: temporal-server-deployment
spec:
  replicas: 2 # ??? Does this work for HA or is there any risk for temporal synchronization???
  selector:
    matchLabels:
      app: temporal-server
  template:
    metadata:
      labels:
        app: temporal-server
    spec:
      containers:
        - name: temporal-server
          image: temporalio/auto-setup:1.14.0
          env:
            - name: LANG
              value: "en_US.UTF-8"
            - name: DB
              value: postgresql
            - name: DB_PORT
              value: "5432"
            - name: POSTGRES_USER
              value: postgres
            - name: POSTGRES_PWD
              value: postgres
            - name: POSTGRES_SEEDS
              # value missing in the original post
            - name: DYNAMIC_CONFIG_FILE_PATH
              value: config/dynamicconfig/development.yaml
            - name: PROMETHEUS_ENDPOINT
              # value missing in the original post
          ports:
            - containerPort: 7233
            - containerPort: 8000
          volumeMounts:
            - name: config
              mountPath: /etc/temporal/config/dynamicconfig
          resources:
            requests: # this key was lost in the original post; "requests" is assumed (it may have been "limits")
              cpu: 100m
      volumes:
        - name: config
          configMap:
            name: temporal-server-configmap
            items:
              - key: development.yaml
                path: development.yaml

BTW, if this is not good practice, can anyone share how to generate the k8s manifests for each service?

image: temporalio/auto-setup:1.14.0

Using the auto-setup image for prod clusters is typically not recommended, but if you have a single-node cluster it should be OK for a small-scale app.
See also relevant forum post here.

Have you tried using replicas: 2, and did you run into any issues? I'm not sure whether this type of deployment would run into issues with cross-pod communication. Worth giving it a try.

I tried it and have found no issues so far.
There is one thing I want to confirm about the implementation mechanism:
no matter which server a workflow runs on, when I query the workflow list from the web UI, I will always get it, right?
I think the scenario is the same as in a typical Temporal cluster that runs different numbers of the internal services, for example 2 frontends and 3 instances each of matching, history, and worker.

Temporal relies on the db being fully consistent in all failure scenarios. I believe what you said is correct if all your replicas are configured to a single fully consistent db (and visibility db).
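
One way to check this empirically is to ask each replica's frontend for the workflow list and compare the results. The following is only a minimal sketch: it assumes tctl is installed and that port 7233 of each pod is reachable from where you run it. The pod addresses in the example call are hypothetical placeholders, and the TCTL variable is just an override hook in this script, not something tctl itself reads.

```shell
#!/usr/bin/env bash
# Sketch only: list workflows through each frontend address given as an argument.
# If every replica points at the same persistence and visibility databases,
# each address should report the same workflows.
list_workflows_on_each() {
  local tctl_bin="${TCTL:-tctl}"  # override hook for this script (assumption, not a tctl feature)
  local addr

  for addr in "$@"; do
    echo "== frontend at $addr =="
    # --address selects the frontend host:port; --namespace selects the namespace.
    "$tctl_bin" --address "$addr" --namespace default workflow list || return 1
  done
}

# Example call (the pod IPs are placeholders):
#   list_workflows_on_each 10.0.0.11:7233 10.0.0.12:7233
```

If the lists differ between replicas, the first thing to check is whether all of them really use the same database settings.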

Hi! Just wanted to follow up on this point in particular. In a production environment we do recommend that the individual services each be run in their own binary. There shouldn't be any risk to data integrity from running them together, but you may see some negative performance impact. That said, I haven't done much experimenting with this, and I'm interested in your experience if you try it and are willing to share!

To run the services in their own binaries even though you don't want to use Helm, you can generate the manifests with helm template, then pare them down to what you want and use those directly.
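
As a rough illustration of the helm template approach, the sketch below renders a local checkout of the chart into plain YAML files that you can then trim by hand. It assumes Helm v3 is installed; the chart directory, values file, and output directory are hypothetical paths that you supply, and "temporal" is just an arbitrary release name.

```shell
#!/usr/bin/env bash
# Sketch only: render the Temporal Helm chart to plain manifests without
# installing anything into the cluster.
render_temporal_manifests() {
  local chart_dir="$1"    # local checkout of the chart (the directory containing Chart.yaml)
  local values_file="$2"  # your own value overrides (hypothetical file)
  local out_dir="$3"      # directory that receives the rendered manifests

  # Download the subcharts the chart declares, so that rendering can succeed.
  helm dependency update "$chart_dir" || return 1

  # Render every template to files under out_dir instead of applying them.
  helm template temporal "$chart_dir" \
    --values "$values_file" \
    --output-dir "$out_dir"
}

# Example call (all paths are placeholders):
#   render_temporal_manifests ./helm-charts ./my-values.yaml ./temporal-manifests
```

The rendered files can be reviewed, reduced to the services you need, and applied with kubectl like any other manifest.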

One more comment on production in general: you will not want to use auto-setup; instead, use the non-auto images and the temporal-*-tool database tools we provide to create and update the schema.
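
For PostgreSQL, the schema steps look roughly like the sketch below, which mirrors (from memory) what the auto-setup script did around the 1.14 release. Treat it as an assumption-laden outline: flag placement has changed between releases, so check temporal-sql-tool --help for your version. The database names, the schema directory arguments, and the TEMPORAL_SQL_TOOL override variable are illustrative choices, and the password is expected in the SQL_PASSWORD environment variable.

```shell
#!/usr/bin/env bash
# Sketch only: create and migrate the main and visibility databases by hand.
# Export SQL_PASSWORD before calling, so the password stays off the command line.
setup_temporal_schema() {
  local tool="${TEMPORAL_SQL_TOOL:-temporal-sql-tool}"  # override hook for this script (assumption)
  local host="$1" port="$2" user="$3"
  local schema_dir="$4"      # e.g. <temporal checkout>/schema/postgresql/v96/temporal/versioned
  local vis_schema_dir="$5"  # e.g. <temporal checkout>/schema/postgresql/v96/visibility/versioned
  local db dir

  for db in temporal temporal_visibility; do
    if [ "$db" = "temporal" ]; then dir="$schema_dir"; else dir="$vis_schema_dir"; fi

    # 1. Create the database.
    "$tool" --plugin postgres --ep "$host" -p "$port" -u "$user" create --db "$db" || return 1
    # 2. Initialize the schema version table at version 0.0.
    "$tool" --plugin postgres --ep "$host" -p "$port" -u "$user" --db "$db" setup-schema -v 0.0 || return 1
    # 3. Apply all versioned migrations found in the schema directory.
    "$tool" --plugin postgres --ep "$host" -p "$port" -u "$user" --db "$db" update-schema -d "$dir" || return 1
  done
}

# Example call (host and paths are placeholders):
#   SQL_PASSWORD=... setup_temporal_schema db.example.internal 5432 postgres \
#     ./schema/postgresql/v96/temporal/versioned ./schema/postgresql/v96/visibility/versioned
```

On later upgrades only the update-schema step should need to be repeated, against the schema directory shipped with the new server version.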

I hope I can share our experience in the near future. Our system is still under development. Our product is a data analysis system that serves individual organizations, so the number of requests and workflows is not as large as on SaaS platforms. Ease of deployment, however, means a lot to us, which is why I chose this solution.