Hello, I’m looking to run Temporal in a production-ready way.
What we’ve done
In our initial experiment, we run Temporal using docker-compose, with each major component in its own container.
What we want
Running on a single host, things are working fine. Now we are starting to think about HA. Ideally, we want the following scenario: suppose the active Temporal server is running on host A and some workflows are in progress. Host A is then shut down. The standby Temporal server on host B picks up the unfinished workflows and resumes them seamlessly. For reasons of our own, we don’t want to run Temporal on top of K8s.
What we think
To accomplish this goal, we think a few things need to be ensured:
- We need to replicate the state of the temporal-postgres container. Containers running on different hosts should have eventually consistent copies of the data.
- We deploy multiple copies of the Temporal services on different hosts. Those copies have to be aware of each other and establish leadership and membership so that only one of them is active at a time ==> Does Temporal provide this out of the box?
Please advise whether we are on the right track and whether there is anything we need to be mindful of. Also, does Temporal provide any documentation we can follow for this kind of scaled deployment?
I know I’ve asked a lot of questions, but I would really appreciate your help.