Multi-datacentre issue

We are running two Temporal clusters in separate datacentres, each backed by its own Postgres database. Both Temporal and the consumer application run in OpenShift. This setup lets us perform blue/green deployments and zero-downtime upgrades.

User and webhook traffic is load balanced between our two datacentres. This makes resubmitting or resuming workflows unreliable, because a request may be routed to a different DC from the one where the workflow originated.

We’re trying to figure out how to deal with this situation without introducing a single point of failure by having a single Temporal cluster. Any suggestions would be appreciated.

You cannot have a workflow active in two or more clusters at the same time.
To run a single Temporal cluster across both datacentres, you would need a fully consistent DB, which means fully consistent replication between the DCs.
I would look at the multi-cluster replication docs.
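
To illustrate the first point: since a workflow lives in exactly one cluster, whichever DC receives a resume request has to forward it to the cluster where the workflow was started. Below is a minimal sketch of that routing, assuming the originating DC is embedded as a prefix of the workflow ID. The names `CLUSTER_ADDRESSES`, `make_workflow_id` and `cluster_for_workflow`, and the addresses themselves, are hypothetical and not part of Temporal's API.

```python
# Hypothetical mapping from datacentre name to that DC's Temporal frontend address.
CLUSTER_ADDRESSES = {
    "dc1": "temporal.dc1.example.internal:7233",
    "dc2": "temporal.dc2.example.internal:7233",
}


def make_workflow_id(origin_dc: str, business_id: str) -> str:
    """Embed the originating DC in the workflow ID when the workflow is started."""
    if origin_dc not in CLUSTER_ADDRESSES:
        raise ValueError(f"unknown datacentre: {origin_dc!r}")
    return f"{origin_dc}:{business_id}"


def cluster_for_workflow(workflow_id: str) -> str:
    """Return the frontend address of the cluster that owns this workflow."""
    origin_dc, separator, _ = workflow_id.partition(":")
    if not separator or origin_dc not in CLUSTER_ADDRESSES:
        raise ValueError(f"workflow ID has no known DC prefix: {workflow_id!r}")
    return CLUSTER_ADDRESSES[origin_dc]


if __name__ == "__main__":
    # A workflow started in dc2 is always resumed against dc2's cluster,
    # regardless of which DC the load balancer picked for the resume request.
    wf_id = make_workflow_id("dc2", "order-42")
    print(wf_id, "->", cluster_for_workflow(wf_id))
```

This only addresses routing and assumes each DC can reach the other's Temporal frontend. If the originating DC is down, its workflows stay unavailable, which is the gap multi-cluster replication is meant to cover.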