Communication between multiple instances of Temporal Server

Is there any requirement that multiple instances of Temporal need to communicate with each other, or are they completely standalone?


Can you clarify what you mean by “communicated”? In general, the Temporal server can scale horizontally when needed.

Temporal uses various sharding techniques to ensure high scalability, which requires point-to-point communication between the different nodes that comprise the service.

My question is: when we scale the Temporal server horizontally, does it require point-to-point communication between different instances of the server? I understand that the different roles (frontend, history, matching, etc.) require that communication; I'm asking about the case where we deploy the Temporal server (frontend + history + matching, etc.) as a single deployment/container.


I think I don't understand your question. What is a “server” in your terminology?

I assumed that the Temporal service (cluster) is composed of multiple servers, but it looks like you are asking about something else.

Sorry for the confusion.
In my terminology, a server is a single container in which the different Temporal services (frontend, history, matching, etc.) are running and only the frontend is exposed (similar to the temporal container mentioned in the docker-compose file).

My questions are:

  1. Let's say I bring up two such containers. Do they need to communicate?
  2. Is that a valid deployment strategy, or have I got it completely wrong?

I see. Thanks for the clarification.

  1. Yes, they need to communicate. From the point of view of the Temporal services, each process is independent and is part of the larger cluster; packaging some of them into a single container doesn't change that.
  2. For production deployments we recommend running each role separately (a docker-compose sketch follows below). The roles have different scale-out requirements, and troubleshooting is also simpler.
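
A minimal docker-compose sketch of running each role in its own container. The image name, the SERVICES environment variable, and the omitted persistence settings are assumptions used here for illustration only; check the official Temporal docker setup for the exact values.

```yaml
# Hypothetical sketch: image name, SERVICES variable, and persistence settings
# are assumptions; consult the official Temporal docker setup for exact values.
version: "3.5"
services:
  temporal-frontend:
    image: temporalio/server:latest   # assumed image name
    environment:
      - SERVICES=frontend             # run only the frontend role in this container
    ports:
      - "7233:7233"                   # frontend gRPC port exposed to clients
  temporal-history:
    image: temporalio/server:latest
    environment:
      - SERVICES=history              # history role in its own container
  temporal-matching:
    image: temporalio/server:latest
    environment:
      - SERVICES=matching             # matching role in its own container
  temporal-worker:
    image: temporalio/server:latest
    environment:
      - SERVICES=worker               # worker role in its own container
```

On a shared compose network the containers can reach each other's gRPC and membership ports directly; database configuration is omitted for brevity.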

Thanks @maxim. This is helpful.
One follow-up question: is communication required only between the frontend services, or between all of the services (history, matching, worker, etc.)? I'm asking because, depending on the answer, we may need to expose multiple ports from that single container.

Small suggestion: it would be very helpful to have a document that explains the role of each service, how the services communicate internally, and how to configure that. Pardon me if one already exists.


Communication is required between all the services. And if you are running multiple roles in one container, then multiple ports need to be open.

Unfortunately such documentation doesn’t exist yet. We are working on it.
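
To make "multiple ports" concrete, a hedged sketch of the port mappings an all-in-one container would need; the specific gRPC and membership port numbers are the ones broken down per role in the reply below, and the compose syntax is illustrative only.

```yaml
# Illustrative only: exposing both the gRPC and membership ports of every role
# from a single multi-role container.
services:
  temporal:
    image: temporalio/server:latest   # assumed image name
    ports:
      - "7233:7233"   # frontend gRPC
      - "7234:7234"   # history gRPC
      - "7235:7235"   # matching gRPC
      - "7239:7239"   # worker gRPC
      - "6933:6933"   # frontend membership
      - "6934:6934"   # history membership
      - "6935:6935"   # matching membership
      - "6939:6939"   # worker membership
```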


Thanks for confirming.

  1. There are around 12 ports exposed in the Dockerfile. Do we need to expose all of them, or is it sufficient to expose only the gRPC/membership ports?
  2. Which ports need to be exposed for cross-DC replication?

The Temporal server consists of the following four roles; the ports each role needs for communication are listed below, and a config sketch follows the list.

  1. Frontend: this role is responsible for all inbound API calls, including the cross-DC APIs invoked by a remote cluster. It uses gRPC port 7233 to host the service handler and port 6933 for membership-related communication with other hosts.
  2. History: internal Temporal role that manages workflow state transitions. It uses gRPC port 7234 to host the service handler and port 6934 for membership-related communication.
  3. Matching: internal Temporal role responsible for hosting TaskQueues for task dispatch. It uses gRPC port 7235 to host the service handler and port 6935 for membership-related communication.
  4. Worker: internal Temporal role that runs background processing for the replication queue, the Kafka visibility processor, and system workflows. It uses gRPC port 7239 to host the service handler and port 6939 for membership-related communication.
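
A sketch of how these ports map onto the server's static YAML configuration, following the style of config/development.yaml in the server repository; exact key names may vary by server version, so treat this as an assumption to verify against your own config.

```yaml
# Sketch of the per-role port configuration in the server's static config.
# Key names follow config/development.yaml conventions and may differ by version.
services:
  frontend:
    rpc:
      grpcPort: 7233        # client-facing and cross-DC API handler
      membershipPort: 6933  # membership communication with other hosts
  history:
    rpc:
      grpcPort: 7234
      membershipPort: 6934
  matching:
    rpc:
      grpcPort: 7235
      membershipPort: 6935
  worker:
    rpc:
      grpcPort: 7239
      membershipPort: 6939
```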

Looks like the following 4 ports in our Dockerfile are unnecessary and probably leftover from Cadence:

7933
7934
7935
7939

Thanks @samar for your detailed explanation.

@samar, can you point me to a document on how to set up a cross-DC fault-tolerant deployment (Helm version)?

There is no Helm chart for cross-DC yet.

Will there be one at the time of the official release, @maxim?
Even the Helm repository has a disclaimer that the charts are for demo use only. Is there any plan to add an official production-ready Helm chart (it need not support cross-DC, though)?

Hey @madhu,
It's not even clear what it means to have high-availability support for Temporal via Helm. If you have specific requirements, please file a proposal and we can evaluate adding that support to the Helm chart in the future.
We do plan to support production-ready Helm charts for Temporal, without any cross-DC support, with our V1 release.


By HA, all I am looking for is multiple instances/copies of worker/server/web within a DC/cluster, and whether any additional configuration (ports, protocols, etc.) is required to enable that. It looks like that is not the case.

In terms of Helm, all I was looking for is a production-ready Helm chart. Thanks for confirming that such a chart will be available around GA.

Hey @madhu,
Based on your definition of HA, it is already supported by our existing Helm chart. You can run any number of instances of the various roles (frontend/history/matching/worker, etc.) within a cluster for redundancy, and it should not require any additional configuration.
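
For reference, a hedged sketch of what scaling the roles through Helm values might look like; the key names here are assumptions, so verify them against the chart's values.yaml for your chart version.

```yaml
# Hypothetical values override: key names are assumptions, check the chart's
# values.yaml before use.
server:
  replicaCount: 3   # run multiple replicas of each server role for redundancy
web:
  replicaCount: 2   # multiple copies of the Temporal Web UI
```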

Thanks @samar.

Apart from
a) multiple instances/copies of server, web, etc. within a cluster, and
b) a cross-DC, cross-cluster setup for redundancy,

do you think there could be a third alternative, and if so, under which circumstances would it make sense?

@samar, greetings.
Correct me if I am wrong on the two points below:
1: When using the k8s approach, can HA be assumed to be managed by the API server?
2: For a setup in which I have Temporal k8s components in two regions (cross-DC, say on AWS), is it possible to switch all of my connections from the databases in region 1 to region 2 without downtime during an unexpected shutdown in region 1?