Setting up a server cluster

Hi,

I’d like to set up a cluster of two Temporal server applications.
By one “server” I mean a single application that runs the frontend, matching, history, and worker services. (I know it’s best practice to run each service as a separate application, but for now this is what our limitations allow.)

We’ll also have two workers. These workers should connect to the cluster with client-side load balancing. If I understand correctly, putting both “servers” behind the same DNS name should be enough.

I have a few questions/doubts about this setup:
1, How can I configure the two “servers” to see and connect to each other, so that they form a cluster? It might be a trivial question, but I haven’t found a good example.

2, Is it possible for the two “servers” to use the same databases (default and visibility)? I know it’s not ideal, but for the time being we use Postgres.

3, In our workflows we use activities that send RabbitMQ messages, call doNotCompleteOnReturn(), and are completed once the response arrives in another RabbitMQ message. My fear is that RabbitMQ also does a form of “load balancing”, so the worker instance that starts the activity and the instance that tries to complete it might not be the same. Is this going to cause any issues?
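For reference, the pattern in question looks roughly like this (a simplified sketch of the Java SDK async-completion flow; the class name, `publishToRabbit` helper, and payload handling are illustrative, not our real code):

```java
import io.temporal.activity.Activity;
import io.temporal.activity.ActivityExecutionContext;
import io.temporal.client.ActivityCompletionClient;
import io.temporal.client.WorkflowClient;

public class AsyncActivityImpl {

    // Runs on whichever worker polled the activity task.
    public String sendRequest(String payload) {
        ActivityExecutionContext ctx = Activity.getExecutionContext();
        // The task token uniquely identifies this activity execution;
        // it is what allows completion from somewhere else later.
        byte[] taskToken = ctx.getTaskToken();
        publishToRabbit(payload, taskToken); // hypothetical helper
        // Tell the SDK not to complete the activity when this method returns.
        ctx.doNotCompleteOnReturn();
        return null; // return value is ignored after doNotCompleteOnReturn()
    }

    // Runs in the RabbitMQ response consumer; the SDK only needs a
    // WorkflowClient and the task token to complete the activity.
    public void onRabbitResponse(WorkflowClient client,
                                 byte[] taskToken, String response) {
        ActivityCompletionClient completion = client.newActivityCompletionClient();
        completion.complete(taskToken, response);
    }

    private void publishToRabbit(String payload, byte[] taskToken) {
        // hypothetical: publish payload + token to the request queue
    }
}
```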

Thanks

1, Yes, in fact this is my current deployment. You can see below the output of `tctl adm cl d` (cluster describe); each component (frontend, matching, …) has 2 members behind it.

❯ tctl  adm cl d
{
  "supportedClients": {
    "temporal-cli": "\u003c2.0.0",
    "temporal-go": "\u003c2.0.0",
    "temporal-java": "\u003c2.0.0",
    "temporal-server": "\u003c2.0.0"
  },
  "serverVersion": "1.9.0",
  "membershipInfo": {
    "currentHost": {
      "identity": "192.168.44.10:7233"
    },
    "reachableMembers": [
      "192.168.44.10:6933",
      "192.168.127.18:6933",
      "192.168.44.10:6934",
      "192.168.127.18:6934",
      "192.168.127.18:6939",
      "192.168.44.10:6935",
      "192.168.127.18:6935",
      "192.168.44.10:6939"
    ],
    "rings": [
      {
        "role": "frontend",
        "memberCount": 2,
        "members": [
          {
            "identity": "192.168.44.10:7233"
          },
          {
            "identity": "192.168.127.18:7233"
          }
        ]
      },
      {
        "role": "history",
        "memberCount": 2,
        "members": [
          {
            "identity": "192.168.127.18:7234"
          },
          {
            "identity": "192.168.44.10:7234"
          }
        ]
      },
      {
        "role": "matching",
        "memberCount": 2,
        "members": [
          {
            "identity": "192.168.127.18:7235"
          },
          {
            "identity": "192.168.44.10:7235"
          }
        ]
      },
      {
        "role": "worker",
        "memberCount": 2,
        "members": [
          {
            "identity": "192.168.44.10:7239"
          },
          {
            "identity": "192.168.127.18:7239"
          }
        ]
      }
    ]
  }
}

2, Yes, using the same database is a must; the nodes discover each other through the cluster-membership table in the default store. I’m using MySQL.

Thanks for the reply! What does your config YAML look like? Where do you specify the list of servers? We can put the two server instances behind a load balancer, but I guess that’s not enough for them to connect to each other. Or is it?

Here is the YAML file for Node A.
Node B has exactly the same configuration file, except that the IP address inside (192.168.127.18) is replaced with Node B’s own IP.

There is a load balancer in front of these 2 nodes, balancing port 7233 across Node A and Node B.

persistence:
  defaultStore: mysql-default
  visibilityStore: mysql-visibility
  advancedVisibilityStore: es-visibility
  numHistoryShards: 512
  datastores:
    mysql-default:
      sql:
        pluginName: "mysql"
        databaseName: "temporal"
        connectAddr: "blabla:3306"
        connectProtocol: "tcp"
        user: "temporal"
        password: "blabla"
        maxConns: 20
        maxIdleConns: 20
        maxConnLifetime: "1h"
    mysql-visibility:
      sql:
        pluginName: "mysql"
        databaseName: "temporal_visibility"
        connectAddr: "blabla:3306"
        connectProtocol: "tcp"
        user: "temporal"
        password: "blabla"
        maxConns: 2
        maxIdleConns: 2
        maxConnLifetime: "1h"
    es-visibility:
      elasticsearch:
        version: "v7"
        logLevel: "info"
        url:
          scheme: "http"
          host: "blabla:9200"
        indices:
          visibility: "temporal-visibility-dev"

global:
  membership:
    maxJoinDuration: 30s
    broadcastAddress: "192.168.127.18"
  pprof:
    port: 7936
  metrics:
    prometheus:
       timerType: "histogram"
       listenAddress: "192.168.127.18:8000"

services:
  frontend:
    rpc:
      grpcPort: 7233
      membershipPort: 6933
      bindOnIP: "192.168.127.18"

  matching:
    rpc:
      grpcPort: 7235
      membershipPort: 6935
      bindOnIP: "192.168.127.18"

  history:
    rpc:
      grpcPort: 7234
      membershipPort: 6934
      bindOnIP: "192.168.127.18"

  worker:
    rpc:
      grpcPort: 7239
      membershipPort: 6939
      bindOnIP: "192.168.127.18"

clusterMetadata:
  enableGlobalNamespace: false
  failoverVersionIncrement: 10
  masterClusterName: "active"
  currentClusterName: "active"
  clusterInformation:
    active:
      enabled: true
      initialFailoverVersion: 1
      rpcName: "frontend"
      rpcAddress: "192.168.127.18:7233"

dcRedirectionPolicy:
  policy: "noop"
  toDC: ""

archival:
  history:
    state: "enabled"
    enableRead: true
    provider:
      filestore:
        fileMode: "0666"
        dirMode: "0766"
      gstorage:
        credentialsPath: "/tmp/gcloud/keyfile.json"
  visibility:
    state: "enabled"
    enableRead: true
    provider:
      filestore:
        fileMode: "0666"
        dirMode: "0766"

namespaceDefaults:
  archival:
    history:
      state: "disabled"
      URI: "file:///tmp/temporal_archival/development"
    visibility:
      state: "disabled"
      URI: "file:///tmp/temporal_vis_archival/development"



publicClient:
  hostPort: "192.168.127.18:7233"

dynamicConfigClient:
  filepath: "./config/dynamicconfig/development_es.yaml"
  pollInterval: "10s"
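For completeness, these are the only fields in the file that carry the node’s own IP and therefore differ between the two nodes (a sketch assuming Node B’s address is 192.168.44.10, the other address visible in the tctl output above; `bindOnIP` changes in the same way under all four services):

```yaml
global:
  membership:
    broadcastAddress: "192.168.44.10"
  metrics:
    prometheus:
      listenAddress: "192.168.44.10:8000"

services:
  frontend:          # same change under matching, history, worker
    rpc:
      bindOnIP: "192.168.44.10"

clusterMetadata:
  clusterInformation:
    active:
      rpcAddress: "192.168.44.10:7233"

publicClient:
  hostPort: "192.168.44.10:7233"
```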