Guidance on using SQL-based membership vs load balancers for a multi-fleet Temporal deployment

Hi Temporal Community,

I am planning a production Temporal setup in a single region and would appreciate some guidance on service discovery best practices, specifically SQL-based persistence membership versus load balancers or DNS.

Deployment overview:

  • I am designing for 4 separate fleets, one for each core Temporal microservice: Frontend, History, Matching and Worker.
  • Each fleet spans multiple Availability Domains (ADs) within the region, and each AD contains a group of hosts running only that service (e.g., the “history fleet” runs only History nodes).

I have come across the SQL-based persistence membership feature, which replaces Ringpop (which is being deprecated) for service discovery.

Questions:

  1. When using SQL-based membership across multiple fleets, is it still necessary to use load balancers for service discovery and communication, or can services rely solely on the membership DB?
  2. How does Temporal handle failover or service discovery if the membership DB becomes temporarily unavailable?
  3. What is the recommended approach for configuring service endpoints in config_template.yaml in a multi-fleet environment using SQL membership? Specifically, what should go in the rpcAddress fields for each service: IPs, DNS names, or LB endpoints?
  4. More generally, which method, SQL-based membership or load balancers, does the Temporal team recommend for multi-fleet or multi-region deployments?
  5. Are there official docs for best practices you can point me to for designing multi-fleet Temporal clusters with SQL membership?

Thanks

@maxim / @tihomir any chance you could help me here? Thanks!