Hi Temporal Community,
I am planning a production Temporal setup in one region and I’d appreciate some guidance on service discovery best practices - specifically around SQL-based persistence membership vs using load balancers or DNS.
Deployment overview:
- I am designing for 4 separate fleets, one for each core Temporal microservice: Frontend, History, Matching and Worker.
- Each fleet spans multiple Availability Domains (ADs) within the region, and each AD contains a group of hosts running that service (e.g., the “history fleet” runs only History nodes).
I have come across the SQL-based persistence membership feature, which replaces ringpop (which is being deprecated) for service discovery.
Questions:
- When using SQL-based membership across multiple fleets, is it still necessary to use load balancers for service discovery and communication, or can services rely solely on the membership DB?
- How does Temporal handle failover or service discovery if the membership DB becomes temporarily unavailable?
- What is the recommended approach for configuring service endpoints in config_template.yaml in a multi-fleet environment using SQL membership? Specifically, what should go in the rpcAddress fields for each service: IPs, DNS names, or LB endpoints?
- More generally, which method (SQL-based membership or load balancers) does the Temporal team recommend for multi-fleet or multi-region deployments?
- Are there official docs or best-practice guides you can point me to for designing multi-fleet Temporal clusters with SQL membership?
Thanks