Proper technique for worker resilience when the Temporal Server is itself unavailable

expertonium · October 25, 2023, 2:51pm

I've noticed that my workers, written with the Go SDK, exit if they can’t dial the Temporal Server. In Docker terms, the worker container ends up in the Exited state.

My main technique for worker resilience, then, is a restart policy on the worker container. The Temporal Server comes up asynchronously and, once it does, the restarted worker dials successfully and workflows resume.

I wonder if I’m missing something about the worker connection options in the Go SDK. Is there a built-in way to keep the worker alive for some period of time when a dial to the Temporal Server fails? The KeepAlive settings in ConnectionOptions, if they are even relevant here, admittedly confuse me.

expertonium · October 25, 2023, 4:25pm

For the record, I’m now padding the startup inside the worker as follows. The worker will still die and be restarted eventually, but this example gives it roughly 2 minutes of dial attempts (24 attempts at 5-second intervals).

Still wondering if I’m off-base. Cheers.

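package main

import (
	"log"
	"time"

	"go.temporal.io/sdk/client"
)

func main() {
	// Up to 24 dial attempts, 5 seconds apart (roughly 2 minutes in total).
	c, err := temporalClientRetry("127.0.0.1:7233", 24, 5*time.Second)
	if err != nil {
		log.Fatalln("unable to create Temporal client", err)
	}
	defer c.Close()

	// other code...
}

// temporalClientRetry dials the Temporal Server, sleeping dur between failed
// attempts. It returns the last dial error once the attempts are used up.
func temporalClientRetry(hostPort string, attempts int, dur time.Duration) (client.Client, error) {
	for {
		c, err := client.Dial(client.Options{
			HostPort: hostPort,
		})
		if err == nil {
			return c, nil
		}
		// Out of attempts: give up and let the caller decide (here, exit).
		if attempts--; attempts <= 0 {
			return nil, err
		}
		log.Printf("temporal client failed, attempts remaining %d", attempts)
		time.Sleep(dur)
	}
}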
