We are considering a Temporal setup where we have a dedicated task queue per worker.
Is there a way to detect that a certain worker has died and reschedule the workflows that were executing on it to a different task queue?
When rescheduling, we need the workflows to start from the first activity (even if some activities have already been executed on the dead worker).
Is there an example or documentation for such a setup that I can check?
Feel free to suggest a recommendation for achieving this setup.
Can you explain your use case for this type of need? With Temporal, a workflow execution is not tied to a specific worker. Typically you want multiple worker processes (each with your workflow implementation registered) polling the same task queue in order to achieve high availability.
If your workflow execution started on worker 1 and worker 1 crashes, it can continue on worker 2, which is listening on the same task queue.
Yes, it's about running a workflow's activities on the same worker, but we had issues with sessions when we deployed new versions of the worker.
Graceful shutdown doesn't wait for the whole session to finish, which resulted in unexpected behavior: some activities were canceled instead of being retried or continued on the same worker (after deploying the new version of the worker).
Does setting a custom identity for the worker deployed on the same machine help? (We only deploy one worker per machine.)
we had issues with sessions when we deployed new versions of the worker
What’s the Go SDK version you are using?
Can you give more info on the issue you ran into? Can you see if it might be related to this open issue? If not, how hard would it be to create a reproducible test for it?