Session retry on worker restart

We have a workflow that uses a session.
When the worker restarts for any reason (a new deployment, for example), we need the whole workflow to be retried on any available worker other than the one that died or restarted.

Instead, we see the workflow is canceled.
I could reproduce the issue with the FileProcessing example, which I’ve forked here with only one change: it runs 1 iteration instead of 5, to make the issue simpler to find.

I ran the sample, restarted the worker while the workflow was in progress, and got the errors below when the worker started again.

What should be the correct configuration to let the session be retried on any available worker when the original worker dies/restarts?

```
16:8:58 app         | 2022/10/20 16:08:58 INFO  Starting session worker Namespace default TaskQueue fileprocessing WorkerID 67789@Ahmeds-MacBook-Pro.local@
16:8:58 app         | 2022/10/20 16:08:58 INFO  Started Worker Namespace default TaskQueue fileprocessing WorkerID 67789@Ahmeds-MacBook-Pro.local@
16:9:13 app         | 2022/10/20 16:09:13 DEBUG Session failed Namespace default TaskQueue fileprocessing WorkerID 67789@Ahmeds-MacBook-Pro.local@ WorkflowType SampleFileProcessingWorkflow WorkflowID fileprocessing_cb880727-8112-492e-b998-b76eaaa504ee RunID 8d5159a0-1f1f-413a-bfc3-b362d3233891 Attempt 1 sessionID f70a2544-1e3b-42dc-89fe-447e9027a759 Error activity error (type: internalSessionCreationActivity, scheduledEventID: 6, startedEventID: 18, identity: ): activity Heartbeat timeout (type: Heartbeat)
2022/10/20 16:09:13 DEBUG RequestCancelActivity Namespace default TaskQueue fileprocessing WorkerID 67789@Ahmeds-MacBook-Pro.local@ WorkflowType SampleFileProcessingWorkflow WorkflowID fileprocessing_cb880727-8112-492e-b998-b76eaaa504ee RunID 8d5159a0-1f1f-413a-bfc3-b362d3233891 Attempt 1 ActivityID 17
2022/10/20 16:09:13 ERROR Workflow failed. Namespace default TaskQueue fileprocessing WorkerID 67789@Ahmeds-MacBook-Pro.local@ WorkflowType SampleFileProcessingWorkflow WorkflowID fileprocessing_cb880727-8112-492e-b998-b76eaaa504ee RunID 8d5159a0-1f1f-413a-bfc3-b362d3233891 Attempt 1 Error canceled
```

I believe there is a related feature request open here to add the ability to keep a session open on worker restart.

I haven’t had the chance to run your sample yet, but I will and then report back.