I didn’t set up the Activity Timeout, so it defaulted to 10 seconds. My Activity ran longer than that, and I got this error:
2022/07/09 15:50:13 INFO Task processing failed with error Namespace default TaskQueue DataTQ WorkerID Data Worker #9914@DeliciousBacon WorkerType WorkflowWorker Error Workflow task not found.
2022/07/09 15:50:29 ERROR Workflow panic Namespace default TaskQueue DataTQ WorkerID Data Worker #9914@DeliciousBacon WorkflowType DataWorkflow WorkflowID data#0 RunID f74db9ce-4964-46d1-9f9e-a2aaa3a348fd Attempt 2 Error Potential deadlock detected: workflow goroutine "root" didn't yield for over a second StackTrace process event for DataTQ [panic]:
go.temporal.io/sdk/internal.(*coroutineState).call(0xc00057f680, 0x2540be400)
/home/user/go/pkg/mod/go.temporal.io/sdk@v1.15.0/internal/internal_workflow.go:944 +0x1bd
go.temporal.io/sdk/internal.(*dispatcherImpl).ExecuteUntilAllBlocked(0xc000491620, 0x11211e0?)
/home/user/go/pkg/mod/go.temporal.io/sdk@v1.15.0/internal/internal_workflow.go:1033 +0x1a5
go.temporal.io/sdk/internal.executeDispatcher({0x1506b50, 0xc000491680}, {0x1507ef8, 0xc000491620}, 0x0?)
/home/user/go/pkg/mod/go.temporal.io/sdk@v1.15.0/internal/internal_workflow.go:606 +0x9f
go.temporal.io/sdk/internal.(*syncWorkflowDefinition).OnWorkflowTaskStarted(0xc0003fcea0?, 0xc00024a900?)
/home/user/go/pkg/mod/go.temporal.io/sdk@v1.15.0/internal/internal_workflow.go:576 +0x32
go.temporal.io/sdk/internal.(*workflowExecutionEventHandlerImpl).ProcessEvent(0xc00001cf78, 0xc00024ad80, 0x60?, 0x1)
/home/user/go/pkg/mod/go.temporal.io/sdk@v1.15.0/internal/internal_event_handlers.go:827 +0x203
go.temporal.io/sdk/internal.(*workflowExecutionContextImpl).ProcessWorkflowTask(0xc009420080, 0xc000488930)
/home/user/go/pkg/mod/go.temporal.io/sdk@v1.15.0/internal/internal_task_handlers.go:902 +0xd68
go.temporal.io/sdk/internal.(*workflowTaskHandlerImpl).ProcessWorkflowTask(0xc0003c8420, 0xc000488930, 0xc00010e300)
/home/user/go/pkg/mod/go.temporal.io/sdk@v1.15.0/internal/internal_task_handlers.go:749 +0x485
go.temporal.io/sdk/internal.(*workflowTaskPoller).processWorkflowTask(0xc0003c4a90, 0xc000488930)
/home/user/go/pkg/mod/go.temporal.io/sdk@v1.15.0/internal/internal_task_pollers.go:284 +0x2cd
go.temporal.io/sdk/internal.(*workflowTaskPoller).ProcessTask(0xc0003c4a90, {0x10ecdc0?, 0xc000488930?})
/home/user/go/pkg/mod/go.temporal.io/sdk@v1.15.0/internal/internal_task_pollers.go:255 +0x6c
go.temporal.io/sdk/internal.(*baseWorker).processTask(0xc0000d9040, {0x10ec980?, 0xc000202180})
/home/user/go/pkg/mod/go.temporal.io/sdk@v1.15.0/internal/internal_worker_base.go:400 +0x167
created by go.temporal.io/sdk/internal.(*baseWorker).runTaskDispatcher
/home/user/go/pkg/mod/go.temporal.io/sdk@v1.15.0/internal/internal_worker_base.go:305 +0xb5
2022/07/09 15:50:29 WARN Failed to process workflow task. Namespace default TaskQueue DataTQ WorkerID Data Worker #9914@DeliciousBacon WorkflowType DataWorkflow WorkflowID data#0 RunID f74db9ce-4964-46d1-9f9e-a2aaa3a348fd Attempt 2 Error Potential deadlock detected: workflow goroutine "root" didn't yield for over a second
After I saw “Potential deadlock detected”, I stopped the worker, set a proper timeout via workflow.WithActivityOptions, and restarted the worker, but now I get:
2022/07/09 15:57:43 DEBUG ExecuteActivity Namespace default TaskQueue DataTQ WorkerID Data Worker #12166@DeliciousBacon WorkflowType ProcessDataWorkflow WorkflowID data#0 RunID f74db9ce-4964-46d1-9f9e-a2aaa3a348fd Attempt 1 ActivityID 59 ActivityType BuildSaveDataActivity
2022/07/09 15:57:43 DEBUG Cached state staled, new task has unexpected events Namespace default TaskQueue DataTQ WorkerID Data Worker #12166@DeliciousBacon WorkflowID data#0 RunID f74db9ce-4964-46d1-9f9e-a2aaa3a348fd Attempt 22 CachedPreviousStartedEventID 57 TaskFirstEventID 1 TaskStartedEventID 57 PreviousStartedEventID 45
2022/07/09 15:57:43 INFO Task processing failed with error Namespace default TaskQueue DataTQ WorkerID Data Worker #12166@DeliciousBacon WorkerType WorkflowWorker Error Workflow task not found.
It keeps retrying and increasing the Attempt count, but it fails to recover. Is there any way to recover the Workflow from this?
Notes:
- workflow.ActivityOptions{StartToCloseTimeout: time.Hour * 2562046}
- worker.Options{DeadlockDetectionTimeout: time.Second * 60}
- go.temporal.io/api v1.8.0
- go.temporal.io/sdk v1.15.0