ResetStickyTaskQueue failed on Timed out Workflow

Hi,

We have a workflow A that is terminated due to being TimedOut instead of Completed.

However, it looks like workflow A is not evicted from the Worker’s StickyTaskQueue when it times out. Later, when the worker receives a new workflow task and evicts workflow A to make room, it logs the following error:

msg="ResetStickyTaskQueue failed" Error="workflow execution already completed"

Does anybody know how we can get rid of this error? Thanks very much.

We are using

  • go-sdk v1.13
  • go temporal API v1.7.0
  • temporal server 1.13.1

We have a workflow A that is terminated due to being TimedOut instead of Completed.

I assume your workflow times out because of the configured WorkflowExecution/Run timeout; let me know if that's the case.

workflow execution already completed

Check if you have pending activities when your workflow exec times out. This message can happen when your pending activity/activities complete (or need to be retried) after the workflow exec has already completed.

We are using go-sdk v1.13, go temporal API v1.7.0, temporal server 1.13.1

You are using a pretty old server version; I would suggest updating it. Are you using Temporal Go SDK 1.13? If so, I would look at updating that as well.

Could you share the workflow history for one of these timed-out workflow executions and the full worker error, please?

I assume your workflow times out because of the configured WorkflowExecution/Run timeout; let me know if that's the case.

Yes, it timed out because it reached the WorkflowExecutionTimedOut limit.

Check if you have pending activities when your workflow exec times out. This message can happen when your pending activity/activities complete (or need to be retried) after the workflow exec has already completed.

There are no pending activities. My workflow is just a for loop with a selector:

for {
	selector := workflow.NewSelector(ctx)
	shouldStop := false

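	selector.AddReceive(eventChannel, func(c workflow.ReceiveChannel, _ bool) {
		// Consume the event (payload type is a placeholder here),
		// then update shouldStop based on receivedEvent.
		var receivedEvent string
		c.Receive(ctx, &receivedEvent)
	})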

	selector.Select(ctx)

	if shouldStop {
		break
	}
}

The workflow cannot complete because the for loop never breaks.

Thanks a lot @tihomir

Hi @tihomir, do you have any idea why this happened? Is there a mismatch in the final status of the timed-out workflow between the worker and the Temporal server (the worker hasn’t marked the workflow as completed, but the server has)?
Thanks a lot.

This is a benign message. It indicates that the workflow was already closed when the worker reported to the server that it had been pushed out of the cache.

Thanks a lot @maxim

Yes… But I don’t understand why there is a mismatch in final status between the workers and the Temporal server. This happens mainly for workflows that timed out due to the WorkflowExecutionTimedOut config.

The workflow is a for loop that waits for incoming events until a condition is met to break out of the loop.
Here, it reached the timeout limit before receiving the right events, so it stopped.

The timeout happens on the server. The worker is not notified about the workflow timeout. So it reports the eviction as usual.
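
To make that sequence concrete, here is a small toy model in plain Go. It is only a sketch of the interaction described above, not the SDK's or the server's actual implementation: the `server` and `worker` types, the `timeOut`, `resetSticky` and `admit` methods, and the cache size of 1 are all hypothetical names and values made up for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// errAlreadyCompleted mirrors the error text quoted in this thread.
var errAlreadyCompleted = errors.New("workflow execution already completed")

// server is a hypothetical stand-in for the Temporal server. It only
// tracks which workflow executions are still open.
type server struct {
	open map[string]bool
}

// timeOut closes a workflow on the server side. Nothing here informs
// any worker, which is the key point of the model.
func (s *server) timeOut(id string) { s.open[id] = false }

// resetSticky models the call a worker makes when it evicts a workflow
// from its cache. It fails if the workflow is already closed.
func (s *server) resetSticky(id string) error {
	if !s.open[id] {
		return errAlreadyCompleted
	}
	return nil
}

// worker is a hypothetical stand-in for a worker with a bounded cache.
type worker struct {
	srv   *server
	cache []string // cached workflow IDs, oldest first
	limit int
	logs  []string
}

// admit caches a workflow. If the cache is full, it first evicts the
// oldest entry and reports the eviction to the server, logging (but
// otherwise ignoring) any error.
func (w *worker) admit(id string) {
	if len(w.cache) == w.limit {
		evicted := w.cache[0]
		w.cache = w.cache[1:]
		if err := w.srv.resetSticky(evicted); err != nil {
			w.logs = append(w.logs, fmt.Sprintf("ResetStickyTaskQueue failed: %v", err))
		}
	}
	w.cache = append(w.cache, id)
}

func main() {
	srv := &server{open: map[string]bool{"A": true, "B": true}}
	w := &worker{srv: srv, limit: 1}

	w.admit("A")     // the worker caches workflow A
	srv.timeOut("A") // the server times A out; the worker is not told
	w.admit("B")     // a new task arrives, so A is evicted and reported

	for _, line := range w.logs {
		fmt.Println(line)
	}
}
```

In this model the worker keeps working normally after the failed call, which matches the point that the message is benign: the eviction report is simply arriving for a workflow the server has already closed.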


Thanks a lot @maxim .