UpsertSearchAttributes does not update attribute if workflow panics immediately afterwards

Hi,

I have a situation where my workflow panics immediately after a call to UpsertSearchAttributes. The panic is related to my workflow code and not to UpsertSearchAttributes itself.

I see that in this case, the search attribute doesn’t get updated to the latest value.

I looked at the Go sample code for search attributes (samples-go/searchattributes_workflow.go at main · temporalio/samples-go · GitHub). It has an explicit sleep on line 95 with a comment that seems to indicate the sleep is necessary for the upsert to be sent to the server and for the Elasticsearch index to be updated.

Could somebody shed more light on this? Is a sleep required every time UpsertSearchAttributes is called? If so, what should the duration of the sleep be?

When a workflow panics, its progress is blocked, so any commands it emitted during the workflow task that caused the panic are ignored. Fix the panic and the workflow will continue executing.

Hi @maxim,

Sorry, I should have mentioned that the panic is something we will fix. I was mostly trying to understand why adding a sleep before the point of panic makes everything work correctly (I tried this and the upserted attribute does contain the latest value) whereas without the sleep, it doesn’t. Is there some sort of buffering/queueing that happens which is somehow flushed completely when there is a sleep?

It does not have to be workflow.Sleep; it could be an activity or child workflow invocation, etc. Anything that completes the workflow task sends the accumulated commands to the server. In your case the workflow task never completes (because of the panic), so the commands (including the one to upsert the search attribute) are never sent.

Just as an example, let’s say that your workflow code executes an activity and sleeps for 2 seconds:

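func MyWorkflow(ctx workflow.Context) (string, error) {
	// Activities need a timeout; one minute is an arbitrary value for this example.
	ctx = workflow.WithActivityOptions(ctx, workflow.ActivityOptions{StartToCloseTimeout: time.Minute})
	var a *Activities
	var result string
	// Blocking on Get completes the current workflow task and sends the command to schedule the activity.
	if err := workflow.ExecuteActivity(ctx, a.doXYZ).Get(ctx, &result); err != nil {
		return "", err
	}
	// Sleep completes the next workflow task and sends the command to start a timer.
	err := workflow.Sleep(ctx, 2*time.Second)
	return result, err
}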

In your workflow history you should see events:

  1. WorkflowExecutionStarted (client has started invocation for this workflow)
  2. WorkflowTaskScheduled (exec of workflow method scheduled)
  3. WorkflowTaskStarted (exec of workflow method started)
  4. WorkflowTaskCompleted (workflow task completed; we are asking the server to schedule our doXYZ activity invocation)
  5. ActivityTaskScheduled (the result of the command sent to the server; note there could be multiple commands sent back depending on your workflow code)
  6. ActivityTaskStarted (execution of our activity has started)
  7. ActivityTaskCompleted (execution of activity has completed, it also includes activity result which is applied to our “result” var)
  8. WorkflowTaskScheduled (we are continuing our workflow method execution)
  9. WorkflowTaskStarted
  10. WorkflowTaskCompleted (workflow task completed because we need the server to create our timer for workflow.Sleep)
  11. TimerStarted (again the result of a command, this time to start our timer)
  12. TimerFired (timer fired; we can continue our workflow execution, so a new workflow task is created)
  13. WorkflowTaskScheduled
  14. WorkflowTaskStarted (we continue our workflow method execution, there is just the return left)
  15. WorkflowTaskCompleted
  16. WorkflowExecutionCompleted (workflow execution completed)

In your case, because of the panic, the workflow task never completes, so the upsert command is never sent back to the server. Once the panic is fixed, the workflow task can complete and the commands can be sent, which is what Maxim mentioned in his previous response.

So looking at the workflow history, the events that follow each “WorkflowTaskCompleted” show the commands (if any) that were sent when that task completed.
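To make the buffering idea concrete, here is a small toy model in plain Go. It does not use the Temporal SDK, and the Server, Task, Emit and RunTask names are made up for this illustration. It only assumes what was described above: commands accumulate during a workflow task and reach the server only when that task completes, so a panic in the same task drops them.

```go
package main

import "fmt"

// Server is a hypothetical stand-in for the Temporal service. It only
// records the commands handed over by completed workflow tasks.
type Server struct {
	Applied []string
}

// Task models a single workflow task. Commands are buffered locally.
type Task struct {
	buffered []string
}

// Emit buffers a command. Nothing reaches the server at this point.
func (t *Task) Emit(cmd string) {
	t.buffered = append(t.buffered, cmd)
}

// RunTask runs step as one workflow task. The buffered commands are
// delivered only if step returns normally. If step panics, the task
// fails and its buffer is discarded.
func (s *Server) RunTask(step func(t *Task)) (completed bool) {
	t := &Task{}
	defer func() {
		if r := recover(); r != nil {
			completed = false // task failed, nothing was sent
		}
	}()
	step(t)
	s.Applied = append(s.Applied, t.buffered...)
	return true
}

// PanicRightAfterUpsert models the situation in the question: the upsert
// and the panic happen inside the same workflow task.
func PanicRightAfterUpsert() []string {
	s := &Server{}
	s.RunTask(func(t *Task) {
		t.Emit("UpsertSearchAttributes")
		panic("bug in workflow code")
	})
	return s.Applied
}

// BlockBeforePanic models adding a sleep: starting the timer ends the
// first task, so the upsert is delivered before the next task panics.
func BlockBeforePanic() []string {
	s := &Server{}
	s.RunTask(func(t *Task) {
		t.Emit("UpsertSearchAttributes")
		t.Emit("StartTimer") // the sleep: the first task completes here
	})
	s.RunTask(func(t *Task) {
		panic("bug in workflow code")
	})
	return s.Applied
}

func main() {
	fmt.Println(PanicRightAfterUpsert()) // prints []
	fmt.Println(BlockBeforePanic())      // prints [UpsertSearchAttributes StartTimer]
}
```

In this model the length of the sleep does not matter. What matters is that something ends the workflow task before the panic happens.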

Hope this helps.

Thank you so much @tihomir. That was really detailed and helpful.