I have a Temporal Go application/REST API that kicks off a workflow by calling SignalWithStartWorkflow when an endpoint is hit. The reason for using this instead of ExecuteWorkflow is to ensure idempotency: we don’t want another workflow to be triggered, but rather to return the result from the previous workflow, or, in case the workflow is halted (waiting for a signal), to resume it.
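For reference, this is roughly what the handler side does. It is a trimmed-down sketch, not our real code: OrderWorkflow, the "resume" signal, the "orders" task queue, and the field names are all placeholders.

```go
package main

import (
	"fmt"
	"net/http"

	"go.temporal.io/sdk/client"
	"go.temporal.io/sdk/workflow"
)

type server struct {
	temporal client.Client
}

// OrderWorkflow stands in for our real workflow: it waits for a "resume"
// signal and then returns a result.
func OrderWorkflow(ctx workflow.Context, orderID string) (string, error) {
	var payload string
	workflow.GetSignalChannel(ctx, "resume").Receive(ctx, &payload)
	return "processed " + orderID, nil
}

func (s *server) handleOrder(w http.ResponseWriter, r *http.Request) {
	orderID := r.URL.Query().Get("order_id")

	options := client.StartWorkflowOptions{
		ID:        "order-" + orderID, // deterministic ID so a retry hits the same run
		TaskQueue: "orders",           // currently shared by every workflow type
	}

	// Start the workflow if it isn't running yet, otherwise just deliver the
	// signal to the existing run; this is what gives us idempotency.
	run, err := s.temporal.SignalWithStartWorkflow(
		r.Context(),
		options.ID,    // workflow ID
		"resume",      // signal name
		orderID,       // signal payload
		options,
		OrderWorkflow, // workflow function
		orderID,       // workflow argument
	)
	if err != nil {
		http.Error(w, err.Error(), http.StatusInternalServerError)
		return
	}

	// Block until the workflow completes (or the HTTP request times out at ~30s).
	var result string
	if err := run.Get(r.Context(), &result); err != nil {
		http.Error(w, err.Error(), http.StatusGatewayTimeout)
		return
	}
	fmt.Fprint(w, result)
}

func main() {
	c, err := client.Dial(client.Options{})
	if err != nil {
		panic(err)
	}
	defer c.Close()

	srv := &server{temporal: c}
	http.HandleFunc("/orders", srv.handleOrder)
	// In the real service, a worker polling the "orders" task queue runs separately.
	_ = http.ListenAndServe(":8080", nil)
}
```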
However, I have noticed that on many occasions it takes more than 30 seconds to go from WorkflowExecutionStarted to WorkflowTaskStarted, which is not desirable as the REST endpoint times out after 30 seconds. That said, this never happens in workflows triggered via ExecuteWorkflow.
It would help to see your JSON history: tctl wf show -w <wfid> -r <runid> --of myhistory.json
Does this typically happen when your workers are experiencing high load, or just randomly? Do you have SDK metrics set up so you can look at workflow task latencies (see here for more info)?
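If you don't have metrics yet, a minimal sketch of wiring them up in the Go SDK is below, assuming the Tally/Prometheus adapter from go.temporal.io/sdk/contrib/tally; the listen address and timer type are just example values. Once exported, the workflow task schedule-to-start latency is the number to watch for this symptom.

```go
package main

import (
	"log"
	"time"

	"github.com/uber-go/tally/v4"
	"github.com/uber-go/tally/v4/prometheus"
	"go.temporal.io/sdk/client"
	sdktally "go.temporal.io/sdk/contrib/tally"
)

// newPrometheusScope builds a Tally scope whose metrics are served over HTTP
// by the Prometheus reporter.
func newPrometheusScope(cfg prometheus.Configuration) (tally.Scope, error) {
	reporter, err := cfg.NewReporter(prometheus.ConfigurationOptions{
		OnError: func(err error) { log.Println("prometheus reporter error:", err) },
	})
	if err != nil {
		return nil, err
	}
	scope, _ := tally.NewRootScope(tally.ScopeOptions{
		CachedReporter: reporter,
		Separator:      prometheus.DefaultSeparator,
	}, time.Second)
	return scope, nil
}

func main() {
	scope, err := newPrometheusScope(prometheus.Configuration{
		ListenAddress: "0.0.0.0:9090", // example address for the /metrics endpoint
		TimerType:     "histogram",
	})
	if err != nil {
		log.Fatal(err)
	}

	// Pass the same MetricsHandler to the client used by the REST API and to
	// the client used when constructing workers, so both sides emit metrics.
	c, err := client.Dial(client.Options{
		MetricsHandler: sdktally.NewMetricsHandler(scope),
	})
	if err != nil {
		log.Fatal(err)
	}
	defer c.Close()
	// ... create workers / serve HTTP with this client ...
}
```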
The environment that I’m testing this in does not have a large volume of requests. However, there could be lots of running workflows. I’m going to try nuking the database to see if this keeps happening.
I’m not able to share the JSON data because it contains company information.
Oops. A little late for that now. The team nuked the Temporal database and the issue is no longer present (at least for now). I’ll reach out again if it happens.
Also, assuming this is load related, would it help if I moved this workflow onto a different task queue? As of now, all workflow types are using the same queue. I've sketched below roughly what I have in mind.
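To make the question concrete, this is the kind of split I'm thinking of; the "api-orders" queue name and OrderWorkflow are placeholders, not our real names.

```go
package main

import (
	"log"

	"go.temporal.io/sdk/client"
	"go.temporal.io/sdk/worker"
	"go.temporal.io/sdk/workflow"
)

// OrderWorkflow stands in for the workflow that the REST endpoint starts.
func OrderWorkflow(ctx workflow.Context, orderID string) (string, error) {
	return "processed " + orderID, nil
}

func main() {
	c, err := client.Dial(client.Options{})
	if err != nil {
		log.Fatal(err)
	}
	defer c.Close()

	// Dedicated task queue for the API-triggered workflow only; all other
	// workflow types stay on the existing shared queue with their own workers.
	w := worker.New(c, "api-orders", worker.Options{})
	w.RegisterWorkflow(OrderWorkflow)

	// On the API side, SignalWithStartWorkflow would then pass
	// client.StartWorkflowOptions{TaskQueue: "api-orders", ...} so its
	// workflow tasks no longer compete with the other workflow types.
	if err := w.Run(worker.InterruptCh()); err != nil {
		log.Fatal(err)
	}
}
```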