QueryWorkflow performance is not good

naorm1991 · November 7, 2021, 10:25am

Currently we have many workflows running, and we created a QueryHandler in each one of them to show which step they are at.
We query them from our frontend service. However, it takes about 1 second to query each workflow, which is way too slow (and because we have many workflows that we need to query, we try to use goroutines but it’s not great).
Is there anything we can do to improve QueryWorkflow performance?
The docs mention an eventual consistency mode which may be faster, but it seems it was removed from the code some time ago.
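
For anyone who still has to query many workflows at once, capping the number of in-flight queries is one way to keep the goroutine approach manageable. The following is a minimal sketch, not code from this thread: `StepQuerier`, `QueryAll`, `fakeQuerier`, and `demo` are hypothetical names, and `StepQuerier` stands in for whatever adapter wraps the real QueryWorkflow call.

```go
package main

import (
	"context"
	"fmt"
	"sync"
)

// StepQuerier is a hypothetical stand-in for the single call we need.
// In a real service an adapter around QueryWorkflow would implement it.
type StepQuerier interface {
	QueryStep(ctx context.Context, workflowID string) (string, error)
}

// QueryAll queries every workflow ID with at most `limit` calls in flight.
// It returns the step per ID, plus the error per ID for queries that failed.
func QueryAll(ctx context.Context, q StepQuerier, ids []string, limit int) (map[string]string, map[string]error) {
	if limit < 1 {
		limit = 1
	}
	steps := make(map[string]string, len(ids))
	errs := make(map[string]error)
	var mu sync.Mutex
	var wg sync.WaitGroup
	sem := make(chan struct{}, limit) // counting semaphore

	for _, id := range ids {
		wg.Add(1)
		sem <- struct{}{} // blocks while `limit` queries are already running
		go func(id string) {
			defer wg.Done()
			defer func() { <-sem }()
			step, err := q.QueryStep(ctx, id)
			mu.Lock()
			defer mu.Unlock()
			if err != nil {
				errs[id] = err
				return
			}
			steps[id] = step
		}(id)
	}
	wg.Wait()
	return steps, errs
}

// fakeQuerier is a test double that fails for one specific ID.
type fakeQuerier struct{}

func (fakeQuerier) QueryStep(_ context.Context, id string) (string, error) {
	if id == "wf-bad" {
		return "", fmt.Errorf("query %s: timed out", id)
	}
	return "step-of-" + id, nil
}

// demo runs QueryAll against the fake and reports how many IDs succeeded and failed.
func demo() (ok, failed int) {
	steps, errs := QueryAll(context.Background(), fakeQuerier{}, []string{"wf-1", "wf-2", "wf-bad"}, 2)
	return len(steps), len(errs)
}

func main() {
	ok, failed := demo()
	fmt.Println("ok:", ok, "failed:", failed)
}
```

The limit only bounds load on the workers; it does not make an individual query faster, which is what the replies below address.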

maxim · November 7, 2021, 8:37pm

This latency is way higher than expected. But I’m confused by the following statement:

and because we have many workflows that we need to query, we try to use goroutines but it’s not great

Usually, in a UI you query a single workflow. Why do you need to query many workflows simultaneously from the UI? Querying each workflow in a list of workflows is rarely a good idea.

Meli_Lee · March 31, 2022, 4:02am

Hi, we faced the same problem,
but we don’t query it in a list; we just hit it one by one per workflow ID.
QueryWorkflow takes a long time, and sometimes it times out and returns no data, both for workflows that are still running and for terminated ones.
Is there any advice on how to improve the performance of QueryWorkflow?
Thank you

tihomir · March 31, 2022, 2:54pm

Are you able to query these workflows via tctl, for example:

tctl wf query -w <workflow_id> --qt <query_name>

maxim · March 31, 2022, 3:22pm

@Meli_Lee Do you use local activities? They can delay query responses.

Meli_Lee · April 1, 2022, 5:24am

When we query it from tctl, it returns,
but we also built an API that uses QueryWorkflow, and sometimes it takes >5s.

In the failed activity itself we only call an HTTP REST service:
    err := workflow.ExecuteActivity(ctx, activity1, activityReq).Get(ctx, &activity1Res)

    func activity1(activityReq string) {
        res, err := // call http service.client.Post
    }

From the workflow sequence, does it show that we use local activity?
We are still trying to figure out whether we use local activities.
Does local activity mean that we execute an activity inside of an activity function?
For example:

    func activity1(activityReq string) {
        err := workflow.ExecuteActivity(…).Get(ctx, …)
    }

We build our customized retry handling,
When the retries of an activity have reached max attempts, the workflow waits for a retry signal:

    ch := workflow.GetSignalChannel(ctx, RetryWorkflow)

Will this impact the performance of QueryWorkflow?
Additional info:
when an activity fails (timeout error, etc.) and we direct it to wait for the signal,
then call QueryWorkflow to check, it takes 5s. On the next direct hit the response seems to be cached and it becomes faster, around 0.2s.
But when there is an interval of 60s and we query again, it takes 5s again.

tihomir · April 1, 2022, 1:43pm

From the workflow sequence, does it show that we use local activity?

It seems you are not using local activities; here is a sample in the go-samples repo. For local activities you would see a MarkerRecorded event type in your history with the markerName prop set to “LocalActivity”.

tihomir · April 1, 2022, 1:58pm

We build our customized retry handling

Could you share your workflow code (and the GetRetrySignal function)? I think it would help to see what could be going on (you can DM me if you don’t want to share it publicly, that’s fine).

Vikas_NS · April 1, 2022, 4:54pm

IMHO, the latency of the query method is really due to the way it works.

1. It needs to pull the execution history from the DB (if not already in memory).
2. Worker needs to replay these events.
3. Finally it returns the requested variable.

Also, as mentioned above, you need an active worker to have your query fulfilled. If you have too many workflows being executed, your workers will be busy, so the delay in the query response depends on how free the workers are.

I’d recommend storing the workflow step information in your own database and pulling it out from there.

IMHO, Temporal query methods are best used when the data you are pulling out is frequently updated and infrequently accessed. In that case writing it to a DB would be overkill, and query methods are suitable.

Huong_Pham · April 4, 2022, 5:13am

Hi @Vikas_NS ,

What do you mean by
Worker needs to replay these events

Does it mean the worker actually replays those events? Or does Temporal store the results of all activities and just replay using the stored output of each activity?

Thanks a lot.

Vikas_NS · April 4, 2022, 5:31am

Temporal stores the results of activities as Workflow Execution History and uses them to replay.

Say the information you are interested in is a String of length 10. If you use your own database, you will be writing a String of length 10 and reading a String of length 10.

But if you use Temporal’s query method,
Temporal needs to bring in the whole execution history (the results of activities). This is unnecessary overhead: the size of the data pulled from the Temporal DB depends on the size of your workflow.
It then replays the history and returns your String of length 10.
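
The difference between the two read paths can be shown with a toy model. This is only an illustrative sketch and not how the SDK is implemented: `Event`, `replayStatus`, and `StatusStore` are hypothetical names, and the "history" is a plain slice of recorded activity results.

```go
package main

import "fmt"

// Event is a toy history event: the recorded result of one completed activity.
type Event struct {
	Activity string
	Result   string
}

// replayStatus rebuilds the current step by walking the whole history,
// which is the work a worker without cached state does before answering.
// It also reports how many events it had to read.
func replayStatus(history []Event) (status string, eventsRead int) {
	status = "started"
	for _, e := range history {
		status = "completed:" + e.Activity
		eventsRead++
	}
	return status, eventsRead
}

// StatusStore is a hypothetical external table keyed by workflow ID.
// Reading it costs one lookup regardless of how long the workflow has run.
type StatusStore map[string]string

func main() {
	history := []Event{{"reserve", "ok"}, {"charge", "ok"}, {"ship", "ok"}}
	status, n := replayStatus(history)
	fmt.Println("replay:", status, "events read:", n)

	store := StatusStore{"wf-1": "completed:ship"}
	fmt.Println("store:", store["wf-1"])
}
```

In this model the replay cost grows with the length of the history, while the lookup in the external store stays constant.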

Huong_Pham · April 4, 2022, 5:33am

Thanks a lot, Vikas.

Huong_Pham · April 4, 2022, 6:18am

Btw @Vikas_NS ,

What if the workflow logic was changed a little bit? How does the replay process deal with this?

The reason I ask is because

I have a workflow and expose a query handler to check a variable in the Workflow. (indicating workflow status like processing/completed/failed)
Since the response time when using query handlers is not good, I plan to persist this variable to DB as suggested.

To do so, I have to either

1. Create a new activity to persist this variable to DB after each activity is completed.
2. Or update current activities to persist this variable.

However, during the transition, I need to support both kinds of workflows (old and new ones). The old ones still use the Workflow variable to return the workflow status, and the new ones use the variable from the DB.

What could be a good way for us to support this?
Thanks a lot.

Vikas_NS · April 4, 2022, 6:29am

Temporal has the concept of versioning.
Here’s an excellent video on that: “Move Fast WITHOUT Breaking Anything - Workflow Versioning with Temporal” (YouTube)

Your workflow code will house the versioning logic.
If the workflow version is old, it will use the old code.
Otherwise, it will use the new code.
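
The branching described above can be modelled in a few lines. This is a simplified toy, not the SDK itself (in the Go SDK the real call is `workflow.GetVersion`): `Execution`, `getVersion`, `statusSource`, and the change ID `persist-status-to-db` are hypothetical, and the model assumes an old execution has already replayed past the changed code.

```go
package main

import "fmt"

// DefaultVersion mirrors the meaning of workflow.DefaultVersion in the Go SDK:
// the execution's history was written before the change existed.
const DefaultVersion = -1

// Execution is a toy model of one workflow execution.
type Execution struct {
	markers   map[string]int // changeID -> version recorded in history
	replaying bool           // true when re-running code against old history
}

// getVersion returns the version recorded for changeID. An execution that is
// replaying history without a marker stays on DefaultVersion; a fresh
// execution records maxSupported and uses it from then on.
func (e *Execution) getVersion(changeID string, maxSupported int) int {
	if v, ok := e.markers[changeID]; ok {
		return v
	}
	if e.replaying {
		return DefaultVersion
	}
	if e.markers == nil {
		e.markers = make(map[string]int)
	}
	e.markers[changeID] = maxSupported
	return maxSupported
}

// statusSource picks where the workflow status lives for this execution.
func statusSource(e *Execution) string {
	if e.getVersion("persist-status-to-db", 1) == DefaultVersion {
		return "workflow-variable" // old code path
	}
	return "database" // new code path
}

func main() {
	old := &Execution{replaying: true}
	fresh := &Execution{}
	fmt.Println("old execution reads from:", statusSource(old))
	fmt.Println("new execution reads from:", statusSource(fresh))
}
```

The point of the model is that the decision is recorded per execution, so old and new workflows can run side by side on the same worker code.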

Huong_Pham · April 4, 2022, 6:31am

That’s great. Thanks a lot @Vikas_NS

tihomir · April 4, 2022, 1:22pm

Just wanted to add that Temporal workers cache workflow executions. Fetching the full history is only needed when the cached state is lost, either due to a process restart or because the worker evicted the workflow execution from its cache in LRU order. See here for more info and tuning.
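
The slow-then-fast pattern reported earlier in the thread is what an LRU cache would produce. The sketch below is a generic LRU written for illustration only, not the worker's actual cache: `workflowCache`, `newWorkflowCache`, and `touch` are hypothetical names.

```go
package main

import (
	"container/list"
	"fmt"
)

// workflowCache is a toy LRU of workflow IDs whose state is held in memory.
type workflowCache struct {
	capacity int
	order    *list.List // front = most recently used
	items    map[string]*list.Element
}

func newWorkflowCache(capacity int) *workflowCache {
	return &workflowCache{
		capacity: capacity,
		order:    list.New(),
		items:    make(map[string]*list.Element),
	}
}

// touch reports whether the workflow was already cached (a hit means no
// history fetch and no replay). On a miss it caches the workflow and evicts
// the least recently used entry if the cache is over capacity.
func (c *workflowCache) touch(id string) bool {
	if el, ok := c.items[id]; ok {
		c.order.MoveToFront(el)
		return true
	}
	c.items[id] = c.order.PushFront(id)
	if c.order.Len() > c.capacity {
		oldest := c.order.Back()
		c.order.Remove(oldest)
		delete(c.items, oldest.Value.(string))
	}
	return false
}

func main() {
	c := newWorkflowCache(2)
	for _, id := range []string{"wf-1", "wf-1", "wf-2", "wf-3", "wf-1"} {
		fmt.Println(id, "cached:", c.touch(id))
	}
}
```

Here the second query of `wf-1` is a hit, but after two other workflows pass through a cache of size 2 the next query of `wf-1` is a miss again and has to pay for the full history fetch and replay.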

Meli_Lee · April 5, 2022, 7:54am

thanks a lot @tihomir @Vikas_NS for the explanation
will look more into it
