High schedule-to-start latency

Deployment Architecture:
Worker processes: 14 pods with 13 GB memory/7 CPU cores
Temporal frontend: 8 pods with 1 GB memory/1 CPU core
Temporal history: 25 pods with 2 GB memory/3 CPU cores
Temporal matching: 17 pods with 2 GB memory/3 CPU cores
Temporal workers: 20 pods with 2 GB memory/3 CPU cores
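For reference, this is roughly how the worker pods are sized; a minimal Kubernetes Deployment sketch, assuming the workers run on Kubernetes (the Deployment name, labels, and image are placeholders, not from our actual manifests):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: temporal-worker          # placeholder name
spec:
  replicas: 14                   # 14 worker pods
  selector:
    matchLabels:
      app: temporal-worker
  template:
    metadata:
      labels:
        app: temporal-worker
    spec:
      containers:
        - name: worker
          image: my-registry/temporal-worker:latest   # placeholder image
          resources:
            requests:
              cpu: "7"           # 7 CPU cores per pod
              memory: 13Gi       # 13 GB memory per pod
            limits:
              cpu: "7"
              memory: 13Gi
```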

We have configured workers with the following values:
MAX_CONCURRENT_ACTIVITY_POLLERS = 20
MAX_CONCURRENT_WORKFLOW_TASK_POLLERS = 20
MAX_WORKFLOW_THREAD_COUNT = 5000
MAX_CONCURRENT_ACTIVITY_EXECUTION_SIZE = 600
WORKFLOW_CACHE_SIZE = 5000
MAX_CONCURRENT_LOCAL_ACTIVITY_EXECUTION_SIZE = 600
MAX_CONCURRENT_WORKFLOW_TASK_EXECUTION_SIZE = 600
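
These are the Java SDK worker knobs; for clarity, here is a minimal sketch of how they map onto WorkerOptions and WorkerFactoryOptions, assuming the Java SDK (the client setup and task queue name are placeholders):

```java
import io.temporal.client.WorkflowClient;
import io.temporal.serviceclient.WorkflowServiceStubs;
import io.temporal.worker.Worker;
import io.temporal.worker.WorkerFactory;
import io.temporal.worker.WorkerFactoryOptions;
import io.temporal.worker.WorkerOptions;

public class WorkerSetup {
  public static void main(String[] args) {
    // Placeholder client setup; the real deployment points at the Temporal frontend service.
    WorkflowServiceStubs service = WorkflowServiceStubs.newLocalServiceStubs();
    WorkflowClient client = WorkflowClient.newInstance(service);

    // Factory-level settings: sticky workflow cache and workflow thread pool.
    WorkerFactoryOptions factoryOptions =
        WorkerFactoryOptions.newBuilder()
            .setWorkflowCacheSize(5000)        // WORKFLOW_CACHE_SIZE
            .setMaxWorkflowThreadCount(5000)   // MAX_WORKFLOW_THREAD_COUNT
            .build();
    WorkerFactory factory = WorkerFactory.newInstance(client, factoryOptions);

    // Per-worker settings: poller counts and executor slot sizes.
    WorkerOptions workerOptions =
        WorkerOptions.newBuilder()
            .setMaxConcurrentActivityTaskPollers(20)          // MAX_CONCURRENT_ACTIVITY_POLLERS
            .setMaxConcurrentWorkflowTaskPollers(20)          // MAX_CONCURRENT_WORKFLOW_TASK_POLLERS
            .setMaxConcurrentActivityExecutionSize(600)       // MAX_CONCURRENT_ACTIVITY_EXECUTION_SIZE
            .setMaxConcurrentLocalActivityExecutionSize(600)  // MAX_CONCURRENT_LOCAL_ACTIVITY_EXECUTION_SIZE
            .setMaxConcurrentWorkflowTaskExecutionSize(600)   // MAX_CONCURRENT_WORKFLOW_TASK_EXECUTION_SIZE
            .build();

    Worker worker = factory.newWorker("my-task-queue", workerOptions); // placeholder task queue
    // worker.registerWorkflowImplementationTypes(MyWorkflowImpl.class);
    // worker.registerActivitiesImplementations(new MyActivitiesImpl());

    factory.start();
  }
}
```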

Problem:
We see high schedule_to_start_latency when around 200 workflows are created per second.
From the metrics, we also observe that a large number of worker task_slots remain available, i.e., the workers are not running out of execution slots.

What could be a probable reason behind this?

For everyone’s reference, here are some useful pointers on the same topic:

The number of task queue partitions can be set via “matching.numTaskqueueWritePartitions” and “matching.numTaskqueueReadPartitions” in the dynamic config.
Both are server-side dynamic config knobs (one for reads, one for writes).
The default is 4, which should usually do the job; if not, try setting it to 8.
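
If it helps, this is roughly what that looks like in the server’s dynamic config file; a minimal sketch assuming the standard YAML dynamic config format (the task queue name constraint is a placeholder for your own queue):

```yaml
# Server-side dynamic config; values can be scoped per task queue via constraints.
matching.numTaskqueueReadPartitions:
  - value: 8
    constraints:
      taskQueueName: "my-task-queue"   # placeholder task queue name
matching.numTaskqueueWritePartitions:
  - value: 8
    constraints:
      taskQueueName: "my-task-queue"
```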
