History instance CPU cannot scale up

Hi All,

I'm hitting a scale-up issue with the history service. With 2 history instances (frontend, matching, and worker all at 1 instance each), CPU utilization is around 3 CPUs per history instance (about 6 CPUs total). But after scaling to 3 history instances, total CPU utilization stays around 6 CPUs (so about 2 CPUs per instance). The Temporal services run in Docker on a VM with no CPU limits set, so there should be no cap; they can max out the host VM, which still has plenty of headroom.

What are the possible causes of total history CPU not scaling up? Increasing history shards from 4096 to 8192 had little effect, and increasing task queue partitions from 8 to 16 actually made performance worse. Increasing workflow workers and pollers didn't really help either. The database is Cassandra, and I don't suspect Cassandra is the bottleneck.
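One thing worth keeping in mind about the shard count: each history shard is owned by exactly one history host, so adding hosts redistributes the same shards rather than creating more work. The sketch below is a simplified illustration (plain modulo assignment instead of Temporal's real consistent-hash ring) of why aggregate CPU stays flat when the workload itself is the limit: the total shard load is fixed by workflow traffic, and extra hosts just split it into smaller per-host slices.

```python
# Simplified illustration, NOT Temporal's actual shard-placement algorithm:
# the real history service uses a consistent-hash ring; plain modulo is used
# here only to show the redistribution effect.
from collections import Counter

def distribute_shards(num_shards: int, num_hosts: int) -> Counter:
    """Map each shard id to a host id and count shards owned per host."""
    return Counter(shard % num_hosts for shard in range(num_shards))

for hosts in (2, 3):
    per_host = distribute_shards(4096, hosts)
    # Total shards (and hence total work driven by traffic) is unchanged;
    # only the per-host share shrinks as hosts are added.
    print(f"{hosts} hosts -> per-host shard counts: {sorted(per_host.values())}")
```

So if total CPU plateaus at ~6 CPUs regardless of instance count, the suspicion shifts to upstream throughput (task matching, workers, or persistence) rather than history capacity itself.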

Any thoughts?


Adding monitoring metrics here.

There are delays in activity-started and child-workflow-execution-completed events:

Your sync match rate should ideally be over 99%, so the dip shown is concerning. This typically indicates you need more worker capacity; see the worker tuning guide.
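For reference, a minimal sketch of how that rate is typically derived from the matching-service counters. The metric names `poll_success_sync` and `poll_success` are Temporal's matching-service counters; the arithmetic below is my assumption of the ratio being discussed, not code from the Temporal repo.

```python
# Hypothetical helper: sync match rate = fraction of successful task polls
# that were matched synchronously (task handed directly to a waiting poller
# instead of being written to the task queue backlog first).
def sync_match_rate(poll_success_sync: int, poll_success: int) -> float:
    if poll_success == 0:
        return 1.0  # no poll traffic at all: treat as healthy
    return poll_success_sync / poll_success

# A value below ~0.99 suggests pollers are not keeping up, so tasks are
# persisted before delivery, which adds latency and persistence load.
rate = sync_match_rate(poll_success_sync=950, poll_success=1000)
print(f"sync match rate: {rate:.2%}")
```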

On the SDK metrics side, did you have a chance to look at the workflow_task_schedule_to_start_latency and activity_schedule_to_start_latency metrics during this time?
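Schedule-to-start latency is the gap between a task being scheduled by the server and a worker actually starting it; the SDK exports it as a histogram, and a high tail usually means starved pollers or saturated workers. A toy sketch of the kind of tail check to look for (the sample values are made up):

```python
# Illustrative only: compute a p95 over hypothetical schedule-to-start
# latency samples in milliseconds. In practice you would read this
# percentile straight from the SDK's exported histogram.
import statistics

def p95(samples: list[float]) -> float:
    # quantiles(n=20) returns 19 cut points; the last one is the 95th percentile
    return statistics.quantiles(samples, n=20)[-1]

latencies_ms = [12, 15, 11, 900, 14, 13, 16, 10, 850, 12]
tail = p95(latencies_ms)
# A p95 in the hundreds of ms while the median sits near ~12 ms points to
# a backlog forming between scheduling and worker pickup.
print(f"p95 schedule-to-start: {tail:.1f} ms")
```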

Also, on persistence latencies, can you focus on the CreateWorkflowExecution, UpdateWorkflowExecution, and UpdateShard operations? It's a little hard to see from the picture.
