We have recently set up our Temporal clusters and are successfully running some production workload on them. Unfortunately, we might have set numHistoryShards for those clusters too high and are plagued with heavy DB load that doesn’t match the scale of workflows we run.
Given that numHistoryShards cannot be changed after the cluster is deployed, we’ll have to create a new cluster and somehow migrate to it. I wonder if there are some hints or best practices for that? Ideally, we’d like to make the migration without a downtime, but we can also temporarily pause all workflows/signals if that helps the migration. Is there a way to migrate to a different cluster without having to run both side-by-side for a prolonged period of time?
If the cluster is sitting idle, then even a large number of shards should not eat up your database’s capacity. Can you provide more information about your setup and which queries are taking up your capacity?
We run Temporal server 1.9.2 (have not yet updated to 1.10).
Our staging setup runs on AWS Aurora MySQL, on a db.t3.medium instance. It mostly sits idle (we used to feed some of the production events there, so it has a bunch of idle workflows sitting around, but now it gets at most one event per minute, usually even less). Despite that, the DB CPU is at 30% now.
Our prod cluster also runs on AWS Aurora MySQL, on db.r6g.large instances. It gets ~100 events per minute. The DB CPU is at 75% now. The workflow we run there is effectively an implementation of the “abandoned cart” workflow. Most of the workflows there should have a pretty small history, but there are also some workflows with several thousand events in their histories.
I think the instances you are using are quite small to run with 8k shards. For your workload and instance sizes, you should provision the cluster with a much smaller number of shards, like 128.
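For reference, numHistoryShards lives in the server’s static configuration under persistence, so the new cluster would be provisioned with something roughly like the sketch below (the datastore names are only illustrative):

```yaml
# Excerpt of the new cluster's static config (illustrative).
persistence:
  numHistoryShards: 128             # fixed at provisioning time; cannot be changed afterwards
  defaultStore: mysql-default       # assumed datastore name
  visibilityStore: mysql-visibility # assumed datastore name
```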
There are four background operations per shard that can issue queries to the database with some frequency even if the cluster is sitting idle (see the config sketch after this list):
TransferTaskProcessor: We perform a read to retrieve any transfer tasks from the database. Even if the cluster is sitting idle, a read is issued every minute. This behavior can be changed through the TransferProcessorMaxPollInterval dynamic config knob.
TimerTaskProcessor: We perform a read to retrieve any timer tasks from the database. Even if the cluster is sitting idle, a read is issued every 5 minutes. This behavior can be changed through the TimerProcessorMaxPollInterval dynamic config knob.
VisibilityTaskProcessor: We perform a read to retrieve any visibility tasks from the database. Even if the cluster is sitting idle, a read is issued every minute. This behavior can be changed through the VisibilityProcessorMaxPollInterval dynamic config knob.
UpdateShard: I’m not sure about this one, but I think we also force an update to ShardInfo every 5 minutes. This behavior can be changed through the ShardUpdateMinInterval dynamic config knob.
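If you want those intervals stretched while planning the migration, a file-based dynamic config override would look roughly like the sketch below. The key names follow the history service’s dynamic config constants, so double-check them against your server version; the values are only examples.

```yaml
# Example dynamic config overrides to poll less often on a mostly idle cluster.
history.transferProcessorMaxPollInterval:
  - value: 5m
    constraints: {}
history.timerProcessorMaxPollInterval:
  - value: 10m
    constraints: {}
history.visibilityProcessorMaxPollInterval:
  - value: 5m
    constraints: {}
history.shardUpdateMinInterval:
  - value: 15m
    constraints: {}
```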
Thank you, @samar! I’ll look up the load profile on those DBs and share it here. What would be the simplest way to migrate to a smaller number of shards? Can existing workflows be migrated somehow?
Unfortunately we don’t support updating the number of shards once the cluster is provisioned. You have to stand up a new cluster and keep the old one running until all workflow executions are drained.
You might need some logic on the workers polling the old cluster to call StartWorkflowExecution (against the new cluster) from an activity instead of calling ContinueAsNew. Just a heads up: this is a little unsafe, as calling StartWorkflowExecution from an activity does not have the same transactional guarantees as ContinueAsNew.
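Not from the docs, but a rough Go SDK sketch of that hand-off could look like the code below. The struct, workflow type, and task queue names are made up for illustration; reusing the workflow ID and treating “already started” as success is what keeps retries of the activity tolerable despite the missing transactional guarantee.

```go
package migration

import (
	"context"
	"errors"

	"go.temporal.io/api/serviceerror"
	"go.temporal.io/sdk/activity"
	"go.temporal.io/sdk/client"
)

// MigrateActivities wraps a client that talks to the NEW cluster's frontend,
// built with client.Dial (or client.NewClient on older SDKs).
type MigrateActivities struct {
	NewClusterClient client.Client
}

// StartOnNewCluster is called from the old-cluster workflow right before it
// completes, in place of ContinueAsNew. It is not transactional: the activity
// can be retried after a successful start, so we reuse the same workflow ID
// and treat "already started" as success to keep it effectively idempotent.
func (a *MigrateActivities) StartOnNewCluster(ctx context.Context, workflowID string, input string) (string, error) {
	logger := activity.GetLogger(ctx)

	opts := client.StartWorkflowOptions{
		ID:        workflowID,       // same ID as the old-cluster run
		TaskQueue: "abandoned-cart", // assumed task queue on the new cluster
	}

	run, err := a.NewClusterClient.ExecuteWorkflow(ctx, opts, "AbandonedCartWorkflow", input)
	if err != nil {
		var started *serviceerror.WorkflowExecutionAlreadyStarted
		if errors.As(err, &started) {
			// A previous attempt of this activity already started it; treat as success.
			logger.Info("workflow already running on new cluster", "WorkflowID", workflowID)
			return "", nil
		}
		return "", err
	}
	return run.GetRunID(), nil
}
```

The worker running against the old cluster registers these activities with the extra client pointed at the new cluster’s frontend; once the old cluster has no open executions left, it can be torn down.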