I recently deployed Temporal connected to a new 3-node Cassandra cluster in Kubernetes. I set the numHistoryShards to 2048 in case we need them in the future. I've noticed, after deploying the services, when there are no workers registered and no workflows to process, that the history service is consistently sitting at around 20%-30% CPU usage and that Cassandra is also sitting at around 20%-30% CPU usage.
Is this the expected behaviour of Temporal when there is nothing to process? Does the number of history shards affect CPU usage?
numHistoryShards to 2048
^ just a sanity check: numHistoryShards must never be changed after it is set
history service is consistently sitting at around 20% - 30% cpu usage and that cassandra is also sitting around 20% - 30% cpu usage.
^ this can be expected if the number of history hosts is small and the number of Cassandra hosts is small.
there is background maintenance logic running per shard.
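To make the per-shard overhead concrete, here is a back-of-envelope sketch of why a large shard count produces a constant baseline load even with zero workflows. The poll interval and per-poll query count below are illustrative assumptions for the sketch, not Temporal's actual internal values:

```python
# Back-of-envelope: baseline persistence traffic from per-shard maintenance.
# poll_interval_s and queries_per_poll are assumed illustrative numbers,
# NOT Temporal's real internals.
num_shards = 2048
poll_interval_s = 60      # assumed: each shard periodically checks its task queues
queries_per_poll = 2      # assumed: e.g. one timer-queue and one transfer-queue read

idle_queries_per_sec = num_shards * queries_per_poll / poll_interval_s
print(f"~{idle_queries_per_sec:.1f} background queries/sec against Cassandra")
```

With a single history host owning all 2048 shards, that entire background load lands on one process, which is consistent with the idle CPU usage you are seeing.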
if you can provide more metrics, we can do additional analysis
I realized after I had initially installed Temporal that numHistoryShards was set to the default, so I deleted the Cassandra DB, reinstalled it, updated the Temporal charts, and applied the update.
Currently, I'm only running one instance of the history service, and Cassandra is a 3-node cluster. My next step is to get metrics set up. I might try adding an additional history service instance to see if that makes a difference.
one instance of the history service and cassandra is a 3 node cluster
^ that means one Temporal host is running the background maintenance logic for all 2048 shards against a 3-node Cassandra cluster (replication factor of 3),
so the capacity consumption can be expected
Does that mean that if I have multiple instances of the history service, the shards would be distributed across the instances, and the CPU usage per history instance should decrease?
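Conceptually, yes: each shard is owned by exactly one history host, so adding hosts spreads shard ownership (and the per-shard maintenance work) across them. A minimal sketch of the idea, using plain hashing as a stand-in for Temporal's actual membership ring:

```python
import hashlib

def owner(shard_id: int, hosts: list[str]) -> str:
    # Simplified stand-in for Temporal's shard-to-host assignment:
    # deterministically map each shard to one host.
    h = int(hashlib.sha256(str(shard_id).encode()).hexdigest(), 16)
    return hosts[h % len(hosts)]

num_shards = 2048
for hosts in (["history-0"], ["history-0", "history-1"]):
    counts = {h: 0 for h in hosts}
    for shard in range(num_shards):
        counts[owner(shard, hosts)] += 1
    print(hosts, counts)
```

With one host it owns all 2048 shards; with two, each owns roughly half, so the background maintenance load per history instance drops accordingly (total load against Cassandra stays about the same).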