Changing Shard size via Multi Cluster Replication?

I’ve been reading that Shard size can’t be changed once a cluster has been initialized. This is pretty well explained in a few technical documents, so I won’t ask anything else about that.

This did get me thinking though: With the experimental “Multi-cluster Replication” feature, does this enable a way for people to change shard size without having to stop all long-running workflows?

I have a few questions that should help to answer this:

  • When running Multi-cluster Replication, do all clusters need to have the same shard size?
  • Is it safe to turn off the “old” cluster once replication has finished without terminating long-running jobs? (like a job that sleeps for 30 days between runs)
  • Is it possible to run Multi-cluster Replication with different data stores? (i.e., could I also migrate from SQL → Cassandra via this path?)

Hopefully these answers will help some other people out too. We’ve been evaluating setting up Temporal for our production infrastructure and using it to migrate away from our AWS SQS-based infrastructure, but these scenarios are important to me before we commit to that path.

Thank you! :pray:

Hi @free_lunasec

For multi-cluster replication:

  1. Yes, the shard count must be the same on each cluster.
  2. I don’t think this question applies to multi-cluster replication, because replication never finishes; it is always ongoing. Any update to a workflow in the global namespace will generate new replication tasks.
  3. Yes, this is supported, as replication is not concerned with persistence. It is done at the Temporal server API level.

** Note: Since server version 1.20, shard counts between replicating clusters no longer have to be the same, but one must be a multiple of the other. See Release v1.20.0 · temporalio/temporal · GitHub
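
As a rough illustration of the multiple-of rule in the note above, here is a minimal Python sketch. The function name `shard_counts_compatible` is hypothetical and not part of any Temporal API or tooling; it only checks the arithmetic condition, assuming the rule means the larger shard count must be evenly divisible by the smaller one.

```python
def shard_counts_compatible(count_a: int, count_b: int) -> bool:
    """Return True if one shard count is a whole multiple of the other.

    Hypothetical helper for illustration only; not a Temporal API.
    """
    if count_a <= 0 or count_b <= 0:
        raise ValueError("shard counts must be positive integers")
    larger, smaller = max(count_a, count_b), min(count_a, count_b)
    # Compatible when the larger count divides evenly by the smaller one.
    return larger % smaller == 0


if __name__ == "__main__":
    print(shard_counts_compatible(512, 2048))  # 2048 is 4 x 512
    print(shard_counts_compatible(512, 768))   # 768 is not a multiple of 512
```

Under that assumption, a 512-shard cluster could replicate with a 2048-shard cluster, but not with a 768-shard one. Check the v1.20.0 release notes for the exact requirements before planning a migration around this.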