Unable to read schema from Cassandra DB in Temporal services after Cassandra pods were scaled down

Hi,

We have Temporal deployed on K8S, pointing to a remote Cassandra deployed on a separate cluster. Initially we had 3 pods in a 5-node Cassandra cluster and everything was working fine. Then we scaled the pods to 5 and did not see any problems at that time either.
Later, we scaled the pods back down to 3 and started seeing very high latencies when starting workflows via async invocation. To investigate further, we restarted the Temporal services (history, matching, frontend) and encountered the following log in the Temporal services while connecting to Cassandra:

Unable to start server. Error: could not build arguments for function "go.temporal.io/server/common/pprof".LifetimeHooks (/temporal/common/pprof/fx.go:39): failed to build *pprof.PProfInitializerImpl: could not build arguments for function "go.temporal.io/server/common/pprof".NewInitializer (/temporal/common/pprof/pprof.go:56): failed to build *config.PProf: could not build arguments for function "go.temporal.io/server/temporal".SoExpander (/temporal/temporal/fx.go:482): failed to build *temporal.serverOptions: received non-nil error from function "go.temporal.io/server/temporal".ServerOptionsProvider (/temporal/temporal/fx.go:508): cassandra schema version compatibility check failed: unable to read DB schema version keyspace/database: temporal_dev5_4096 error: failed to get current schema version from cassandra

We are facing this even though the schemas exist and no changes were made to the schema.
Below are screenshots of rows in a couple of tables in that schema:


@samar can you help guide us on this? We also referred to an older thread: Temporal service fails to start in k8s: 'Failed to get current schema version from cassandra'

Please guide us on this.
Note: We are facing this issue after scaling down the Cassandra pods.

Can someone please respond to this?

What is the keyspace strategy used (SimpleStrategy or NetworkTopologyStrategy)?
What is the replication factor used, and did you change it when scaling down your Cassandra cluster?

@tihomir {'class': 'org.apache.cassandra.locator.SimpleStrategy', 'replication_factor': '3'}. We did not change the replication factor when scaling down.
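
For anyone following along, here is a rough sketch of why the replication factor matters in this situation. It only shows the standard Cassandra quorum arithmetic (quorum = floor(RF / 2) + 1) for the RF of 3 reported above. It assumes the Temporal services read at a quorum-style consistency level, and the helper names are made up for illustration. Whether any token range actually lost replicas depends on how the two pods were removed (for example, decommissioned versus simply deleted), which is not stated in this thread.

```python
def quorum(replication_factor: int) -> int:
    """Replicas that must respond for a quorum read or write."""
    return replication_factor // 2 + 1


def can_serve_quorum(replication_factor: int, live_replicas: int) -> bool:
    """True if enough replicas of a token range are reachable for quorum."""
    return live_replicas >= quorum(replication_factor)


if __name__ == "__main__":
    rf = 3  # replication_factor reported for the keyspace in this thread
    # Walk through the possible number of reachable replicas for one token range.
    for live in range(rf, -1, -1):
        status = "ok" if can_serve_quorum(rf, live) else "quorum unavailable"
        print(f"RF={rf}, live replicas={live}: {status}")
```

With RF 3, a quorum needs 2 replicas, so a token range that still has 2 or 3 reachable replicas can be read, while a range left with only 1 or 0 cannot. If the schema_version row happened to sit in such a range, a quorum read of it would fail even though the data is still visible from a single node.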