Unable to read schema on cassandra db from temporal services - when cassandra pods scaled down

poojabhutada · June 30, 2022, 7:57am

Hi,

We have Temporal deployed on K8s, pointing to a remote Cassandra deployed on a separate cluster. Initially we had 3 pods in a 5-node Cassandra cluster and everything was working fine. We then scaled the pods to 5 and did not see any problems at that point either.
Later, we scaled the pods back down to 3 and started seeing very high latencies when starting workflows via async invocation. To investigate further, we restarted the Temporal services (history, matching, frontend) and encountered the following error in the Temporal service logs while connecting to Cassandra:

Unable to start server. Error: could not build arguments for function "go.temporal.io/server/common/pprof".LifetimeHooks (/temporal/common/pprof/fx.go:39): failed to build *pprof.PProfInitializerImpl: could not build arguments for function "go.temporal.io/server/common/pprof".NewInitializer (/temporal/common/pprof/pprof.go:56): failed to build *config.PProf: could not build arguments for function "go.temporal.io/server/temporal".SoExpander (/temporal/temporal/fx.go:482): failed to build *temporal.serverOptions: received non-nil error from function "go.temporal.io/server/temporal".ServerOptionsProvider (/temporal/temporal/fx.go:508): cassandra schema version compatibility check failed: unable to read DB schema version keyspace/database: temporal_dev5_4096 error: failed to get current schema version from cassandra

We are facing this even though the schemas exist and no changes were made to them.
Below are screenshots of rows from a couple of tables in that schema:

@samar, can you help guide us on this? We also referred to an older thread: Temporal service fails to start in k8s: 'Failed to get current schema version from cassandra'

Please advise.
Note: We started facing this issue after scaling down the Cassandra pods.

poojabhutada · July 1, 2022, 5:11pm

Can someone please respond to this?

tihomir · July 1, 2022, 6:45pm

What is the keyspace strategy used (SimpleStrategy or NetworkTopologyStrategy)?
What is the replication factor used and did you change it when scaling down your cassandra cluster?

poojabhutada · July 25, 2022, 1:50pm

@tihomir {'class': 'org.apache.cassandra.locator.SimpleStrategy', 'replication_factor': '3'}. We had not changed the replication factor when scaling down.
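
For context on why the replication factor matters here, below is a minimal sketch of the quorum arithmetic. It assumes the Temporal services read from Cassandra at a quorum-level consistency; that is an assumption about this deployment, not something confirmed in the thread. The function names `quorum` and `can_serve_quorum` are hypothetical helpers for illustration only. With a replication factor of 3, a quorum needs 2 live replicas for every token range, so if the scale-down left some ranges with fewer than 2 reachable replicas, reads such as the schema version check could fail even though the data still exists.

```python
def quorum(replication_factor: int) -> int:
    """Number of replicas that must respond for a QUORUM read or write."""
    return replication_factor // 2 + 1


def can_serve_quorum(replication_factor: int, live_replicas: int) -> bool:
    """True if enough replicas of a token range are up to satisfy QUORUM."""
    return live_replicas >= quorum(replication_factor)


if __name__ == "__main__":
    rf = 3  # replication_factor reported in this thread
    # Walk through the possible number of live replicas for one token range.
    for live in range(rf, -1, -1):
        status = "ok" if can_serve_quorum(rf, live) else "unavailable"
        print(f"RF={rf}, live replicas={live}: QUORUM {status}")
```

Whether this applies depends on how the pods were removed (for example, whether the nodes were decommissioned so their data was streamed to the remaining nodes), which the thread does not state.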
