Temporal service fails to start in k8s: 'Failed to get current schema version from cassandra'

Hi, we have Temporal installed in k8s from the Helm chart. Everything worked fine, but then the Temporal service pods began to fail during startup with this message:
cassandra schema version compatibility check failed: unable to read cassandra schema version keyspace/database: temporal error: Failed to get current schema version from cassandra

Pods that were already running continued to operate; only newly created pods crashed. All the init containers finished successfully.

I had to fully recreate the Temporal database: I deleted the keyspaces and performed the schema setup with tctl. After that, all the crashing pods started successfully.
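For reference, a reset of that kind roughly follows the shape below. This is a sketch, not the exact commands used: it assumes the `temporal-cassandra-tool` that ships with the Temporal server release, the default `temporal` and `temporal_visibility` keyspaces, and a local Cassandra endpoint; your chart values and replication factor may differ.

```sh
# WARNING: destroys all workflow data in the keyspaces.
cqlsh -e "DROP KEYSPACE temporal;"
cqlsh -e "DROP KEYSPACE temporal_visibility;"

# Recreate the keyspace and re-apply the versioned schema
# (replication factor 1 is for a single-node dev setup only).
temporal-cassandra-tool --ep 127.0.0.1 create -k temporal --rf 1
temporal-cassandra-tool --ep 127.0.0.1 -k temporal setup-schema -v 0.0
temporal-cassandra-tool --ep 127.0.0.1 -k temporal update-schema \
    -d ./schema/cassandra/temporal/versioned
```

After this the schema_version and schema_update_history tables are repopulated, which is presumably why the crashing pods passed the startup check again.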

Do you have any idea what could have caused this? What diagnostics should I run to gather more data if it happens again?

Which database are you running? And how did you install the schema?

When you deploy the schema using the schema tool, it creates the schema_version and schema_update_history tables in the database. The Temporal server performs validation on startup to make sure it is running against the expected schema version. Is it possible that schema_version or schema_update_history was modified while the cluster was running? That could cause the issue you described above.

Next time it happens, can you share the contents of these tables with us?
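If it helps, the tables can be dumped with cqlsh along these lines (a sketch assuming the default `temporal` keyspace; adjust the keyspace name to match your chart values):

```sh
# Dump the schema metadata the server checks on startup.
cqlsh -e "SELECT * FROM temporal.schema_version;"
cqlsh -e "SELECT * FROM temporal.schema_update_history;"
```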

@samar, we use Cassandra from the Temporal Helm chart.
I suspect this may have been caused by a restart of the Cassandra nodes, but the schema version check issue hasn't happened again so far.