Activities are frequently timing out with minimal load

Activities are frequently timing out under minimal load, whereas the same workload runs perfectly under heavier load in another cluster.

Activities are taking longer to complete; we see "workflow task timeout" errors, and the tasks only complete after several automatic retries.

Initially we saw the error message below and restarted the Temporal pods:

{"level":"error","ts":"2023-08-31T10:42:44.954Z","msg":"Operation failed with internal error.","error":"ListNamespaces operation failed. Error: Cannot achieve consistency level LOCAL_QUORUM","operation":"ListNamespaces","logging-call-at":"persistenceMetricClients.go:1171","stacktrace":"go.temporal.io/server/common/log.(*zapLogger).Error\n\t/home/builder/temporal/common/log/zap_logger.go:150\ngo.temporal.io/server/common/persistence.updateErrorMetric\n\t/home/builder/temporal/common/persistence/persistenceMetricClients.go:1171\ngo.temporal.io/server/common/persistence.(*metricEmitter).recordRequestMetrics\n\t/home/builder/temporal/common/persistenc

After restarting the pods, we are still seeing workflow timeout errors, along with the following errors in the pod logs:
{"level":"info","ts":"2023-09-12T13:04:26.205Z","msg":"history client encountered error","service":"frontend","error":"shard status unknown","service-error-type":"serviceerror.Unavailable","logging-call-at":"metric_client.go:90"}

{"level":"error","ts":"2023-09-12T13:09:27.412Z","msg":"Persistent store operation failure","service":"matching","component":"matching-engine","wf-task-queue-name":"temporal-nova-worker-d64c445f4-hg7v5:e165d702-5c33-49c4-8e59-917da2f6c246","wf-task-queue-type":"Workflow","wf-namespace":"temporal-system","store-operation":"update-task-queue","error":"operation UpdateTaskQueue encountered Operation timed out - received only 0 responses.","logging-call-at":"taskReader.go:212","stacktrace":"go.temporal.io/server/common/log.(*zapLogger).Error\n\t/home/builder/temporal/common/log/zap_logger.go:150\ngo.temporal.io/server/service/matching.(*taskReader).getTasksPump\n\t/home/builder/temporal/service/matching/taskReader.go:212\ngo.temporal.io/server/internal/goro.(*Group).Go.func1\n\t/home/builder/temporal/internal/goro/group.go:58"}

We have not seen the "Cannot achieve consistency level LOCAL_QUORUM" error again since restarting the pods.

Cannot achieve consistency level LOCAL_QUORUM

I would check Cassandra health; this typically has nothing to do with Temporal itself.
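
One quick way to reproduce the quorum check outside of Temporal is a small gocql program that reads from the Temporal keyspace at LOCAL_QUORUM. This is only a sketch: the contact point, keyspace name ("temporal"), and the namespaces table/column are assumptions you should adjust to your deployment.

```go
package main

import (
	"fmt"
	"log"

	"github.com/gocql/gocql"
)

func main() {
	// Assumption: replace the contact point and keyspace with your own values.
	cluster := gocql.NewCluster("cassandra.example.internal")
	cluster.Keyspace = "temporal"
	cluster.Consistency = gocql.LocalQuorum // same level the error message reports

	session, err := cluster.CreateSession()
	if err != nil {
		log.Fatalf("cannot connect to Cassandra: %v", err)
	}
	defer session.Close()

	// A trivial read at LOCAL_QUORUM. If this fails with the same
	// "Cannot achieve consistency level LOCAL_QUORUM" error, the problem is
	// replica availability in the local datacenter, not Temporal.
	var name string
	iter := session.Query(`SELECT name FROM namespaces LIMIT 1`).Iter()
	for iter.Scan(&name) {
		fmt.Println("read namespace:", name)
	}
	if err := iter.Close(); err != nil {
		log.Fatalf("LOCAL_QUORUM read failed: %v", err)
	}
	fmt.Println("LOCAL_QUORUM read succeeded")
}
```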

received only 0 responses.

Again, this can indicate Cassandra issues, as the database is not responding. See the sketch below for one way to compare the keyspace's replication settings with the nodes that are actually reachable.
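
"Received only 0 responses" usually means fewer live replicas than the consistency level requires. A rough client-side check (contact point and keyspace name are assumptions) is to read the replication settings for the Temporal keyspace and list the peers the connected node knows about, then compare that with what `nodetool status` reports on the Cassandra side:

```go
package main

import (
	"fmt"
	"log"

	"github.com/gocql/gocql"
)

func main() {
	// Assumption: replace the contact point and keyspace name with your own values.
	cluster := gocql.NewCluster("cassandra.example.internal")
	cluster.Consistency = gocql.One // system tables are node-local; ONE is enough here

	session, err := cluster.CreateSession()
	if err != nil {
		log.Fatalf("cannot connect to Cassandra: %v", err)
	}
	defer session.Close()

	// Replication settings for the Temporal keyspace (strategy and
	// per-datacenter replication factor).
	var replication map[string]string
	if err := session.Query(
		`SELECT replication FROM system_schema.keyspaces WHERE keyspace_name = ?`,
		"temporal",
	).Scan(&replication); err != nil {
		log.Fatalf("reading keyspace replication failed: %v", err)
	}
	fmt.Println("replication:", replication)

	// Peers known to the node we connected to; compare this list with the
	// nodes `nodetool status` shows as UN (up/normal).
	iter := session.Query(`SELECT peer, data_center FROM system.peers`).Iter()
	var peer, dc string
	for iter.Scan(&peer, &dc) {
		fmt.Printf("peer %s in datacenter %s\n", peer, dc)
	}
	if err := iter.Close(); err != nil {
		log.Fatalf("reading system.peers failed: %v", err)
	}
}
```

If the replication factor requires more replicas than are currently up in the local datacenter, both LOCAL_QUORUM failures and "received only 0 responses" timeouts are expected until the nodes recover.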