Activities are frequently timing out with minimal load

Activities are frequently timing out under minimal load, whereas the same workload runs perfectly under heavier load in another cluster.

Activities are taking longer to complete; we see "workflow task timeout" errors, and the tasks only complete after several automatic retries.

Initially we saw the error message below and restarted the Temporal pods:

{"level":"error","ts":"2023-08-31T10:42:44.954Z","msg":"Operation failed with internal error.","error":"ListNamespaces operation failed. Error: Cannot achieve consistency level LOCAL_QUORUM","operation":"ListNamespaces","logging-call-at":"persistenceMetricClients.go:1171","stacktrace":"go.temporal.io/server/common/log.(*zapLogger).Error\n\t/home/builder/temporal/common/log/zap_logger.go:150\ngo.temporal.io/server/common/persistence.updateErrorMetric\n\t/home/builder/temporal/common/persistence/persistenceMetricClients.go:1171\ngo.temporal.io/server/common/persistence.(*metricEmitter).recordRequestMetrics\n\t/home/builder/temporal/common/persistenc

After restarting the pods, we are still seeing workflow timeout errors, along with the following errors in the pod logs:
{"level":"info","ts":"2023-09-12T13:04:26.205Z","msg":"history client encountered error","service":"frontend","error":"shard status unknown","service-error-type":"serviceerror.Unavailable","logging-call-at":"metric_client.go:90"}

{"level":"error","ts":"2023-09-12T13:09:27.412Z","msg":"Persistent store operation failure","service":"matching","component":"matching-engine","wf-task-queue-name":"temporal-nova-worker-d64c445f4-hg7v5:e165d702-5c33-49c4-8e59-917da2f6c246","wf-task-queue-type":"Workflow","wf-namespace":"temporal-system","store-operation":"update-task-queue","error":"operation UpdateTaskQueue encountered Operation timed out - received only 0 responses.","logging-call-at":"taskReader.go:212","stacktrace":"go.temporal.io/server/common/log.(*zapLogger).Error\n\t/home/builder/temporal/common/log/zap_logger.go:150\ngo.temporal.io/server/service/matching.(*taskReader).getTasksPump\n\t/home/builder/temporal/service/matching/taskReader.go:212\ngo.temporal.io/server/internal/goro.(*Group).Go.func1\n\t/home/builder/temporal/internal/goro/group.go:58"}

We have not seen the "Cannot achieve consistency level LOCAL_QUORUM" error again since restarting the pods.

Cannot achieve consistency level LOCAL_QUORUM

I would check Cassandra health; this typically has nothing to do with Temporal itself.
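
One quick way to reproduce the quorum check outside of Temporal is a small gocql program that reads from the Temporal keyspace at LOCAL_QUORUM. This is only a sketch: the contact point, keyspace name ("temporal"), and the namespaces table/column are assumptions you should adjust to your deployment.

```go
package main

import (
	"fmt"
	"log"

	"github.com/gocql/gocql"
)

func main() {
	// Assumption: replace the contact point and keyspace with your own values.
	cluster := gocql.NewCluster("cassandra.example.internal")
	cluster.Keyspace = "temporal"
	cluster.Consistency = gocql.LocalQuorum // same level the error message reports

	session, err := cluster.CreateSession()
	if err != nil {
		log.Fatalf("cannot connect to Cassandra: %v", err)
	}
	defer session.Close()

	// A trivial read at LOCAL_QUORUM. If this fails with the same
	// "Cannot achieve consistency level LOCAL_QUORUM" error, the problem is
	// replica availability in the local datacenter, not Temporal.
	var name string
	iter := session.Query(`SELECT name FROM namespaces LIMIT 1`).Iter()
	for iter.Scan(&name) {
		fmt.Println("read namespace:", name)
	}
	if err := iter.Close(); err != nil {
		log.Fatalf("LOCAL_QUORUM read failed: %v", err)
	}
	fmt.Println("LOCAL_QUORUM read succeeded")
}
```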

received only 0 responses.

Again, this can indicate Cassandra issues, as the database is not responding. See the sketch below for one way to compare the keyspace's replication settings with the nodes that are actually reachable.
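
"Received only 0 responses" usually means fewer live replicas than the consistency level requires. A rough client-side check (contact point and keyspace name are assumptions) is to read the replication settings for the Temporal keyspace and list the peers the connected node knows about, then compare that with what `nodetool status` reports on the Cassandra side:

```go
package main

import (
	"fmt"
	"log"

	"github.com/gocql/gocql"
)

func main() {
	// Assumption: replace the contact point and keyspace name with your own values.
	cluster := gocql.NewCluster("cassandra.example.internal")
	cluster.Consistency = gocql.One // system tables are node-local; ONE is enough here

	session, err := cluster.CreateSession()
	if err != nil {
		log.Fatalf("cannot connect to Cassandra: %v", err)
	}
	defer session.Close()

	// Replication settings for the Temporal keyspace (strategy and
	// per-datacenter replication factor).
	var replication map[string]string
	if err := session.Query(
		`SELECT replication FROM system_schema.keyspaces WHERE keyspace_name = ?`,
		"temporal",
	).Scan(&replication); err != nil {
		log.Fatalf("reading keyspace replication failed: %v", err)
	}
	fmt.Println("replication:", replication)

	// Peers known to the node we connected to; compare this list with the
	// nodes `nodetool status` shows as UN (up/normal).
	iter := session.Query(`SELECT peer, data_center FROM system.peers`).Iter()
	var peer, dc string
	for iter.Scan(&peer, &dc) {
		fmt.Printf("peer %s in datacenter %s\n", peer, dc)
	}
	if err := iter.Close(); err != nil {
		log.Fatalf("reading system.peers failed: %v", err)
	}
}
```

If the replication factor requires more replicas than are currently up in the local datacenter, both LOCAL_QUORUM failures and "received only 0 responses" timeouts are expected until the nodes recover.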