Hello! I’m seeing these errors in the logs of our self-hosted Temporal server every few hours; it’s always the same sequence of three in a row, occurring across different namespaces each time. We’re running a very basic installation: a single server on v1.22.4 with 4 workers on the Python SDK v1.2.0.
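For context, the workers are essentially the stock Python SDK setup; here is a simplified sketch of one worker process (the module, workflow, activity, task queue, and namespace names below are placeholders, not our real ones):

import asyncio

from temporalio.client import Client
from temporalio.worker import Worker

# Placeholder imports standing in for our real workflow/activity definitions.
from my_workflows import MyWorkflow, my_activity


async def main() -> None:
    # Connect to the single self-hosted server on the default frontend port.
    client = await Client.connect("localhost:7233", namespace="my-namespace")

    # One worker process looks like this; we run four of them.
    worker = Worker(
        client,
        task_queue="my-task-queue",
        workflows=[MyWorkflow],
        activities=[my_activity],
    )
    await worker.run()


if __name__ == "__main__":
    asyncio.run(main())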
I’ve seen forum posts suggesting that UpdateTaskQueue errors might be related to visibility archival, but we have it disabled. There doesn’t appear to be any actual impact on workflows. Any ideas what these errors indicate, or tips for troubleshooting?
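In case it matters, this is roughly what the archival section of our server config looks like (paraphrased from memory, so treat the exact fields as approximate); both history and visibility archival are disabled cluster-wide and in the namespace defaults:

archival:
  history:
    state: "disabled"
    enableRead: false
  visibility:
    state: "disabled"
    enableRead: false

namespaceDefaults:
  archival:
    history:
      state: "disabled"
    visibility:
      state: "disabled"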
Thanks for any help or insight anyone has to offer!
{
"level": "error",
"ts": "2024-02-22T14:01:45.081Z",
"msg": "transaction rollback error",
"error": "sql: transaction has already been committed or rolled back",
"logging-call-at": "common.go:82",
"stacktrace": "go.temporal.io/server/common/log.(*zapLogger).Error\n\t/home/builder/temporal/common/log/zap_logger.go:156\ngo.temporal.io/server/common/persistence/sql.(*SqlStore).txExecute\n\t/home/builder/temporal/common/persistence/sql/common.go:82\ngo.temporal.io/server/common/persistence/sql.(*sqlTaskManager).UpdateTaskQueue\n\t/home/builder/temporal/common/persistence/sql/task.go:150\ngo.temporal.io/server/common/persistence.(*taskManagerImpl).UpdateTaskQueue\n\t/home/builder/temporal/common/persistence/task_manager.go:122\ngo.temporal.io/server/common/persistence.(*taskRateLimitedPersistenceClient).UpdateTaskQueue\n\t/home/builder/temporal/common/persistence/persistence_rate_limited_clients.go:514\ngo.temporal.io/server/common/persistence.(*taskPersistenceClient).UpdateTaskQueue\n\t/home/builder/temporal/common/persistence/persistence_metric_clients.go:611\ngo.temporal.io/server/common/persistence.(*taskRetryablePersistenceClient).UpdateTaskQueue.func1\n\t/home/builder/temporal/common/persistence/persistence_retryable_clients.go:730\ngo.temporal.io/server/common/backoff.ThrottleRetryContext\n\t/home/builder/temporal/common/backoff/retry.go:145\ngo.temporal.io/server/common/persistence.(*taskRetryablePersistenceClient).UpdateTaskQueue\n\t/home/builder/temporal/common/persistence/persistence_retryable_clients.go:734\ngo.temporal.io/server/service/matching.(*taskQueueDB).UpdateState\n\t/home/builder/temporal/service/matching/db.go:225\ngo.temporal.io/server/service/matching.(*taskReader).persistAckLevel\n\t/home/builder/temporal/service/matching/task_reader.go:313\ngo.temporal.io/server/service/matching.(*taskReader).getTasksPump\n\t/home/builder/temporal/service/matching/task_reader.go:211\ngo.temporal.io/server/internal/goro.(*Group).Go.func1\n\t/home/builder/temporal/internal/goro/group.go:58"
}
{
"level": "error",
"ts": "2024-02-22T14:01:45.081Z",
"msg": "Operation failed with internal error.",
"error": "Failed to lock task queue. Error: context canceled",
"operation": "UpdateTaskQueue",
"logging-call-at": "persistence_metric_clients.go:1281",
"stacktrace": "go.temporal.io/server/common/log.(*zapLogger).Error\n\t/home/builder/temporal/common/log/zap_logger.go:156\ngo.temporal.io/server/common/persistence.updateErrorMetric\n\t/home/builder/temporal/common/persistence/persistence_metric_clients.go:1281\ngo.temporal.io/server/common/persistence.(*metricEmitter).recordRequestMetrics\n\t/home/builder/temporal/common/persistence/persistence_metric_clients.go:1258\ngo.temporal.io/server/common/persistence.(*taskPersistenceClient).UpdateTaskQueue.func1\n\t/home/builder/temporal/common/persistence/persistence_metric_clients.go:609\ngo.temporal.io/server/common/persistence.(*taskPersistenceClient).UpdateTaskQueue\n\t/home/builder/temporal/common/persistence/persistence_metric_clients.go:611\ngo.temporal.io/server/common/persistence.(*taskRetryablePersistenceClient).UpdateTaskQueue.func1\n\t/home/builder/temporal/common/persistence/persistence_retryable_clients.go:730\ngo.temporal.io/server/common/backoff.ThrottleRetryContext\n\t/home/builder/temporal/common/backoff/retry.go:145\ngo.temporal.io/server/common/persistence.(*taskRetryablePersistenceClient).UpdateTaskQueue\n\t/home/builder/temporal/common/persistence/persistence_retryable_clients.go:734\ngo.temporal.io/server/service/matching.(*taskQueueDB).UpdateState\n\t/home/builder/temporal/service/matching/db.go:225\ngo.temporal.io/server/service/matching.(*taskReader).persistAckLevel\n\t/home/builder/temporal/service/matching/task_reader.go:313\ngo.temporal.io/server/service/matching.(*taskReader).getTasksPump\n\t/home/builder/temporal/service/matching/task_reader.go:211\ngo.temporal.io/server/internal/goro.(*Group).Go.func1\n\t/home/builder/temporal/internal/goro/group.go:58"
}
{
"level": "error",
"ts": "2024-02-22T14:01:45.081Z",
"msg": "Persistent store operation failure",
"service": "matching",
"component": "matching-engine",
"wf-task-queue-name": "/_sys/[REDACTED]/1",
"wf-task-queue-type": "Activity",
"wf-namespace": "[REDACTED]",
"store-operation": "update-task-queue",
"error": "Failed to lock task queue. Error: context canceled",
"logging-call-at": "task_reader.go:214",
"stacktrace": "go.temporal.io/server/common/log.(*zapLogger).Error\n\t/home/builder/temporal/common/log/zap_logger.go:156\ngo.temporal.io/server/service/matching.(*taskReader).getTasksPump\n\t/home/builder/temporal/service/matching/task_reader.go:214\ngo.temporal.io/server/internal/goro.(*Group).Go.func1\n\t/home/builder/temporal/internal/goro/group.go:58"
}