Deploy with remote SQL persistence

I attempted to deploy Temporal using this Docker Compose template, utilizing a remote PostgreSQL/MySQL database. The auto setup script executed successfully; however, the history service continuously emitted errors, likely related to connection issues. I’ve tried with GCP Cloud SQL, AWS RDS, and self-hosted PostgreSQL, but encountered the same problem in each case.

History logs:

{"level":"error","ts":"2023-09-21T08:15:00.346Z","msg":"Operation failed with internal error.","error":"GetOrCreateShard: failed to insert into shards table. Error: context deadline exceeded","operation":"GetOrCreateShard","logging-call-at":"persistence_metric_clients.go:1281","stacktrace":"go.temporal.io/server/common/log.(*zapLogger).Error\n\t/home/builder/temporal/common/log/zap_logger.go:156\ngo.temporal.io/server/common/persistence.updateErrorMetric\n\t/home/builder/temporal/common/persistence/persistence_metric_clients.go:1281\ngo.temporal.io/server/common/persistence.(*metricEmitter).recordRequestMetrics\n\t/home/builder/temporal/common/persistence/persistence_metric_clients.go:1258\ngo.temporal.io/server/common/persistence.(*shardPersistenceClient).GetOrCreateShard.func1\n\t/home/builder/temporal/common/persistence/persistence_metric_clients.go:176\ngo.temporal.io/server/common/persistence.(*shardPersistenceClient).GetOrCreateShard\n\t/home/builder/temporal/common/persistence/persistence_metric_clients.go:178\ngo.temporal.io/server/common/persistence.(*shardRetryablePersistenceClient).GetOrCreateShard.func1\n\t/home/builder/temporal/common/persistence/persistence_retryable_clients.go:169\ngo.temporal.io/server/common/backoff.ThrottleRetryContext\n\t/home/builder/temporal/common/backoff/retry.go:145\ngo.temporal.io/server/common/persistence.(*shardRetryablePersistenceClient).GetOrCreateShard\n\t/home/builder/temporal/common/persistence/persistence_retryable_clients.go:173\ngo.temporal.io/server/service/history/shard.(*ContextImpl).loadShardMetadata\n\t/home/builder/temporal/service/history/shard/context_impl.go:1661\ngo.temporal.io/server/service/history/shard.(*ContextImpl).acquireShard.func1\n\t/home/builder/temporal/service/history/shard/context_impl.go:1821\ngo.temporal.io/server/common/backoff.ThrottleRetry.func1\n\t/home/builder/temporal/common/backoff/retry.go:119\ngo.temporal.io/server/common/backoff.ThrottleRetryContext\n\t/home/builder/temporal/comm
on/backoff/retry.go:145\ngo.temporal.io/server/common/backoff.ThrottleRetry\n\t/home/builder/temporal/common/backoff/retry.go:120\ngo.temporal.io/server/service/history/shard.(*ContextImpl).acquireShard\n\t/home/builder/temporal/service/history/shard/context_impl.go:1889"}
{"level":"error","ts":"2023-09-21T08:15:00.346Z","msg":"Failed to load shard","shard-id":131,"address":"172.25.0.7:7234","error":"GetOrCreateShard: failed to insert into shards table. Error: context deadline exceeded","logging-call-at":"context_impl.go:1666","stacktrace":"go.temporal.io/server/common/log.(*zapLogger).Error\n\t/home/builder/temporal/common/log/zap_logger.go:156\ngo.temporal.io/server/service/history/shard.(*ContextImpl).loadShardMetadata\n\t/home/builder/temporal/service/history/shard/context_impl.go:1666\ngo.temporal.io/server/service/history/shard.(*ContextImpl).acquireShard.func1\n\t/home/builder/temporal/service/history/shard/context_impl.go:1821\ngo.temporal.io/server/common/backoff.ThrottleRetry.func1\n\t/home/builder/temporal/common/backoff/retry.go:119\ngo.temporal.io/server/common/backoff.ThrottleRetryContext\n\t/home/builder/temporal/common/backoff/retry.go:145\ngo.temporal.io/server/common/backoff.ThrottleRetry\n\t/home/builder/temporal/common/backoff/retry.go:120\ngo.temporal.io/server/service/history/shard.(*ContextImpl).acquireShard\n\t/home/builder/temporal/service/history/shard/context_impl.go:1889"}
{"level":"error","ts":"2023-09-21T08:15:00.346Z","msg":"Error acquiring shard","shard-id":131,"address":"172.25.0.7:7234","error":"GetOrCreateShard: failed to insert into shards table. Error: context deadline exceeded","is-retryable":true,"logging-call-at":"context_impl.go:1874","stacktrace":"go.temporal.io/server/common/log.(*zapLogger).Error\n\t/home/builder/temporal/common/log/zap_logger.go:156\ngo.temporal.io/server/service/history/shard.(*ContextImpl).acquireShard.func2.1\n\t/home/builder/temporal/service/history/shard/context_impl.go:1874\ngo.temporal.io/server/service/history/shard.(*ContextImpl).acquireShard.func2\n\t/home/builder/temporal/service/history/shard/context_impl.go:1887\ngo.temporal.io/server/common/backoff.ThrottleRetryContext\n\t/home/builder/temporal/common/backoff/retry.go:153\ngo.temporal.io/server/common/backoff.ThrottleRetry\n\t/home/builder/temporal/common/backoff/retry.go:120\ngo.temporal.io/server/service/history/shard.(*ContextImpl).acquireShard\n\t/home/builder/temporal/service/history/shard/context_impl.go:1889"}
{"level":"error","ts":"2023-09-21T08:15:00.939Z","msg":"Unable to process new range","shard-id":153,"address":"172.25.0.7:7234","component":"timer-queue-processor","error":"shard status unknown","logging-call-at":"queue_base.go:316","stacktrace":"go.temporal.io/server/common/log.(*zapLogger).Error\n\t/home/builder/temporal/common/log/zap_logger.go:156\ngo.temporal.io/server/service/history/queues.(*queueBase).processNewRange\n\t/home/builder/temporal/service/history/queues/queue_base.go:316\ngo.temporal.io/server/service/history/queues.(*scheduledQueue).processEventLoop\n\t/home/builder/temporal/service/history/queues/queue_scheduled.go:218"}
{"level":"error","ts":"2023-09-21T08:15:01.004Z","msg":"Operation failed with internal error.","error":"UpdateShard failed. Failed to start transaction. Error: context deadline exceeded","operation":"UpdateShard","logging-call-at":"persistence_metric_clients.go:1281","stacktrace":"go.temporal.io/server/common/log.(*zapLogger).Error\n\t/home/builder/temporal/common/log/zap_logger.go:156\ngo.temporal.io/server/common/persistence.updateErrorMetric\n\t/home/builder/temporal/common/persistence/persistence_metric_clients.go:1281\ngo.temporal.io/server/common/persistence.(*metricEmitter).recordRequestMetrics\n\t/home/builder/temporal/common/persistence/persistence_metric_clients.go:1258\ngo.temporal.io/server/common/persistence.(*shardPersistenceClient).UpdateShard.func1\n\t/home/builder/temporal/common/persistence/persistence_metric_clients.go:189\ngo.temporal.io/server/common/persistence.(*shardPersistenceClient).UpdateShard\n\t/home/builder/temporal/common/persistence/persistence_metric_clients.go:191\ngo.temporal.io/server/common/persistence.(*shardRetryablePersistenceClient).UpdateShard.func1\n\t/home/builder/temporal/common/persistence/persistence_retryable_clients.go:182\ngo.temporal.io/server/common/backoff.ThrottleRetryContext\n\t/home/builder/temporal/common/backoff/retry.go:145\ngo.temporal.io/server/common/persistence.(*shardRetryablePersistenceClient).UpdateShard\n\t/home/builder/temporal/common/persistence/persistence_retryable_clients.go:185\ngo.temporal.io/server/service/history/shard.(*ContextImpl).renewRangeLocked\n\t/home/builder/temporal/service/history/shard/context_impl.go:1131\ngo.temporal.io/server/service/history/shard.(*ContextImpl).acquireShard.func1\n\t/home/builder/temporal/service/history/shard/context_impl.go:1833\ngo.temporal.io/server/common/backoff.ThrottleRetry.func1\n\t/home/builder/temporal/common/backoff/retry.go:119\ngo.temporal.io/server/common/backoff.ThrottleRetryContext\n\t/home/builder/temporal/common/backoff/retry.go:145\ngo.tem
poral.io/server/common/backoff.ThrottleRetry\n\t/home/builder/temporal/common/backoff/retry.go:120\ngo.temporal.io/server/service/history/shard.(*ContextImpl).acquireShard\n\t/home/builder/temporal/service/history/shard/context_impl.go:1889"}
{"level":"error","ts":"2023-09-21T08:15:01.004Z","msg":"Persistent store operation failure","shard-id":127,"address":"172.25.0.7:7234","store-operation":"update-shard","error":"UpdateShard failed. Failed to start transaction. Error: context deadline exceeded","shard-range-id":1,"previous-shard-range-id":0,"logging-call-at":"context_impl.go:1137","stacktrace":"go.temporal.io/server/common/log.(*zapLogger).Error\n\t/home/builder/temporal/common/log/zap_logger.go:156\ngo.temporal.io/server/service/history/shard.(*ContextImpl).renewRangeLocked\n\t/home/builder/temporal/service/history/shard/context_impl.go:1137\ngo.temporal.io/server/service/history/shard.(*ContextImpl).acquireShard.func1\n\t/home/builder/temporal/service/history/shard/context_impl.go:1833\ngo.temporal.io/server/common/backoff.ThrottleRetry.func1\n\t/home/builder/temporal/common/backoff/retry.go:119\ngo.temporal.io/server/common/backoff.ThrottleRetryContext\n\t/home/builder/temporal/common/backoff/retry.go:145\ngo.temporal.io/server/common/backoff.ThrottleRetry\n\t/home/builder/temporal/common/backoff/retry.go:120\ngo.temporal.io/server/service/history/shard.(*ContextImpl).acquireShard\n\t/home/builder/temporal/service/history/shard/context_impl.go:1889"}
{"level":"error","ts":"2023-09-21T08:15:01.004Z","msg":"Error acquiring shard","shard-id":127,"address":"172.25.0.7:7234","error":"UpdateShard failed. Failed to start transaction. Error: context deadline exceeded","is-retryable":true,"logging-call-at":"context_impl.go:1874","stacktrace":"go.temporal.io/server/common/log.(*zapLogger).Error\n\t/home/builder/temporal/common/log/zap_logger.go:156\ngo.temporal.io/server/service/history/shard.(*ContextImpl).acquireShard.func2.1\n\t/home/builder/temporal/service/history/shard/context_impl.go:1874\ngo.temporal.io/server/service/history/shard.(*ContextImpl).acquireShard.func2\n\t/home/builder/temporal/service/history/shard/context_impl.go:1887\ngo.temporal.io/server/common/backoff.ThrottleRetryContext\n\t/home/builder/temporal/common/backoff/retry.go:153\ngo.temporal.io/server/common/backoff.ThrottleRetry\n\t/home/builder/temporal/common/backoff/retry.go:120\ngo.temporal.io/server/service/history/shard.(*ContextImpl).acquireShard\n\t/home/builder/temporal/service/history/shard/context_impl.go:1889"}
{"level":"error","ts":"2023-09-21T08:15:01.348Z","msg":"Operation failed with internal error.","error":"GetOrCreateShard: failed to get ShardID 146. Error: context deadline exceeded","operation":"GetOrCreateShard","logging-call-at":"persistence_metric_clients.go:1281","stacktrace":"go.temporal.io/server/common/log.(*zapLogger).Error\n\t/home/builder/temporal/common/log/zap_logger.go:156\ngo.temporal.io/server/common/persistence.updateErrorMetric\n\t/home/builder/temporal/common/persistence/persistence_metric_clients.go:1281\ngo.temporal.io/server/common/persistence.(*metricEmitter).recordRequestMetrics\n\t/home/builder/temporal/common/persistence/persistence_metric_clients.go:1258\ngo.temporal.io/server/common/persistence.(*shardPersistenceClient).GetOrCreateShard.func1\n\t/home/builder/temporal/common/persistence/persistence_metric_clients.go:176\ngo.temporal.io/server/common/persistence.(*shardPersistenceClient).GetOrCreateShard\n\t/home/builder/temporal/common/persistence/persistence_metric_clients.go:178\ngo.temporal.io/server/common/persistence.(*shardRetryablePersistenceClient).GetOrCreateShard.func1\n\t/home/builder/temporal/common/persistence/persistence_retryable_clients.go:169\ngo.temporal.io/server/common/backoff.ThrottleRetryContext\n\t/home/builder/temporal/common/backoff/retry.go:145\ngo.temporal.io/server/common/persistence.(*shardRetryablePersistenceClient).GetOrCreateShard\n\t/home/builder/temporal/common/persistence/persistence_retryable_clients.go:173\ngo.temporal.io/server/service/history/shard.(*ContextImpl).loadShardMetadata\n\t/home/builder/temporal/service/history/shard/context_impl.go:1661\ngo.temporal.io/server/service/history/shard.(*ContextImpl).acquireShard.func1\n\t/home/builder/temporal/service/history/shard/context_impl.go:1821\ngo.temporal.io/server/common/backoff.ThrottleRetry.func1\n\t/home/builder/temporal/common/backoff/retry.go:119\ngo.temporal.io/server/common/backoff.ThrottleRetryContext\n\t/home/builder/temporal/common/backof
f/retry.go:145\ngo.temporal.io/server/common/backoff.ThrottleRetry\n\t/home/builder/temporal/common/backoff/retry.go:120\ngo.temporal.io/server/service/history/shard.(*ContextImpl).acquireShard\n\t/home/builder/temporal/service/history/shard/context_impl.go:1889"}
{"level":"error","ts":"2023-09-21T08:15:01.348Z","msg":"Failed to load shard","shard-id":146,"address":"172.25.0.7:7234","error":"GetOrCreateShard: failed to get ShardID 146. Error: context deadline exceeded","logging-call-at":"context_impl.go:1666","stacktrace":"go.temporal.io/server/common/log.(*zapLogger).Error\n\t/home/builder/temporal/common/log/zap_logger.go:156\ngo.temporal.io/server/service/history/shard.(*ContextImpl).loadShardMetadata\n\t/home/builder/temporal/service/history/shard/context_impl.go:1666\ngo.temporal.io/server/service/history/shard.(*ContextImpl).acquireShard.func1\n\t/home/builder/temporal/service/history/shard/context_impl.go:1821\ngo.temporal.io/server/common/backoff.ThrottleRetry.func1\n\t/home/builder/temporal/common/backoff/retry.go:119\ngo.temporal.io/server/common/backoff.ThrottleRetryContext\n\t/home/builder/temporal/common/backoff/retry.go:145\ngo.temporal.io/server/common/backoff.ThrottleRetry\n\t/home/builder/temporal/common/backoff/retry.go:120\ngo.temporal.io/server/service/history/shard.(*ContextImpl).acquireShard\n\t/home/builder/temporal/service/history/shard/context_impl.go:1889"}
{"level":"error","ts":"2023-09-21T08:15:01.348Z","msg":"Operation failed with internal error.","error":"GetOrCreateShard: failed to insert into shards table. Error: context deadline exceeded","operation":"GetOrCreateShard","logging-call-at":"persistence_metric_clients.go:1281","stacktrace":"go.temporal.io/server/common/log.(*zapLogger).Error\n\t/home/builder/temporal/common/log/zap_logger.go:156\ngo.temporal.io/server/common/persistence.updateErrorMetric\n\t/home/builder/temporal/common/persistence/persistence_metric_clients.go:1281\ngo.temporal.io/server/common/persistence.(*metricEmitter).recordRequestMetrics\n\t/home/builder/temporal/common/persistence/persistence_metric_clients.go:1258\ngo.temporal.io/server/common/persistence.(*shardPersistenceClient).GetOrCreateShard.func1\n\t/home/builder/temporal/common/persistence/persistence_metric_clients.go:176\ngo.temporal.io/server/common/persistence.(*shardPersistenceClient).GetOrCreateShard\n\t/home/builder/temporal/common/persistence/persistence_metric_clients.go:178\ngo.temporal.io/server/common/persistence.(*shardRetryablePersistenceClient).GetOrCreateShard.func1\n\t/home/builder/temporal/common/persistence/persistence_retryable_clients.go:169\ngo.temporal.io/server/common/backoff.ThrottleRetryContext\n\t/home/builder/temporal/common/backoff/retry.go:145\ngo.temporal.io/server/common/persistence.(*shardRetryablePersistenceClient).GetOrCreateShard\n\t/home/builder/temporal/common/persistence/persistence_retryable_clients.go:173\ngo.temporal.io/server/service/history/shard.(*ContextImpl).loadShardMetadata\n\t/home/builder/temporal/service/history/shard/context_impl.go:1661\ngo.temporal.io/server/service/history/shard.(*ContextImpl).acquireShard.func1\n\t/home/builder/temporal/service/history/shard/context_impl.go:1821\ngo.temporal.io/server/common/backoff.ThrottleRetry.func1\n\t/home/builder/temporal/common/backoff/retry.go:119\ngo.temporal.io/server/common/backoff.ThrottleRetryContext\n\t/home/builder/temporal/comm
on/backoff/retry.go:145\ngo.temporal.io/server/common/backoff.ThrottleRetry\n\t/home/builder/temporal/common/backoff/retry.go:120\ngo.temporal.io/server/service/history/shard.(*ContextImpl).acquireShard\n\t/home/builder/temporal/service/history/shard/context_impl.go:1889"}
{"level":"error","ts":"2023-09-21T08:15:01.348Z","msg":"Error acquiring shard","shard-id":146,"address":"172.25.0.7:7234","error":"GetOrCreateShard: failed to get ShardID 146. Error: context deadline exceeded","is-retryable":true,"logging-call-at":"context_impl.go:1874","stacktrace":"go.temporal.io/server/common/log.(*zapLogger).Error\n\t/home/builder/temporal/common/log/zap_logger.go:156\ngo.temporal.io/server/service/history/shard.(*ContextImpl).acquireShard.func2.1\n\t/home/builder/temporal/service/history/shard/context_impl.go:1874\ngo.temporal.io/server/service/history/shard.(*ContextImpl).acquireShard.func2\n\t/home/builder/temporal/service/history/shard/context_impl.go:1887\ngo.temporal.io/server/common/backoff.ThrottleRetryContext\n\t/home/builder/temporal/common/backoff/retry.go:153\ngo.temporal.io/server/common/backoff.ThrottleRetry\n\t/home/builder/temporal/common/backoff/retry.go:120\ngo.temporal.io/server/service/history/shard.(*ContextImpl).acquireShard\n\t/home/builder/temporal/service/history/shard/context_impl.go:1889"}
{"level":"error","ts":"2023-09-21T08:15:01.348Z","msg":"Operation failed with internal error.","error":"GetOrCreateShard: failed to insert into shards table. Error: context deadline exceeded","operation":"GetOrCreateShard","logging-call-at":"persistence_metric_clients.go:1281","stacktrace":"go.temporal.io/server/common/log.(*zapLogger).Error\n\t/home/builder/temporal/common/log/zap_logger.go:156\ngo.temporal.io/server/common/persistence.updateErrorMetric\n\t/home/builder/temporal/common/persistence/persistence_metric_clients.go:1281\ngo.temporal.io/server/common/persistence.(*metricEmitter).recordRequestMetrics\n\t/home/builder/temporal/common/persistence/persistence_metric_clients.go:1258\ngo.temporal.io/server/common/persistence.(*shardPersistenceClient).GetOrCreateShard.func1\n\t/home/builder/temporal/common/persistence/persistence_metric_clients.go:176\ngo.temporal.io/server/common/persistence.(*shardPersistenceClient).GetOrCreateShard\n\t/home/builder/temporal/common/persistence/persistence_metric_clients.go:178\ngo.temporal.io/server/common/persistence.(*shardRetryablePersistenceClient).GetOrCreateShard.func1\n\t/home/builder/temporal/common/persistence/persistence_retryable_clients.go:169\ngo.temporal.io/server/common/backoff.ThrottleRetryContext\n\t/home/builder/temporal/common/backoff/retry.go:145\ngo.temporal.io/server/common/persistence.(*shardRetryablePersistenceClient).GetOrCreateShard\n\t/home/builder/temporal/common/persistence/persistence_retryable_clients.go:173\ngo.temporal.io/server/service/history/shard.(*ContextImpl).loadShardMetadata\n\t/home/builder/temporal/service/history/shard/context_impl.go:1661\ngo.temporal.io/server/service/history/shard.(*ContextImpl).acquireShard.func1\n\t/home/builder/temporal/service/history/shard/context_impl.go:1821\ngo.temporal.io/server/common/backoff.ThrottleRetry.func1\n\t/home/builder/temporal/common/backoff/retry.go:119\ngo.temporal.io/server/common/backoff.ThrottleRetryContext\n\t/home/builder/temporal/comm
on/backoff/retry.go:145\ngo.temporal.io/server/common/backoff.ThrottleRetry\n\t/home/builder/temporal/common/backoff/retry.go:120\ngo.temporal.io/server/service/history/shard.(*ContextImpl).acquireShard\n\t/home/builder/temporal/service/history/shard/context_impl.go:1889"}
{"level":"error","ts":"2023-09-21T08:15:01.348Z","msg":"Failed to load shard","shard-id":144,"address":"172.25.0.7:7234","error":"GetOrCreateShard: failed to insert into shards table. Error: context deadline exceeded","logging-call-at":"context_impl.go:1666","stacktrace":"go.temporal.io/server/common/log.(*zapLogger).Error\n\t/home/builder/temporal/common/log/zap_logger.go:156\ngo.temporal.io/server/service/history/shard.(*ContextImpl).loadShardMetadata\n\t/home/builder/temporal/service/history/shard/context_impl.go:1666\ngo.temporal.io/server/service/history/shard.(*ContextImpl).acquireShard.func1\n\t/home/builder/temporal/service/history/shard/context_impl.go:1821\ngo.temporal.io/server/common/backoff.ThrottleRetry.func1\n\t/home/builder/temporal/common/backoff/retry.go:119\ngo.temporal.io/server/common/backoff.ThrottleRetryContext\n\t/home/builder/temporal/common/backoff/retry.go:145\ngo.temporal.io/server/common/backoff.ThrottleRetry\n\t/home/builder/temporal/common/backoff/retry.go:120\ngo.temporal.io/server/service/history/shard.(*ContextImpl).acquireShard\n\t/home/builder/temporal/service/history/shard/context_impl.go:1889"}
{"level":"error","ts":"2023-09-21T08:15:01.348Z","msg":"Error acquiring shard","shard-id":142,"address":"172.25.0.7:7234","error":"GetOrCreateShard: failed to insert into shards table. Error: context deadline exceeded","is-retryable":true,"logging-call-at":"context_impl.go:1874","stacktrace":"go.temporal.io/server/common/log.(*zapLogger).Error\n\t/home/builder/temporal/common/log/zap_logger.go:156\ngo.temporal.io/server/service/history/shard.(*ContextImpl).acquireShard.func2.1\n\t/home/builder/temporal/service/history/shard/context_impl.go:1874\ngo.temporal.io/server/service/history/shard.(*ContextImpl).acquireShard.func2\n\t/home/builder/temporal/service/history/shard/context_impl.go:1887\ngo.temporal.io/server/common/backoff.ThrottleRetryContext\n\t/home/builder/temporal/common/backoff/retry.go:153\ngo.temporal.io/server/common/backoff.ThrottleRetry\n\t/home/builder/temporal/common/backoff/retry.go:120\ngo.temporal.io/server/service/history/shard.(*ContextImpl).acquireShard\n\t/home/builder/temporal/service/history/shard/context_impl.go:1889"}
{"level":"error","ts":"2023-09-21T08:15:01.349Z","msg":"Error acquiring shard","shard-id":148,"address":"172.25.0.7:7234","error":"GetOrCreateShard: failed to insert into shards table. Error: context deadline exceeded","is-retryable":true,"logging-call-at":"context_impl.go:1874","stacktrace":"go.temporal.io/server/common/log.(*zapLogger).Error\n\t/home/builder/temporal/common/log/zap_logger.go:156\ngo.temporal.io/server/service/history/shard.(*ContextImpl).acquireShard.func2.1\n\t/home/builder/temporal/service/history/shard/context_impl.go:1874\ngo.temporal.io/server/service/history/shard.(*ContextImpl).acquireShard.func2\n\t/home/builder/temporal/service/history/shard/context_impl.go:1887\ngo.temporal.io/server/common/backoff.ThrottleRetryContext\n\t/home/builder/temporal/common/backoff/retry.go:153\ngo.temporal.io/server/common/backoff.ThrottleRetry\n\t/home/builder/temporal/common/backoff/retry.go:120\ngo.temporal.io/server/service/history/shard.(*ContextImpl).acquireShard\n\t/home/builder/temporal/service/history/shard/context_impl.go:1889"}
{"level":"error","ts":"2023-09-21T08:15:01.513Z","msg":"Persistent store operation failure","shard-id":132,"address":"172.25.0.7:7234","store-operation":"update-shard","error":"UpdateShard operation failed. Failed to commit transaction. Error: sql: transaction has already been committed or rolled back","shard-range-id":1,"previous-shard-range-id":0,"logging-call-at":"context_impl.go:1137","stacktrace":"go.temporal.io/server/common/log.(*zapLogger).Error\n\t/home/builder/temporal/common/log/zap_logger.go:156\ngo.temporal.io/server/service/history/shard.(*ContextImpl).renewRangeLocked\n\t/home/builder/temporal/service/history/shard/context_impl.go:1137\ngo.temporal.io/server/service/history/shard.(*ContextImpl).acquireShard.func1\n\t/home/builder/temporal/service/history/shard/context_impl.go:1833\ngo.temporal.io/server/common/backoff.ThrottleRetry.func1\n\t/home/builder/temporal/common/backoff/retry.go:119\ngo.temporal.io/server/common/backoff.ThrottleRetryContext\n\t/home/builder/temporal/common/backoff/retry.go:145\ngo.temporal.io/server/common/backoff.ThrottleRetry\n\t/home/builder/temporal/common/backoff/retry.go:120\ngo.temporal.io/server/service/history/shard.(*ContextImpl).acquireShard\n\t/home/builder/temporal/service/history/shard/context_impl.go:1889"}

Worker logs:

{"level":"warn","ts":"2023-09-21T08:16:15.095Z","msg":"Failed to poll for task.","service":"worker","Namespace":"temporal-system","TaskQueue":"temporal-sys-per-ns-tq","WorkerID":"server-worker@1@fa0d092d9e63@temporal-system","WorkerType":"ActivityWorker","Error":"Not enough hosts to serve the request","logging-call-at":"internal_worker_base.go:308"}
{"level":"error","ts":"2023-09-21T08:16:17.143Z","msg":"error starting temporal-sys-history-scanner-workflow workflow","service":"worker","error":"context deadline exceeded","logging-call-at":"scanner.go:289","stacktrace":"go.temporal.io/server/common/log.(*zapLogger).Error\n\t/home/builder/temporal/common/log/zap_logger.go:156\ngo.temporal.io/server/service/worker/scanner.(*Scanner).startWorkflow\n\t/home/builder/temporal/service/worker/scanner/scanner.go:289\ngo.temporal.io/server/service/worker/scanner.(*Scanner).startWorkflowWithRetry.func1\n\t/home/builder/temporal/service/worker/scanner/scanner.go:259\ngo.temporal.io/server/common/backoff.ThrottleRetryContext\n\t/home/builder/temporal/common/backoff/retry.go:145\ngo.temporal.io/server/service/worker/scanner.(*Scanner).startWorkflowWithRetry\n\t/home/builder/temporal/service/worker/scanner/scanner.go:258"}
{"level":"error","ts":"2023-09-21T08:16:17.143Z","msg":"error starting temporal-sys-tq-scanner-workflow workflow","service":"worker","error":"context deadline exceeded","logging-call-at":"scanner.go:289","stacktrace":"go.temporal.io/server/common/log.(*zapLogger).Error\n\t/home/builder/temporal/common/log/zap_logger.go:156\ngo.temporal.io/server/service/worker/scanner.(*Scanner).startWorkflow\n\t/home/builder/temporal/service/worker/scanner/scanner.go:289\ngo.temporal.io/server/service/worker/scanner.(*Scanner).startWorkflowWithRetry.func1\n\t/home/builder/temporal/service/worker/scanner/scanner.go:259\ngo.temporal.io/server/common/backoff.ThrottleRetryContext\n\t/home/builder/temporal/common/backoff/retry.go:145\ngo.temporal.io/server/service/worker/scanner.(*Scanner).startWorkflowWithRetry\n\t/home/builder/temporal/service/worker/scanner/scanner.go:258"}
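With a flood of errors like the ones above, it can help to group the entries by message and error text before reading individual stack traces. Below is a minimal sketch in Python, assuming the service logs were saved with one JSON object per line. The function name `summarize` is made up for illustration, and the sample entries are shortened copies of the log lines above (stack traces omitted).

```python
import json
from collections import Counter


def summarize(lines):
    """Count log entries by (level, msg, error).

    Lines that are not valid JSON objects (for example fragments produced
    by terminal line wrapping) are skipped.
    """
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue
        if not isinstance(entry, dict):
            continue
        # The history logs use "error"; the worker warning uses "Error".
        err = entry.get("error") or entry.get("Error", "")
        counts[(entry.get("level", ""), entry.get("msg", ""), err)] += 1
    return counts


# Shortened sample entries taken from the logs above (assumption: one per line).
sample = [
    '{"level":"error","msg":"Error acquiring shard","shard-id":131,'
    '"error":"GetOrCreateShard: failed to insert into shards table. '
    'Error: context deadline exceeded"}',
    '{"level":"error","msg":"Error acquiring shard","shard-id":142,'
    '"error":"GetOrCreateShard: failed to insert into shards table. '
    'Error: context deadline exceeded"}',
    '{"level":"warn","msg":"Failed to poll for task.",'
    '"Error":"Not enough hosts to serve the request"}',
    "on/backoff/retry.go:145 (wrapped fragment, not valid JSON)",
]

for (level, msg, err), n in summarize(sample).most_common():
    print(n, level, msg, "|", err)
```

Grouping this way makes it easier to see whether every failure reduces to the same underlying error (here, `context deadline exceeded` on shard operations) or whether several distinct problems are mixed together.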

Not enough hosts to serve the request

This means that the frontend seems to be up, but the history and/or matching services did not start (a cluster needs at least one host of each role to be functional). From your logs, I assume the history host is failing or restarting.

The history host errors all point to possible DB issues. I would check your DB CPU, memory, and IOPS.
Can you also share the static config of your history host?
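Alongside the DB-side metrics, one quick client-side check is to measure how long a plain TCP connection to the database takes from the machine or container that runs the history service. The following is a rough sketch using only the Python standard library. The function name `tcp_connect_latency` and the host in the usage comment are hypothetical, and the check covers only the TCP handshake, not TLS negotiation, authentication, or query time.

```python
import socket
import time


def tcp_connect_latency(host, port, timeout=5.0, attempts=5):
    """Return a list of TCP connect times in seconds, one per attempt.

    Raises OSError (including socket.timeout) if a connection attempt
    fails or does not complete within `timeout` seconds.
    """
    samples = []
    for _ in range(attempts):
        start = time.monotonic()
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            samples.append(time.monotonic() - start)
    return samples


# Usage (hypothetical host; 5432 is the default PostgreSQL port):
#   times = tcp_connect_latency("db.example.internal", 5432)
#   print("min/max connect time (s):", min(times), max(times))
```

If the connect times are consistently high or vary widely, network latency between the containers and the remote database would be worth investigating before tuning anything on the Temporal side.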

There are no peaks in the DB metrics. The Postgres logs are flooded with ‘PID in cancel request did not match any process.’
Here is the env of the history service (I just added the TLS var):

- DB=postgres12
- DB_PORT=${POSTGRES_DEFAULT_PORT}
- POSTGRES_USER=${POSTGRES_USER}
- POSTGRES_PWD=${POSTGRES_PWD}
- POSTGRES_SEEDS=${POSTGRES_SEEDS}
- SQL_TLS_ENABLED=true
- SQL_HOST_VERIFICATION=false
- DYNAMIC_CONFIG_FILE_PATH=config/dynamicconfig/development.yaml
- SERVICES=history
- USE_INTERNAL_FRONTEND=true
- LOG_LEVEL=error
# - BIND_ON_IP=0.0.0.0
- PROMETHEUS_ENDPOINT=0.0.0.0:8000
# - TEMPORAL_BROADCAST_ADDRESS=temporal-history
- NUM_HISTORY_SHARDS=2048

@tihomir do you have any advice for me? Thanks!