History and worker service errors

Hi,
I get this error from the worker service:

worker:
{"level":"error","ts":"2022-10-20T00:54:49.976Z","msg":"error starting temporal-sys-tq-scanner-workflow workflow","service":"worker","error":"context deadline exceeded","logging-call-at":"scanner.go:199","stacktrace":"go.temporal.io/server/common/log.(*zapLogger).Error\n\t/temporal/common/log/zap_logger.go:142\ngo.temporal.io/server/service/worker/scanner.(*Scanner).startWorkflow\n\t/temporal/service/worker/scanner/scanner.go:199\ngo.temporal.io/server/service/worker/scanner.(*Scanner).startWorkflowWithRetry.func1\n\t/temporal/service/worker/scanner/scanner.go:176\ngo.temporal.io/server/common/backoff.ThrottleRetry.func1\n\t/temporal/common/backoff/retry.go:168\ngo.temporal.io/server/common/backoff.ThrottleRetryContext\n\t/temporal/common/backoff/retry.go:192\ngo.temporal.io/server/common/backoff.ThrottleRetry\n\t/temporal/common/backoff/retry.go:169\ngo.temporal.io/server/service/worker/scanner.(*Scanner).startWorkflowWithRetry\n\t/temporal/service/worker/scanner/scanner.go:175"}

The history service is printing a large volume of this log:
SELECT EXEC QUERY - SELECT shard_id, namespace_id, workflow_id, run_id, next_event_id, last_write_version, data, data_encoding, state, state_encoding, db_record_version FROM executions
WHERE shard_id = ? AND namespace_id = ? AND workflow_id = ? AND run_id = ? 2010 d950f2ff-eda7-4944-b86e-dd6e825eb11e 33323c76-49a2-4ac5-8233-7b06abb047a0 f08e9ebc-125b-4d21-a4ac-96267a036263
oom 1
oom 2
tom 1
SELECT EXEC QUERY - SELECT shard_id, namespace_id, workflow_id, run_id, next_event_id, last_write_version, data, data_encoding, state, state_encoding, db_record_version FROM executions
WHERE shard_id = ? AND namespace_id = ? AND workflow_id = ? AND run_id = ? 1555 d950f2ff-eda7-4944-b86e-dd6e825eb11e 871a4927-c6ee-48bf-9a83-8013d21e4896 43334518-fa0a-46a1-91f1-0b4955c62c01
oom 1
oom 2
tom 1
SELECT EXEC QUERY - SELECT shard_id, namespace_id, workflow_id, run_id, next_event_id, last_write_version, data, data_encoding, state, state_encoding, db_record_version FROM executions
WHERE shard_id = ? AND namespace_id = ? AND workflow_id = ? AND run_id = ? 1212 d950f2ff-eda7-4944-b86e-dd6e825eb11e cbc96c5a-3f2e-4a12-a541-04ffd881d331 f90fd606-41c0-4a3e-9ec6-d77c6b4f0eb5
oom 1
oom 2
tom 1
SELECT EXEC QUERY - SELECT shard_id, namespace_id, workflow_id, run_id, next_event_id, last_write_version, data, data_encoding, state, state_encoding, db_record_version FROM executions
WHERE shard_id = ? AND namespace_id = ? AND workflow_id = ? AND run_id = ? 1950 d950f2ff-eda7-4944-b86e-dd6e825eb11e f32bfd92-b4dd-47c1-8ef5-736b14a7427c 5238ca51-160f-4382-bf42-a0f91ec814f3
oom 1
oom 2
tom 1
SELECT EXEC QUERY - SELECT shard_id, namespace_id, workflow_id, run_id, next_event_id, last_write_version, data, data_encoding, state, state_encoding, db_record_version FROM executions
WHERE shard_id = ? AND namespace_id = ? AND workflow_id = ? AND run_id = ? 1725 d950f2ff-eda7-4944-b86e-dd6e825eb11e 95c953f8-d9f8-4584-b18c-3e270ff4e5c4 ee2a28cb-be46-43e2-b660-83da413893a4
oom 1
oom 2
tom 1
SELECT EXEC QUERY - SELECT shard_id, namespace_id, workflow_id, run_id, next_event_id, last_write_version, data, data_encoding, state, state_encoding, db_record_version FROM executions
WHERE shard_id = ? AND namespace_id = ? AND workflow_id = ? AND run_id = ? 1388 d950f2ff-eda7-4944-b86e-dd6e825eb11e e76256f3-7637-4dec-a13f-c09b13066b21 9d74fd67-959c-4a7e-a290-9ad1093b7fed
oom 1
oom 2
tom 1
SELECT EXEC QUERY - SELECT shard_id, namespace_id, workflow_id, run_id, next_event_id, last_write_version, data, data_encoding, state, state_encoding, db_record_version FROM executions
WHERE shard_id = ? AND namespace_id = ? AND workflow_id = ? AND run_id = ? 906 d950f2ff-eda7-4944-b86e-dd6e825eb11e c94a5954-2c69-495a-913f-2074343e616f aa46c6d2-a1d2-426f-a436-43903060b89a
oom 1
oom 2
tom 1
SELECT EXEC QUERY - SELECT shard_id, namespace_id, workflow_id, run_id, next_event_id, last_write_version, data, data_encoding, state, state_encoding, db_record_version FROM executions
WHERE shard_id = ? AND namespace_id = ? AND workflow_id = ? AND run_id = ? 896 d950f2ff-eda7-4944-b86e-dd6e825eb11e 9e739791-2f01-4abe-a871-1b36d491cad0 c863fcfe-6a3a-46b6-b0a1-48e2b59a7088
oom 1
oom 2
tom 1
SELECT EXEC QUERY - SELECT shard_id, namespace_id, workflow_id, run_id, next_event_id, last_write_version, data, data_encoding, state, state_encoding, db_record_version FROM executions
WHERE shard_id = ? AND namespace_id = ? AND workflow_id = ? AND run_id = ? 1582 d950f2ff-eda7-4944-b86e-dd6e825eb11e a48e8036-bf31-4bf1-b0f0-4cc51e8dcf18 f63de1c9-03fd-4d7b-8e6d-3e49b903fd17
tom 2
tom 3
tom 5

Is the history log related to the worker error? Could someone give some pointers? Does “oom” mean out of memory?

Is this happening after a server upgrade to a new version, where the same config worked on the previous version, or is this a clean cluster deployment? What server version are you using?

error starting temporal-sys-tq-scanner-workflow

This is typically a worker service config issue: the worker service has to be able to talk to the frontend.
Could you check frontend health with tctl:

tctl cl h

Does it return “Serving”?
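
If tctl is not available on the host running the worker service, a plain TCP reachability check can serve as a rough first test of whether the worker can reach the frontend at all. The sketch below is only an illustration: the helper name `can_reach` and the host value are hypothetical, and port 7233 is assumed because it is Temporal's default frontend gRPC port. Replace both with the frontend address your worker is actually configured to use. A successful connection only shows that the port is open; it does not confirm that the frontend is healthy.

```python
import socket


def can_reach(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, unreachable host, DNS failure, and timeout.
        return False


if __name__ == "__main__":
    # Assumed values: use the frontend address from your worker's config.
    frontend_host = "127.0.0.1"  # hypothetical host
    frontend_port = 7233  # Temporal's default frontend gRPC port
    ok = can_reach(frontend_host, frontend_port)
    print(f"{frontend_host}:{frontend_port} reachable: {ok}")
```

Run it from the same machine or container as the worker service so the result reflects the worker's network path rather than your workstation's.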

Is the history log related to the worker error? Could someone give some pointers? Does “oom” mean out of memory?

I haven’t seen this happen before. Could the logging level be set to debug?