Worker service CrashLoopBackOff issue

Newbie_Sr · October 25, 2022, 5:49am

Hi,
Our worker service keeps crashing with an OOM issue (it seems to happen periodically). We have set the requested memory to 100G. I didn't see anything wrong in the logs. Could someone give some pointers for debugging this?

temporal-worker-8c984ccf8-74zpn 0/1 CrashLoopBackOff 16 83m
temporal-worker-8c984ccf8-7k2hj 0/1 CrashLoopBackOff 16 86m
temporal-worker-8c984ccf8-8v5cg 0/1 CrashLoopBackOff 16 86m
temporal-worker-8c984ccf8-bxndp 1/1 Running 17 86m
temporal-worker-8c984ccf8-dxhzs 0/1 CrashLoopBackOff 16 86m
temporal-worker-8c984ccf8-mg7j7 0/1 CrashLoopBackOff 16 86m
temporal-worker-8c984ccf8-smq8x 0/1 CrashLoopBackOff 16 83m
temporal-worker-8c984ccf8-ttxd4 0/1 CrashLoopBackOff 16 83m
temporal-worker-8c984ccf8-vp62x 0/1 CrashLoopBackOff 16 83m
temporal-worker-8c984ccf8-wrg2m 0/1 CrashLoopBackOff 16 86m

Ports:          7239/TCP, 9102/TCP
Host Ports:     0/TCP, 0/TCP
State:          Waiting
  Reason:       CrashLoopBackOff
Last State:     Terminated
  Reason:       Error
  Exit Code:    137
  Started:      Tue, 25 Oct 2022 13:48:42 +0800
  Finished:     Tue, 25 Oct 2022 13:52:11 +0800
Ready:          False
Restart Count:  16
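
A note on the `Exit Code: 137` above: by the usual shell convention, an exit code above 128 means the process was terminated by signal number (code - 128). The sketch below just decodes that; the function name `describe_exit_code` is made up for illustration and the signal names assume a Linux host.

```python
import signal


def describe_exit_code(code: int) -> str:
    """Explain a container exit code using the 128 + signal-number convention."""
    if code > 128:
        signum = code - 128
        try:
            # Look up the symbolic name of the signal on this platform.
            name = signal.Signals(signum).name
        except ValueError:
            name = f"signal {signum}"
        return f"killed by {name}"
    return f"exited on its own with status {code}"


print(describe_exit_code(137))
```

137 - 128 = 9, which is SIGKILL on Linux. That is consistent with an OOM kill but does not prove one: Kubernetes typically reports `Reason: OOMKilled` in that case, while the output above shows `Reason: Error`, so it may be worth checking whether something else (for example a failing liveness probe) could be sending the SIGKILL.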

The log at the time of the crash:

{"level":"info","ts":"2022-10-25T05:47:27.556Z","msg":"Current reachable members","service":"worker","component":"service-resolver","service":"worker","addresses":["10.32.179.59:7239","10.32.59.111:7239","10.32.54.30:7239","10.32.40.204:7239","10.32.190.16:7239","10.32.22.220:7239","10.32.144.80:7239","10.32.39.14:7239","10.32.187.129:7239","10.32.41.208:7239","10.32.151.67:7239","10.32.32.86:7239","10.32.48.114:7239","10.32.146.140:7239","10.32.40.41:7239","10.32.3.133:7239"],"logging-call-at":"rpServiceResolver.go:266"}
{"level":"info","ts":"2022-10-25T05:47:32.777Z","msg":"Current reachable members","service":"worker","component":"service-resolver","service":"worker","addresses":["10.32.32.86:7239","10.32.190.16:7239","10.32.151.67:7239","10.32.59.111:7239","10.32.3.133:7239","10.32.54.30:7239","10.32.179.59:7239","10.32.40.41:7239","10.32.39.14:7239","10.32.40.204:7239","10.32.41.208:7239","10.32.48.114:7239","10.32.187.129:7239","10.32.144.80:7239","10.32.22.220:7239"],"logging-call-at":"rpServiceResolver.go:266"}
{"level":"info","ts":"2022-10-25T05:47:33.661Z","msg":"Current reachable members","service":"worker","component":"service-resolver","service":"worker","addresses":["10.32.32.86:7239","10.32.190.16:7239","10.32.151.67:7239","10.32.59.111:7239","10.32.3.133:7239","10.32.54.30:7239","10.32.40.41:7239","10.32.39.14:7239","10.32.40.204:7239","10.32.41.208:7239","10.32.48.114:7239","10.32.187.129:7239","10.32.144.80:7239","10.32.22.220:7239"],"logging-call-at":"rpServiceResolver.go:266"}
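
The three entries above are long enough that it is hard to see what changes between them. A minimal sketch for diffing consecutive "Current reachable members" snapshots follows; the helper names `reachable_members` and `membership_changes` are hypothetical, and it assumes one JSON object per log line with the `ts`, `msg` and `addresses` fields shown above.

```python
import json


def reachable_members(log_lines):
    """Yield (timestamp, set of addresses) for each 'Current reachable members' entry."""
    for line in log_lines:
        # Pasted logs may contain curly quotes from forum rendering; undo that before parsing.
        line = line.replace("\u201c", '"').replace("\u201d", '"').strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip anything that is not a JSON log line
        if isinstance(entry, dict) and entry.get("msg") == "Current reachable members":
            yield entry["ts"], set(entry.get("addresses", []))


def membership_changes(log_lines):
    """Return (timestamp, member count, joined, left) for each snapshot after the first."""
    changes = []
    previous = None
    for ts, members in reachable_members(log_lines):
        if previous is not None:
            joined = sorted(members - previous)
            left = sorted(previous - members)
            changes.append((ts, len(members), joined, left))
        previous = members
    return changes
```

Passing the three log lines above to `membership_changes` shows the member count going from 16 to 15 to 14 within about six seconds, with 10.32.146.140:7239 and then 10.32.179.59:7239 dropping out. That matches worker pods disappearing from the ring as they are killed, though by itself it does not say why they are killed.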

tihomir · October 27, 2022, 12:26am

Hi, is there anything else in the logs? I don’t think these messages show any particular errors. Is there anything that might indicate the worker service cannot connect to the frontend?
Which server version are you deploying?
