Hi,
Our worker service keeps crashing with OOM errors (seemingly periodically). We have set the requested memory to 100G and didn't see anything wrong in the logs. Could someone give some pointers for debugging this?
temporal-worker-8c984ccf8-74zpn 0/1 CrashLoopBackOff 16 83m
temporal-worker-8c984ccf8-7k2hj 0/1 CrashLoopBackOff 16 86m
temporal-worker-8c984ccf8-8v5cg 0/1 CrashLoopBackOff 16 86m
temporal-worker-8c984ccf8-bxndp 1/1 Running 17 86m
temporal-worker-8c984ccf8-dxhzs 0/1 CrashLoopBackOff 16 86m
temporal-worker-8c984ccf8-mg7j7 0/1 CrashLoopBackOff 16 86m
temporal-worker-8c984ccf8-smq8x 0/1 CrashLoopBackOff 16 83m
temporal-worker-8c984ccf8-ttxd4 0/1 CrashLoopBackOff 16 83m
temporal-worker-8c984ccf8-vp62x 0/1 CrashLoopBackOff 16 83m
temporal-worker-8c984ccf8-wrg2m 0/1 CrashLoopBackOff 16 86m
Ports: 7239/TCP, 9102/TCP
Host Ports: 0/TCP, 0/TCP
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Error
Exit Code: 137
Started: Tue, 25 Oct 2022 13:48:42 +0800
Finished: Tue, 25 Oct 2022 13:52:11 +0800
Ready: False
Restart Count: 16
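For anyone reading the status above: by the usual shell convention, an exit code above 128 means the process was terminated by signal number (code - 128), so 137 corresponds to signal 9 (SIGKILL), which is the signal the kernel OOM killer sends. A minimal sketch of that arithmetic, assuming a POSIX system; `decode_exit_code` is a hypothetical helper name, not part of kubectl or Temporal:

```python
import signal


def decode_exit_code(code: int) -> str:
    """Hypothetical helper: interpret an exit code using the 128 + signal-number convention."""
    if code > 128:
        signum = code - 128
        try:
            # Look up the symbolic name of the signal on this platform.
            name = signal.Signals(signum).name
        except ValueError:
            # Signal number not defined on this platform; fall back to the raw number.
            name = f"signal {signum}"
        return f"killed by {name}"
    return f"exited with status {code}"


print(decode_exit_code(137))
```

This only decodes the number; it does not by itself show who sent the signal.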
The log when it crashes:
{"level":"info","ts":"2022-10-25T05:47:27.556Z","msg":"Current reachable members","service":"worker","component":"service-resolver","service":"worker","addresses":["10.32.179.59:7239","10.32.59.111:7239","10.32.54.30:7239","10.32.40.204:7239","10.32.190.16:7239","10.32.22.220:7239","10.32.144.80:7239","10.32.39.14:7239","10.32.187.129:7239","10.32.41.208:7239","10.32.151.67:7239","10.32.32.86:7239","10.32.48.114:7239","10.32.146.140:7239","10.32.40.41:7239","10.32.3.133:7239"],"logging-call-at":"rpServiceResolver.go:266"}
{"level":"info","ts":"2022-10-25T05:47:32.777Z","msg":"Current reachable members","service":"worker","component":"service-resolver","service":"worker","addresses":["10.32.32.86:7239","10.32.190.16:7239","10.32.151.67:7239","10.32.59.111:7239","10.32.3.133:7239","10.32.54.30:7239","10.32.179.59:7239","10.32.40.41:7239","10.32.39.14:7239","10.32.40.204:7239","10.32.41.208:7239","10.32.48.114:7239","10.32.187.129:7239","10.32.144.80:7239","10.32.22.220:7239"],"logging-call-at":"rpServiceResolver.go:266"}
{"level":"info","ts":"2022-10-25T05:47:33.661Z","msg":"Current reachable members","service":"worker","component":"service-resolver","service":"worker","addresses":["10.32.32.86:7239","10.32.190.16:7239","10.32.151.67:7239","10.32.59.111:7239","10.32.3.133:7239","10.32.54.30:7239","10.32.40.41:7239","10.32.39.14:7239","10.32.40.204:7239","10.32.41.208:7239","10.32.48.114:7239","10.32.187.129:7239","10.32.144.80:7239","10.32.22.220:7239"],"logging-call-at":"rpServiceResolver.go:266"}
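The three entries above differ only in their address lists, which are hard to compare by eye. A rough sketch of how the membership changes between consecutive "Current reachable members" entries could be diffed; `membership_changes` is a hypothetical helper, and the sample lines are trimmed to the `ts`, `msg` and `addresses` fields from the log above (address order is not preserved, which does not matter for a set comparison):

```python
import json


def membership_changes(log_lines):
    """Yield (timestamp, joined, left) for each consecutive pair of membership entries."""
    previous = None
    for line in log_lines:
        entry = json.loads(line)
        if entry.get("msg") != "Current reachable members":
            continue
        current = set(entry["addresses"])
        if previous is not None:
            yield entry["ts"], sorted(current - previous), sorted(previous - current)
        previous = current


# Addresses present in all three log entries above.
common = [
    "10.32.32.86:7239", "10.32.190.16:7239", "10.32.151.67:7239",
    "10.32.59.111:7239", "10.32.3.133:7239", "10.32.54.30:7239",
    "10.32.40.41:7239", "10.32.39.14:7239", "10.32.40.204:7239",
    "10.32.41.208:7239", "10.32.48.114:7239", "10.32.187.129:7239",
    "10.32.144.80:7239", "10.32.22.220:7239",
]


def make_line(ts, addresses):
    # Rebuild a trimmed log line containing only the fields the diff needs.
    return json.dumps({"ts": ts, "msg": "Current reachable members", "addresses": addresses})


sample_lines = [
    make_line("2022-10-25T05:47:27.556Z", common + ["10.32.179.59:7239", "10.32.146.140:7239"]),
    make_line("2022-10-25T05:47:32.777Z", common + ["10.32.179.59:7239"]),
    make_line("2022-10-25T05:47:33.661Z", common),
]

for ts, joined, left in membership_changes(sample_lines):
    print(ts, "joined:", joined, "left:", left)
```

On these three entries the reachable set shrinks from 16 to 15 to 14 members: 10.32.146.140:7239 drops out first, then 10.32.179.59:7239.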