Hi @tihomir , @maxim ,
With temporal in prod , we are facing one issue where kubernetes pods are getting restarted suddenly.
below errors are seen while they restart.
Few points
- We are using spring boot.
- instantiating workflows like this
stub = workflowClient.newUntypedWorkflowStub(workflowName, options);
stub.start(arg);
return (T) stub.getResult(clazz);
- Some of our activities are calling external apis and are slow
- we have timeout of 60 sec , at times activities timeout too
- might be possible ( not sure) that we are runing lesser instances of temporal services like history,matching compared to load
Do you think any of the above can cause the pods to restart?
2022-02-09 09:30:25,529 [Workflow Executor taskQueue="queue_default", namespace="default": 957] ERROR i.t.i.r.ReplayWorkflowTaskHandler - Workflow task failure. startedEventId=3, WorkflowId=WF_ID_93bdf8b3-f43d-4158-9d73-8c85eb742a11_93bdf8b3-f43d-4158-9d73-8c85eb742a11_1644399016547, RunId=add8ee41-16fe-45e2-9072-b2ca1bacf92c. If seen continuously the workflow might be stuck.
io.temporal.internal.replay.InternalWorkflowTaskException: Failure handling event 3 of 'EVENT_TYPE_WORKFLOW_TASK_STARTED' type. IsReplaying=false, PreviousStartedEventId=3, workflowTaskStartedEventId=3, Currently Processing StartedEventId=3
at io.temporal.internal.statemachines.WorkflowStateMachines.handleEvent(WorkflowStateMachines.java:188)
at io.temporal.internal.replay.ReplayWorkflowRunTaskHandler.handleEvent(ReplayWorkflowRunTaskHandler.java:140)
at io.temporal.internal.replay.ReplayWorkflowRunTaskHandler.handleWorkflowTaskImpl(ReplayWorkflowRunTaskHandler.java:180)
at io.temporal.internal.replay.ReplayWorkflowRunTaskHandler.handleWorkflowTask(ReplayWorkflowRunTaskHandler.java:150)
at io.temporal.internal.replay.ReplayWorkflowTaskHandler.handleWorkflowTaskWithEmbeddedQuery(ReplayWorkflowTaskHandler.java:201)
at io.temporal.internal.replay.ReplayWorkflowTaskHandler.handleWorkflowTask(ReplayWorkflowTaskHandler.java:114)
at io.temporal.internal.worker.WorkflowWorker$TaskHandlerImpl.handle(WorkflowWorker.java:319)
at io.temporal.internal.worker.WorkflowWorker$TaskHandlerImpl.handle(WorkflowWorker.java:279)
at io.temporal.internal.worker.PollTaskExecutor.lambda$process$0(PollTaskExecutor.java:73)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.RuntimeException: WorkflowTask: failure executing SCHEDULED->WORKFLOW_TASK_STARTED, transition history is [CREATED->WORKFLOW_TASK_SCHEDULED]
at io.temporal.internal.statemachines.StateMachine.executeTransition(StateMachine.java:140)
at io.temporal.internal.statemachines.StateMachine.handleHistoryEvent(StateMachine.java:91)
at io.temporal.internal.statemachines.EntityStateMachineBase.handleEvent(EntityStateMachineBase.java:63)
at io.temporal.internal.statemachines.WorkflowStateMachines.handleEventImpl(WorkflowStateMachines.java:205)
at io.temporal.internal.statemachines.WorkflowStateMachines.handleEvent(WorkflowStateMachines.java:173)
... 11 common frames omitted
Caused by: io.temporal.internal.sync.PotentialDeadlockException: Potential deadlock detected: workflow thread "workflow-root" didn't yield control for over a second. Other workflow threads:
at java.lang.String.toCharArray(String.java:2899)
at sun.reflect.generics.parser.SignatureParser.parseTypeSig(SignatureParser.java:187)
at sun.reflect.annotation.AnnotationParser.parseSig(AnnotationParser.java:436)
at sun.reflect.annotation.AnnotationParser.parseAnnotation2(AnnotationParser.java:241)
at sun.reflect.annotation.AnnotationParser.parseAnnotations2(AnnotationParser.java:120)
at sun.reflect.annotation.AnnotationParser.parseAnnotations(AnnotationParser.java:72)
at java.lang.reflect.Executable.declaredAnnotations(Executable.java:602)
at java.lang.reflect.Executable.declaredAnnotations(Executable.java:600)
at java.lang.reflect.Executable.getAnnotation(Executable.java:570)
at java.lang.reflect.Method.getAnnotation(Method.java:622)
at io.temporal.common.metadata.POJOActivityMethodMetadata.<init>(POJOActivityMethodMetadata.java:38)
at io.temporal.common.metadata.POJOActivityInterfaceMetadata.getActivityInterfaceMethods(POJOActivityInterfaceMetadata.java:174)
at io.temporal.common.metadata.POJOActivityInterfaceMetadata.<init>(POJOActivityInterfaceMetadata.java:117)
at io.temporal.common.metadata.POJOActivityInterfaceMetadata.newInstance(POJOActivityInterfaceMetadata.java:92)
at io.temporal.internal.sync.ActivityInvocationHandlerBase.init(ActivityInvocationHandlerBase.java:54)
at io.temporal.internal.sync.ActivityInvocationHandler.<init>(ActivityInvocationHandler.java:57)
at io.temporal.internal.sync.ActivityInvocationHandler.newInstance(ActivityInvocationHandler.java:45)
at io.temporal.internal.sync.WorkflowInternal.newActivityStub(WorkflowInternal.java:231)
at io.temporal.workflow.Workflow.newActivityStub(Workflow.java:400)