The 1.6.1 patch fixes a bug where activity heartbeats were not handled correctly.
In your case the issue is related to workflow task handling: the SDK received the workflow task but could not send back its result within the timeout.
It’s very unlikely that the changes you mentioned above have anything to do with what you are seeing.
Are there any errors in the worker log? Do you have access to worker/server metrics?
Can you also post the full stack trace of the first workflow task failure?
Yes, I see a null pointer exception in the workflow, but what I am wondering is why that object is null.
The object I am trying to access is the input that I passed to the workflow (and I can see it in the workflow UI).
ERROR i.t.internal.worker.PollerOptions - uncaught exception
java.lang.RuntimeException: Failure processing workflow task. WorkflowId=jan20212701-wf, RunId=650e7cbe-5cbe-489e-92fa-1ae85fc005c8
at io.temporal.internal.worker.WorkflowWorker$TaskHandlerImpl.wrapFailure(WorkflowWorker.java:342)
at io.temporal.internal.worker.WorkflowWorker$TaskHandlerImpl.wrapFailure(WorkflowWorker.java:280)
at io.temporal.internal.worker.PollTaskExecutor.lambda$process$0(PollTaskExecutor.java:79)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: io.temporal.internal.replay.InternalWorkflowTaskException: Failure handling event 11 of ‘EVENT_TYPE_WORKFLOW_TASK_STARTED’ type. IsReplaying=false, PreviousStartedEventId=11, workflowTaskStartedEventId=11, Currently Processing StartedEventId=11
at io.temporal.internal.statemachines.WorkflowStateMachines.handleEvent(WorkflowStateMachines.java:193)
at io.temporal.internal.replay.ReplayWorkflowRunTaskHandler.handleEvent(ReplayWorkflowRunTaskHandler.java:140)
at io.temporal.internal.replay.ReplayWorkflowRunTaskHandler.handleWorkflowTaskImpl(ReplayWorkflowRunTaskHandler.java:180)
at io.temporal.internal.replay.ReplayWorkflowRunTaskHandler.handleWorkflowTask(ReplayWorkflowRunTaskHandler.java:150)
at io.temporal.internal.replay.ReplayWorkflowTaskHandler.handleWorkflowTaskWithEmbeddedQuery(ReplayWorkflowTaskHandler.java:201)
at io.temporal.internal.replay.ReplayWorkflowTaskHandler.handleWorkflowTask(ReplayWorkflowTaskHandler.java:114)
at io.temporal.internal.worker.WorkflowWorker$TaskHandlerImpl.handle(WorkflowWorker.java:314)
at io.temporal.internal.worker.WorkflowWorker$TaskHandlerImpl.handle(WorkflowWorker.java:280)
at io.temporal.internal.worker.PollTaskExecutor.lambda$process$0(PollTaskExecutor.java:73)
… 3 common frames omitted
Caused by: java.lang.RuntimeException: WorkflowTask: failure executing SCHEDULED->WORKFLOW_TASK_STARTED, transition history is [CREATED->WORKFLOW_TASK_SCHEDULED]
at io.temporal.internal.statemachines.StateMachine.executeTransition(StateMachine.java:140)
at io.temporal.internal.statemachines.StateMachine.handleHistoryEvent(StateMachine.java:91)
at io.temporal.internal.statemachines.EntityStateMachineBase.handleEvent(EntityStateMachineBase.java:63)
at io.temporal.internal.statemachines.WorkflowStateMachines.handleEventImpl(WorkflowStateMachines.java:210)
at io.temporal.internal.statemachines.WorkflowStateMachines.handleEvent(WorkflowStateMachines.java:178)
… 11 common frames omitted
Caused by: java.lang.NullPointerException: null
at MyWorkflowImpl.mySignalMethod(MyWorkflowImpl.java:437)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at io.temporal.internal.sync.WorkflowInternal.lambda$registerListener$155fbe99$1(WorkflowInternal.java:162)
at io.temporal.internal.sync.SignalDispatcher.handleInterceptedSignal(SignalDispatcher.java:100)
at io.temporal.internal.sync.SyncWorkflowContext.handleInterceptedSignal(SyncWorkflowContext.java:168)
at io.temporal.internal.sync.POJOWorkflowImplementationFactory$POJOWorkflowImplementation$RootWorkflowInboundCallsInterceptor.handleSignal(POJOWorkflowImplementationFactory.java:374)
at io.temporal.internal.sync.SignalDispatcher.handleSignal(SignalDispatcher.java:124)
at io.temporal.internal.sync.SyncWorkflowContext.handleSignal(SyncWorkflowContext.java:172)
at io.temporal.internal.sync.WorkflowExecuteRunnable.handleSignal(WorkflowExecuteRunnable.java:72)
at io.temporal.internal.sync.SyncWorkflow.lambda$handleSignal$2(SyncWorkflow.java:136)
at io.temporal.internal.sync.CancellationScopeImpl.run(CancellationScopeImpl.java:101)
at io.temporal.internal.sync.WorkflowThreadImpl$RunnableWrapper.run(WorkflowThreadImpl.java:107)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
… 3 common frames omitted
Is it possible that you use SignalWithStart to start the workflow?
If so, the signal is applied before the workflow run method is called. (Applying signals before running workflow tasks is how we guarantee that signals are not lost.)
Whatever is causing it, it looks like you are getting a signal at the same time the workflow starts. Can you rewrite your signal handling method so that it does not require the input to be set?
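To illustrate the suggestion above, here is a minimal sketch of a signal handler that does not depend on the workflow input. It is plain Java with no Temporal dependency, so the class name `BufferedSignalWorkflow`, the `run` method, and the `String` payloads are all hypothetical stand-ins. In a real workflow, `run` would be the `@WorkflowMethod` and `mySignalMethod` the `@SignalMethod`. The handler only buffers the signal, and the buffered signals are processed once the input is available.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Hypothetical stand-in for the workflow implementation from the stack trace.
// Temporal annotations are omitted so the sketch compiles without the SDK.
class BufferedSignalWorkflow {
    // Assigned only when run(...) starts; null while an early signal is handled.
    private String input;
    private final Queue<String> pendingSignals = new ArrayDeque<>();
    private final List<String> processed = new ArrayList<>();

    // Signal handler: never dereferences `input`, it only records the payload.
    void mySignalMethod(String payload) {
        pendingSignals.add(payload);
    }

    // Workflow entry point: stores the input, then drains buffered signals.
    void run(String workflowInput) {
        this.input = workflowInput;
        drainSignals();
    }

    // Processes every buffered signal now that `input` is known to be set.
    private void drainSignals() {
        String payload;
        while ((payload = pendingSignals.poll()) != null) {
            processed.add(input + ":" + payload);
        }
    }

    List<String> processed() {
        return processed;
    }
}
```

With this shape, a signal delivered before the run method (as with SignalWithStart) is simply queued instead of hitting a null field. In an actual workflow the run method would typically keep waiting for further signals rather than draining the queue once.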
When a workflow constantly throws a NullPointerException, its execution gets blocked, so it cannot process queries or signals in this state. Fix the bug and it will work as expected.
But what I notice is that other workflows are also affected, and I am not able to query them either. Also, any idea why the workflow task handler heartbeat stops?