PotentialDeadlockException, when does this happen?

Hi,

While experimenting. I get this error

io.temporal.internal.replay.InternalWorkflowTaskException: Failure handling event 3 of 'EVENT_TYPE_WORKFLOW_TASK_STARTED' type. IsReplaying=false, PreviousStartedEventId=3, workflowTaskStartedEventId=3, Currently Processing StartedEventId=3 
io.temporal.internal.statemachines.WorkflowStateMachines.handleEvent(WorkflowStateMachines.java:193)
io.temporal.internal.replay.ReplayWorkflowRunTaskHandler.handleEvent(ReplayWorkflowRunTaskHandler.java:140)
io.temporal.internal.replay.ReplayWorkflowRunTaskHandler.handleWorkflowTaskImpl(ReplayWorkflowRunTaskHandler.java:180)
io.temporal.internal.replay.ReplayWorkflowRunTaskHandler.handleWorkflowTask(ReplayWorkflowRunTaskHandler.java:150)
io.temporal.internal.replay.ReplayWorkflowTaskHandler.handleWorkflowTaskWithEmbeddedQuery(ReplayWorkflowTaskHandler.java:201)
io.temporal.internal.replay.ReplayWorkflowTaskHandler.handleWorkflowTask(ReplayWorkflowTaskHandler.java:114)
io.temporal.internal.worker.WorkflowWorker$TaskHandlerImpl.handle(WorkflowWorker.java:319)
io.temporal.internal.worker.WorkflowWorker$TaskHandlerImpl.handle(WorkflowWorker.java:279)
io.temporal.internal.worker.PollTaskExecutor.lambda$process$0(PollTaskExecutor.java:73)
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
java.base/java.lang.Thread.run(Thread.java:834)

Caused By: java.lang.RuntimeException: WorkflowTask: failure executing SCHEDULED->WORKFLOW_TASK_STARTED, transition history is [CREATED->WORKFLOW_TASK_SCHEDULED] 
io.temporal.internal.statemachines.StateMachine.executeTransition(StateMachine.java:140)
io.temporal.internal.statemachines.StateMachine.handleHistoryEvent(StateMachine.java:91)
io.temporal.internal.statemachines.EntityStateMachineBase.handleEvent(EntityStateMachineBase.java:63)
io.temporal.internal.statemachines.WorkflowStateMachines.handleEventImpl(WorkflowStateMachines.java:210)
io.temporal.internal.statemachines.WorkflowStateMachines.handleEvent(WorkflowStateMachines.java:178)
io.temporal.internal.replay.ReplayWorkflowRunTaskHandler.handleEvent(ReplayWorkflowRunTaskHandler.java:140)
io.temporal.internal.replay.ReplayWorkflowRunTaskHandler.handleWorkflowTaskImpl(ReplayWorkflowRunTaskHandler.java:180)
io.temporal.internal.replay.ReplayWorkflowRunTaskHandler.handleWorkflowTask(ReplayWorkflowRunTaskHandler.java:150)
io.temporal.internal.replay.ReplayWorkflowTaskHandler.handleWorkflowTaskWithEmbeddedQuery(ReplayWorkflowTaskHandler.java:201)
io.temporal.internal.replay.ReplayWorkflowTaskHandler.handleWorkflowTask(ReplayWorkflowTaskHandler.java:114)
io.temporal.internal.worker.WorkflowWorker$TaskHandlerImpl.handle(WorkflowWorker.java:319)
io.temporal.internal.worker.WorkflowWorker$TaskHandlerImpl.handle(WorkflowWorker.java:279)
io.temporal.internal.worker.PollTaskExecutor.lambda$process$0(PollTaskExecutor.java:73)
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
java.base/java.lang.Thread.run(Thread.java:834)

Caused By: io.temporal.internal.sync.PotentialDeadlockException: Potential deadlock detected: workflow thread "workflow-method" didn't yield control for over a second. Other workflow 

The flow goes so

Kafkaevent stream → calls a funcion → calls a workflow → calls an activity

I am a bit lost on debugging this. any help

Are you getting this error during debugging?

For debugging please set the TEMPORAL_DEBUG env variable to true – https://docs.temporal.io/docs/java/testing-and-debugging

If not, could you share your workflow code (or parts of it?). Could be that you are not using the dedicated workflow sleep methods (Workflow.sleep) or have some other type of non-deterministic problems.

Hope this helps.

thank you. that was it. I just had to set TEMPORAL_DEBUG :slight_smile:

Make sure to set it only when you plan to put breakpoints into the workflow code. Any other use might obscure real serious problems with the workflow code.

1 Like

I am running 100k workflows on temporal, in workflow i have multiple activities but next activity depends on previous activity result , some of activity have external api calls, workflow.sleep,

getting issue potential deadlock [TMPRL1101] Potential deadlock detected: workflow goroutine “2” didn’t yield for over a second StackTrace coroutine 2 [running]:

This can happen if the worker’s process is CPU starved/throttled.

I have 8GB RAM AND 250 harddisk mac laptop, running 10k workflows, its causing worker’s process is CPU starved/throttled.? what should worker config for runnning 100k workflows

Are you using a single binary version of the service? It is not intended for any performance tests.