Switching to Queries Tab in UI breaks workflow execution

Self-hosted Temporal Cluster 1.21.2, Java SDK 1.19.1

I have the following workflow definition: Problematic Workflow · GitHub

The workflow runs as expected until I open the Queries tab in the UI. Then all kinds of weird stuff begins. First, the Queries tab shows no query result, but no error either.
When I return to the History tab, I see a “Workflow Task Failed” event, and the workflow execution is stuck after this event.
Event Details:

Spoiler
{
  "message": "Failure handling event 10 of type 'EVENT_TYPE_WORKFLOW_TASK_COMPLETED' during replay. {WorkflowTaskStartedEventId=99, CurrentStartedEventId=9}",
  "source": "JavaSDK",
  "stackTrace": "io.temporal.internal.statemachines.WorkflowStateMachines.createEventProcessingException(WorkflowStateMachines.java:257)\nio.temporal.internal.statemachines.WorkflowStateMachines.handleEventsBatch(WorkflowStateMachines.java:236)\nio.temporal.internal.statemachines.WorkflowStateMachines.handleEvent(WorkflowStateMachines.java:208)\nio.temporal.internal.replay.ReplayWorkflowRunTaskHandler.applyServerHistory(ReplayWorkflowRunTaskHandler.java:224)\nio.temporal.internal.replay.ReplayWorkflowRunTaskHandler.handleWorkflowTaskImpl(ReplayWorkflowRunTaskHandler.java:208)\nio.temporal.internal.replay.ReplayWorkflowRunTaskHandler.handleWorkflowTask(ReplayWorkflowRunTaskHandler.java:156)\nio.temporal.internal.replay.ReplayWorkflowTaskHandler.handleWorkflowTaskWithQuery(ReplayWorkflowTaskHandler.java:131)\nio.temporal.internal.replay.ReplayWorkflowTaskHandler.handleWorkflowTask(ReplayWorkflowTaskHandler.java:96)\nio.temporal.internal.worker.WorkflowWorker$TaskHandlerImpl.handleTask(WorkflowWorker.java:407)\nio.temporal.internal.worker.WorkflowWorker$TaskHandlerImpl.handle(WorkflowWorker.java:317)\nio.temporal.internal.worker.WorkflowWorker$TaskHandlerImpl.handle(WorkflowWorker.java:259)\nio.temporal.internal.worker.PollTaskExecutor.lambda$process$0(PollTaskExecutor.java:105)\njava.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)\njava.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)\njava.base/java.lang.Thread.run(Thread.java:833)\n",
  "encodedAttributes": null,
  "cause": {
    "message": "WorkflowTask: failure executing STARTED->WORKFLOW_TASK_COMPLETED, transition history is [CREATED->WORKFLOW_TASK_SCHEDULED, SCHEDULED->WORKFLOW_TASK_STARTED]",
    "source": "JavaSDK",
    "stackTrace": "io.temporal.internal.statemachines.StateMachine.executeTransition(StateMachine.java:152)\nio.temporal.internal.statemachines.StateMachine.handleHistoryEvent(StateMachine.java:102)\nio.temporal.internal.statemachines.EntityStateMachineBase.handleEvent(EntityStateMachineBase.java:68)\nio.temporal.internal.statemachines.WorkflowStateMachines.handleSingleEvent(WorkflowStateMachines.java:277)\nio.temporal.internal.statemachines.WorkflowStateMachines.handleEventsBatch(WorkflowStateMachines.java:234)\nio.temporal.internal.statemachines.WorkflowStateMachines.handleEvent(WorkflowStateMachines.java:208)\nio.temporal.internal.replay.ReplayWorkflowRunTaskHandler.applyServerHistory(ReplayWorkflowRunTaskHandler.java:224)\nio.temporal.internal.replay.ReplayWorkflowRunTaskHandler.handleWorkflowTaskImpl(ReplayWorkflowRunTaskHandler.java:208)\nio.temporal.internal.replay.ReplayWorkflowRunTaskHandler.handleWorkflowTask(ReplayWorkflowRunTaskHandler.java:156)\nio.temporal.internal.replay.ReplayWorkflowTaskHandler.handleWorkflowTaskWithQuery(ReplayWorkflowTaskHandler.java:131)\nio.temporal.internal.replay.ReplayWorkflowTaskHandler.handleWorkflowTask(ReplayWorkflowTaskHandler.java:96)\nio.temporal.internal.worker.WorkflowWorker$TaskHandlerImpl.handleTask(WorkflowWorker.java:407)\nio.temporal.internal.worker.WorkflowWorker$TaskHandlerImpl.handle(WorkflowWorker.java:317)\nio.temporal.internal.worker.WorkflowWorker$TaskHandlerImpl.handle(WorkflowWorker.java:259)\nio.temporal.internal.worker.PollTaskExecutor.lambda$process$0(PollTaskExecutor.java:105)\njava.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)\njava.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)\njava.base/java.lang.Thread.run(Thread.java:833)\n",
    "encodedAttributes": null,
    "cause": {
      "message": "Operation allowed only while eventLoop is running",
      "source": "JavaSDK",
      "stackTrace": "io.temporal.internal.statemachines.WorkflowStateMachines.checkEventLoopExecuting(WorkflowStateMachines.java:1052)\nio.temporal.internal.statemachines.WorkflowStateMachines.randomUUID(WorkflowStateMachines.java:708)\nio.temporal.internal.replay.ReplayWorkflowContextImpl.scheduleActivityTask(ReplayWorkflowContextImpl.java:193)\nio.temporal.internal.sync.SyncWorkflowContext.executeActivityOnce(SyncWorkflowContext.java:294)\nio.temporal.internal.sync.SyncWorkflowContext.executeActivity(SyncWorkflowContext.java:268)\nio.temporal.internal.sync.ActivityStubImpl.executeAsync(ActivityStubImpl.java:50)\nio.temporal.internal.sync.ActivityStubBase.execute(ActivityStubBase.java:39)\nio.temporal.internal.sync.ActivityInvocationHandler.lambda$getActivityFunc$0(ActivityInvocationHandler.java:78)\nio.temporal.internal.sync.ActivityInvocationHandlerBase.invoke(ActivityInvocationHandlerBase.java:60)\njdk.proxy2/jdk.proxy2.$Proxy228.reconcileMetadata(Unknown Source)\ncom.detmir.marketplace.media.service.temporal.workflows.ReconcileFilesMetadataWorkflow$Impl.reconcileFilesMetadata(ReconcileFilesMetadataWorkflow.java:75)\njava.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)\njava.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)\njava.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)\njava.base/java.lang.reflect.Method.invoke(Method.java:568)\nio.temporal.internal.sync.POJOWorkflowImplementationFactory$POJOWorkflowImplementation$RootWorkflowInboundCallsInterceptor.execute(POJOWorkflowImplementationFactory.java:341)\nio.temporal.internal.sync.POJOWorkflowImplementationFactory$POJOWorkflowImplementation.execute(POJOWorkflowImplementationFactory.java:316)\n",
      "encodedAttributes": null,
      "cause": null,
      "applicationFailureInfo": {
        "type": "java.lang.IllegalStateException",
        "nonRetryable": false,
        "details": null
      }
    },
    "applicationFailureInfo": {
      "type": "java.lang.RuntimeException",
      "nonRetryable": false,
      "details": null
    }
  },
  "applicationFailureInfo": {
    "type": "io.temporal.internal.statemachines.InternalWorkflowTaskException",
    "nonRetryable": false,
    "details": null
  }
}

After terminating the workflow, I try to start a new execution, which fails with a “WorkflowTaskFailed” event after executing GetFilesWithContent.

Spoiler
{
  "message": "Failure handling event 9 of type 'EVENT_TYPE_WORKFLOW_TASK_STARTED' during execution. {WorkflowTaskStartedEventId=9, CurrentStartedEventId=9}",
  "source": "JavaSDK",
  "stackTrace": "io.temporal.internal.statemachines.WorkflowStateMachines.createEventProcessingException(WorkflowStateMachines.java:257)\nio.temporal.internal.statemachines.WorkflowStateMachines.handleEventsBatch(WorkflowStateMachines.java:236)\nio.temporal.internal.statemachines.WorkflowStateMachines.handleEvent(WorkflowStateMachines.java:208)\nio.temporal.internal.replay.ReplayWorkflowRunTaskHandler.applyServerHistory(ReplayWorkflowRunTaskHandler.java:224)\nio.temporal.internal.replay.ReplayWorkflowRunTaskHandler.handleWorkflowTaskImpl(ReplayWorkflowRunTaskHandler.java:208)\nio.temporal.internal.replay.ReplayWorkflowRunTaskHandler.handleWorkflowTask(ReplayWorkflowRunTaskHandler.java:156)\nio.temporal.internal.replay.ReplayWorkflowTaskHandler.handleWorkflowTaskWithQuery(ReplayWorkflowTaskHandler.java:131)\nio.temporal.internal.replay.ReplayWorkflowTaskHandler.handleWorkflowTask(ReplayWorkflowTaskHandler.java:96)\nio.temporal.internal.worker.WorkflowWorker$TaskHandlerImpl.handleTask(WorkflowWorker.java:407)\nio.temporal.internal.worker.WorkflowWorker$TaskHandlerImpl.handle(WorkflowWorker.java:317)\nio.temporal.internal.worker.WorkflowWorker$TaskHandlerImpl.handle(WorkflowWorker.java:259)\nio.temporal.internal.worker.PollTaskExecutor.lambda$process$0(PollTaskExecutor.java:105)\njava.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)\njava.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)\njava.base/java.lang.Thread.run(Thread.java:833)\n",
  "encodedAttributes": null,
  "cause": {
    "message": "WorkflowTask: failure executing SCHEDULED->WORKFLOW_TASK_STARTED, transition history is [CREATED->WORKFLOW_TASK_SCHEDULED]",
    "source": "JavaSDK",
    "stackTrace": "io.temporal.internal.statemachines.StateMachine.executeTransition(StateMachine.java:152)\nio.temporal.internal.statemachines.StateMachine.handleHistoryEvent(StateMachine.java:102)\nio.temporal.internal.statemachines.EntityStateMachineBase.handleEvent(EntityStateMachineBase.java:68)\nio.temporal.internal.statemachines.WorkflowStateMachines.handleSingleEvent(WorkflowStateMachines.java:277)\nio.temporal.internal.statemachines.WorkflowStateMachines.handleEventsBatch(WorkflowStateMachines.java:234)\nio.temporal.internal.statemachines.WorkflowStateMachines.handleEvent(WorkflowStateMachines.java:208)\nio.temporal.internal.replay.ReplayWorkflowRunTaskHandler.applyServerHistory(ReplayWorkflowRunTaskHandler.java:224)\nio.temporal.internal.replay.ReplayWorkflowRunTaskHandler.handleWorkflowTaskImpl(ReplayWorkflowRunTaskHandler.java:208)\nio.temporal.internal.replay.ReplayWorkflowRunTaskHandler.handleWorkflowTask(ReplayWorkflowRunTaskHandler.java:156)\nio.temporal.internal.replay.ReplayWorkflowTaskHandler.handleWorkflowTaskWithQuery(ReplayWorkflowTaskHandler.java:131)\nio.temporal.internal.replay.ReplayWorkflowTaskHandler.handleWorkflowTask(ReplayWorkflowTaskHandler.java:96)\nio.temporal.internal.worker.WorkflowWorker$TaskHandlerImpl.handleTask(WorkflowWorker.java:407)\nio.temporal.internal.worker.WorkflowWorker$TaskHandlerImpl.handle(WorkflowWorker.java:317)\nio.temporal.internal.worker.WorkflowWorker$TaskHandlerImpl.handle(WorkflowWorker.java:259)\nio.temporal.internal.worker.PollTaskExecutor.lambda$process$0(PollTaskExecutor.java:105)\njava.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)\njava.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)\njava.base/java.lang.Thread.run(Thread.java:833)\n",
    "encodedAttributes": null,
    "cause": {
      "message": "Operation allowed only while eventLoop is running",
      "source": "JavaSDK",
      "stackTrace": "io.temporal.internal.statemachines.WorkflowStateMachines.checkEventLoopExecuting(WorkflowStateMachines.java:1052)\nio.temporal.internal.statemachines.WorkflowStateMachines.randomUUID(WorkflowStateMachines.java:708)\nio.temporal.internal.replay.ReplayWorkflowContextImpl.scheduleActivityTask(ReplayWorkflowContextImpl.java:193)\nio.temporal.internal.sync.SyncWorkflowContext.executeActivityOnce(SyncWorkflowContext.java:294)\nio.temporal.internal.sync.SyncWorkflowContext.executeActivity(SyncWorkflowContext.java:268)\nio.temporal.internal.sync.ActivityStubImpl.executeAsync(ActivityStubImpl.java:50)\nio.temporal.internal.sync.ActivityStubBase.execute(ActivityStubBase.java:39)\nio.temporal.internal.sync.ActivityInvocationHandler.lambda$getActivityFunc$0(ActivityInvocationHandler.java:78)\nio.temporal.internal.sync.ActivityInvocationHandlerBase.invoke(ActivityInvocationHandlerBase.java:60)\njdk.proxy2/jdk.proxy2.$Proxy228.reconcileMetadata(Unknown Source)\ncom.detmir.marketplace.media.service.temporal.workflows.ReconcileFilesMetadataWorkflow$Impl.reconcileFilesMetadata(ReconcileFilesMetadataWorkflow.java:75)\njava.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)\njava.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)\njava.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)\njava.base/java.lang.reflect.Method.invoke(Method.java:568)\nio.temporal.internal.sync.POJOWorkflowImplementationFactory$POJOWorkflowImplementation$RootWorkflowInboundCallsInterceptor.execute(POJOWorkflowImplementationFactory.java:341)\nio.temporal.internal.sync.POJOWorkflowImplementationFactory$POJOWorkflowImplementation.execute(POJOWorkflowImplementationFactory.java:316)\n",
      "encodedAttributes": null,
      "cause": null,
      "applicationFailureInfo": {
        "type": "java.lang.IllegalStateException",
        "nonRetryable": false,
        "details": null
      }
    },
    "applicationFailureInfo": {
      "type": "java.lang.RuntimeException",
      "nonRetryable": false,
      "details": null
    }
  },
  "applicationFailureInfo": {
    "type": "io.temporal.internal.statemachines.InternalWorkflowTaskException",
    "nonRetryable": false,
    "details": null
  }
}

I can run a new workflow execution without errors only after restarting the worker process.

Hi @Andrei-Moiseev

Operation allowed only while eventLoop is running

I have seen this message when (local) activity stubs are reused across workflow executions. Is this your case?

Antonio

I don’t think that’s my case; all activity stubs are created inside the workflow.

Can you share a piece of code we can compile to reproduce the issue?

What are these methods doing: withTotalFiles, withProcessedFiles, and withFixedFiles?

        public Stat setTotalFiles(long count) {
            return withTotalFiles(count);
        }

        public Stat update(long processedCount, long fixedCount) {
            return withProcessedFiles(processedFiles + processedCount)
                .withFixedFiles(fixedFiles + fixedCount);
        }

Antonio

That’s the case, thank you!

The problem is the static modifier here:

private static final ReconcileFilesMetadataActivity reconcileMetadata = Workflow.newActivityStub(
            ReconcileFilesMetadataActivity.class,
            ActivityOptions.newBuilder()
                .setStartToCloseTimeout(Duration.ofMinutes(10))
                .setHeartbeatTimeout(Duration.ofSeconds(30))
                .validateAndBuildWithDefaults());
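
For anyone hitting the same error, here is a rough toy model of why a static stub misbehaves. It assumes, as the stack trace suggests, that a stub stays tied to the workflow execution that was current when the stub was created. ExecutionContext, ActivityStub and StaticStubDemo are hypothetical stand-ins written for this sketch, not Temporal SDK classes, and the real SDK internals are more involved.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: hypothetical stand-ins, not Temporal SDK classes.
final class ExecutionContext {
    private boolean eventLoopRunning;

    void setEventLoopRunning(boolean running) {
        eventLoopRunning = running;
    }

    // Scheduling is only legal while this execution's event loop is active.
    void schedule(String activityName) {
        if (!eventLoopRunning) {
            throw new IllegalStateException("Operation allowed only while eventLoop is running");
        }
    }
}

// A stub remembers the execution context it was created in.
final class ActivityStub {
    private final ExecutionContext owner;

    ActivityStub(ExecutionContext owner) {
        this.owner = owner;
    }

    void call(String activityName) {
        owner.schedule(activityName);
    }
}

final class StaticStubDemo {
    // Mimics a static field: one stub shared by every execution in the process.
    private static ActivityStub sharedStub;

    static String runExecution(boolean useSharedStub) {
        ExecutionContext ctx = new ExecutionContext();
        ctx.setEventLoopRunning(true);
        try {
            ActivityStub stub;
            if (useSharedStub) {
                if (sharedStub == null) {
                    sharedStub = new ActivityStub(ctx); // bound to the first execution
                }
                stub = sharedStub;
            } else {
                stub = new ActivityStub(ctx); // bound to this execution
            }
            stub.call("reconcileMetadata");
            return "ok";
        } catch (IllegalStateException e) {
            return e.getMessage();
        } finally {
            ctx.setEventLoopRunning(false); // this execution's event loop is done
        }
    }

    // Runs two executions back to back, starting from a clean shared stub.
    static List<String> runTwice(boolean useSharedStub) {
        sharedStub = null;
        List<String> results = new ArrayList<>();
        results.add(runExecution(useSharedStub));
        results.add(runExecution(useSharedStub));
        return results;
    }

    public static void main(String[] args) {
        System.out.println(runTwice(true));
        System.out.println(runTwice(false));
    }
}
```

In this model the second execution that goes through the shared stub fails with the same message as in the event details, while executions that create their own stub succeed. In the real workflow the equivalent change would be to drop static from the reconcileMetadata field so that each workflow instance creates its own stub.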

Some kind of validation would be useful. For now, I wrote the following ArchUnit check to prevent such errors:

@ArchTest
public static final ArchRule noStaticMembersInTemporalWorkflow = members()
    .that().areDeclaredInClassesThat()
        .implement(annotatedWith(WorkflowInterface.class))
    .should()
        .notHaveModifier(STATIC)
    .allowEmptyShould(true);