How to identify what was failing when "Failure processing workflow task" happens

Gordon_Sun · March 25, 2021, 4:32am

We see error log like this

java.lang.RuntimeException: Failure processing workflow task. WorkflowId=3f22bd16-1e38-3134-9381-93bb15ccc4ee, RunId=fadced22-7be5-4b59-b8c1-c7badb761032
	at io.temporal.internal.worker.WorkflowWorker$TaskHandlerImpl.wrapFailure(WorkflowWorker.java:342)
	at io.temporal.internal.worker.WorkflowWorker$TaskHandlerImpl.wrapFailure(WorkflowWorker.java:280)
	at io.temporal.internal.worker.PollTaskExecutor.lambda$process$0(PollTaskExecutor.java:79)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630)
	at java.base/java.lang.Thread.run(Thread.java:832)
Caused by: io.grpc.StatusRuntimeException: INVALID_ARGUMENT: UnhandledCommand
	at io.grpc.stub.ClientCalls.toStatusRuntimeException(ClientCalls.java:262)
	at io.grpc.stub.ClientCalls.getUnchecked(ClientCalls.java:243)
	at io.grpc.stub.ClientCalls.blockingUnaryCall(ClientCalls.java:156)
	at io.temporal.api.workflowservice.v1.WorkflowServiceGrpc$WorkflowServiceBlockingStub.respondWorkflowTaskCompleted(WorkflowServiceGrpc.java:2667)
	at io.temporal.internal.worker.WorkflowWorker$TaskHandlerImpl.lambda$sendReply$0(WorkflowWorker.java:376)
	at io.temporal.internal.common.GrpcRetryer.lambda$retry$0(GrpcRetryer.java:109)
	at io.temporal.internal.common.GrpcRetryer.retryWithResult(GrpcRetryer.java:127)
	at io.temporal.internal.common.GrpcRetryer.retry(GrpcRetryer.java:106)
	at io.temporal.internal.worker.WorkflowWorker$TaskHandlerImpl.sendReply(WorkflowWorker.java:369)
	at io.temporal.internal.worker.WorkflowWorker$TaskHandlerImpl.handle(WorkflowWorker.java:318)
	at io.temporal.internal.worker.WorkflowWorker$TaskHandlerImpl.handle(WorkflowWorker.java:280)
	at io.temporal.internal.worker.PollTaskExecutor.lambda$process$0(PollTaskExecutor.java:73)
	... 3 common frames omitted

We found the workflow and it seems to have recovered. (Side question: is it possible to make these not ERROR log? Our system pages when see ERROR log)

This happens frequently. We feel there might be some bug. How do we figure out what was the failed task trying to do? How do we map it to the line of code?

Looking at it from the dashboard all we can do is guess that the workflow task was going to close the workflow but a signal came in before it was closed. But we are really not sure.

Thank you very much!

Gordon_Sun · March 25, 2021, 4:35am

Our @workflowmethod and @signalmethod

  @Override
  public void start(String orderId) {
    this.orderId = orderId;

    // TODO: Change from config in code to read from DB
    Map<String, Object> config = Config.getConfig();
    Duration maxDuration = (Duration) config.get("newStatusMaxDuration");
    LOGGER.info(
        String.format(
            "Workflow started for order %s to check if it passed new status (new, onHold and toBeVaidated) in time. maxDuration: %s",
            orderId, maxDuration.toString().substring(2).toLowerCase()));

    Workflow.await(maxDuration, () -> this.passed);
    if (this.passed) {
      LOGGER.info(
          String.format(
              "Order %s passed new status (new, onHold and toBeVaidated) before fulfilledMaxDuration: %s",
              orderId, maxDuration.toString()));
      return;
    }
    String message =
        String.format(
            "Order %s did not passed new status (new, onHold and toBeVaidated) after fulfilledMaxDuration: %s",
            orderId, maxDuration.toString().substring(2).toLowerCase());
    LOGGER.error(message);
    //    act.error(NewStatusWorkflowImpl.class.getName(), message);
  }

  @Override
  public void updateStatus(String status) {
    LOGGER.info(
        String.format(
            "Workflow for order %s received signal to update status to: %s", this.orderId, status));
    if (!NEW_ORDER_STATUS.contains(status)) {
      this.passed = true;
    }
  }

maxim · March 25, 2021, 5:12pm

This is not a bug. This error is used to ensure that signals are not lost due to race condition with workflow completion command. Would you file an issue to change the log level?

This topic has some info on the details.

Gordon_Sun · March 26, 2021, 12:59am

Thank you @maxim
I will file an issue on Github
I still have the question of looking at the Workflow task failure from the UI or the log, how do I know what was the task? Which line of code was it trying to execute?

Gordon_Sun · March 26, 2021, 1:06am

Gordon_Sun · March 29, 2021, 7:26pm

@maxim Any thoughts on the general question that I asked? Looking at the UI/log for a workflow task failure, how to identify the line of code the task was trying to run?

maxim · March 29, 2021, 7:42pm

The unhandled command is not associated with any particular line in the workflow as it is caused by workflow completion. So talking about a particular line that caused workflow completion doesn’t make sense.

Gordon_Sun · March 29, 2021, 8:37pm

You are right. What I asked does not make sense.
Will every ‘unhandled command’ means the task was trying to complete the workflow? If that is the case, would it be better to change the ‘unhandled command’ to ‘complete workflow failed’?
If there are other cases for ‘unhandled command’, what could they be?

maxim · March 29, 2021, 8:42pm

Naming is hard :). At this point we are not going to make any backward incompatible changes to APIs if possible.

If there are other cases for ‘unhandled command’, what could they be?

Any event that was added to the history while the workflow task that decided to close workflow was runnig.

Topic		Replies	Views
Failed workflow task due to unhandled command Community Support java-sdk	9	2796	November 3, 2021
Caused by: io.temporal.internal.statemachines.InternalWorkflowTaskException: Failure handling event 6 of type 'EVENT_TYPE_WORKFLOW_TASK_STARTED' during execution Community Support	1	29	August 18, 2025
WorkflowTaskFailed: Failure handling event 5 of 'EVENT TYPE ACTIVITY TASK SCHEDULED' Community Support	3	2702	March 16, 2021
Problems during workflow execution Community Support java-sdk , general-impl	3	269	April 30, 2024
WorkflowTaskFailed doesn't update (remains frozen wrt failure reason) making it a bit hard to debug Community Support	3	86	August 12, 2024

How to identify what was failing when "Failure processing workflow task" happens

Related topics