java.lang.RuntimeException: Failure processing workflow task. WorkflowId=3f22bd16-1e38-3134-9381-93bb15ccc4ee, RunId=fadced22-7be5-4b59-b8c1-c7badb761032
at io.temporal.internal.worker.WorkflowWorker$TaskHandlerImpl.wrapFailure(WorkflowWorker.java:342)
at io.temporal.internal.worker.WorkflowWorker$TaskHandlerImpl.wrapFailure(WorkflowWorker.java:280)
at io.temporal.internal.worker.PollTaskExecutor.lambda$process$0(PollTaskExecutor.java:79)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630)
at java.base/java.lang.Thread.run(Thread.java:832)
Caused by: io.grpc.StatusRuntimeException: INVALID_ARGUMENT: UnhandledCommand
at io.grpc.stub.ClientCalls.toStatusRuntimeException(ClientCalls.java:262)
at io.grpc.stub.ClientCalls.getUnchecked(ClientCalls.java:243)
at io.grpc.stub.ClientCalls.blockingUnaryCall(ClientCalls.java:156)
at io.temporal.api.workflowservice.v1.WorkflowServiceGrpc$WorkflowServiceBlockingStub.respondWorkflowTaskCompleted(WorkflowServiceGrpc.java:2667)
at io.temporal.internal.worker.WorkflowWorker$TaskHandlerImpl.lambda$sendReply$0(WorkflowWorker.java:376)
at io.temporal.internal.common.GrpcRetryer.lambda$retry$0(GrpcRetryer.java:109)
at io.temporal.internal.common.GrpcRetryer.retryWithResult(GrpcRetryer.java:127)
at io.temporal.internal.common.GrpcRetryer.retry(GrpcRetryer.java:106)
at io.temporal.internal.worker.WorkflowWorker$TaskHandlerImpl.sendReply(WorkflowWorker.java:369)
at io.temporal.internal.worker.WorkflowWorker$TaskHandlerImpl.handle(WorkflowWorker.java:318)
at io.temporal.internal.worker.WorkflowWorker$TaskHandlerImpl.handle(WorkflowWorker.java:280)
at io.temporal.internal.worker.PollTaskExecutor.lambda$process$0(PollTaskExecutor.java:73)
... 3 common frames omitted
We found the workflow and it seems to have recovered. (Side question: is it possible to make these not ERROR log? Our system pages when see ERROR log)
This happens frequently. We feel there might be some bug. How do we figure out what was the failed task trying to do? How do we map it to the line of code?
Looking at it from the dashboard all we can do is guess that the workflow task was going to close the workflow but a signal came in before it was closed. But we are really not sure.
@Override
public void start(String orderId) {
this.orderId = orderId;
// TODO: Change from config in code to read from DB
Map<String, Object> config = Config.getConfig();
Duration maxDuration = (Duration) config.get("newStatusMaxDuration");
LOGGER.info(
String.format(
"Workflow started for order %s to check if it passed new status (new, onHold and toBeVaidated) in time. maxDuration: %s",
orderId, maxDuration.toString().substring(2).toLowerCase()));
Workflow.await(maxDuration, () -> this.passed);
if (this.passed) {
LOGGER.info(
String.format(
"Order %s passed new status (new, onHold and toBeVaidated) before fulfilledMaxDuration: %s",
orderId, maxDuration.toString()));
return;
}
String message =
String.format(
"Order %s did not passed new status (new, onHold and toBeVaidated) after fulfilledMaxDuration: %s",
orderId, maxDuration.toString().substring(2).toLowerCase());
LOGGER.error(message);
// act.error(NewStatusWorkflowImpl.class.getName(), message);
}
@Override
public void updateStatus(String status) {
LOGGER.info(
String.format(
"Workflow for order %s received signal to update status to: %s", this.orderId, status));
if (!NEW_ORDER_STATUS.contains(status)) {
this.passed = true;
}
}
This is not a bug. This error is used to ensure that signals are not lost due to race condition with workflow completion command. Would you file an issue to change the log level?
Thank you @maxim
I will file an issue on Github
I still have the question of looking at the Workflow task failure from the UI or the log, how do I know what was the task? Which line of code was it trying to execute?
@maxim Any thoughts on the general question that I asked? Looking at the UI/log for a workflow task failure, how to identify the line of code the task was trying to run?
The unhandled command is not associated with any particular line in the workflow as it is caused by workflow completion. So talking about a particular line that caused workflow completion doesnât make sense.
You are right. What I asked does not make sense.
Will every âunhandled commandâ means the task was trying to complete the workflow? If that is the case, would it be better to change the âunhandled commandâ to âcomplete workflow failedâ?
If there are other cases for âunhandled commandâ, what could they be?