We are seeing error logs like this:
java.lang.RuntimeException: Failure processing workflow task. WorkflowId=3f22bd16-1e38-3134-9381-93bb15ccc4ee, RunId=fadced22-7be5-4b59-b8c1-c7badb761032
    at io.temporal.internal.worker.WorkflowWorker$TaskHandlerImpl.wrapFailure(WorkflowWorker.java:342)
    at io.temporal.internal.worker.WorkflowWorker$TaskHandlerImpl.wrapFailure(WorkflowWorker.java:280)
    at io.temporal.internal.worker.PollTaskExecutor.lambda$process$0(PollTaskExecutor.java:79)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630)
    at java.base/java.lang.Thread.run(Thread.java:832)
Caused by: io.grpc.StatusRuntimeException: INVALID_ARGUMENT: UnhandledCommand
    at io.grpc.stub.ClientCalls.toStatusRuntimeException(ClientCalls.java:262)
    at io.grpc.stub.ClientCalls.getUnchecked(ClientCalls.java:243)
    at io.grpc.stub.ClientCalls.blockingUnaryCall(ClientCalls.java:156)
    at io.temporal.api.workflowservice.v1.WorkflowServiceGrpc$WorkflowServiceBlockingStub.respondWorkflowTaskCompleted(WorkflowServiceGrpc.java:2667)
    at io.temporal.internal.worker.WorkflowWorker$TaskHandlerImpl.lambda$sendReply$0(WorkflowWorker.java:376)
    at io.temporal.internal.common.GrpcRetryer.lambda$retry$0(GrpcRetryer.java:109)
    at io.temporal.internal.common.GrpcRetryer.retryWithResult(GrpcRetryer.java:127)
    at io.temporal.internal.common.GrpcRetryer.retry(GrpcRetryer.java:106)
    at io.temporal.internal.worker.WorkflowWorker$TaskHandlerImpl.sendReply(WorkflowWorker.java:369)
    at io.temporal.internal.worker.WorkflowWorker$TaskHandlerImpl.handle(WorkflowWorker.java:318)
    at io.temporal.internal.worker.WorkflowWorker$TaskHandlerImpl.handle(WorkflowWorker.java:280)
    at io.temporal.internal.worker.PollTaskExecutor.lambda$process$0(PollTaskExecutor.java:73)
    ... 3 common frames omitted
We found the workflow and it seems to have recovered on its own. (Side question: is it possible to keep these from being logged at ERROR level? Our system pages us whenever it sees an ERROR log.)
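The only workaround we can think of is demoting the whole Temporal worker logger, which feels too blunt. A minimal sketch of what we mean, assuming Logback is our SLF4J backend (the logger name io.temporal.internal.worker is only guessed from the stack trace, not verified):

import ch.qos.logback.classic.Level;
import ch.qos.logback.classic.Logger;
import org.slf4j.LoggerFactory;

// Hypothetical workaround, not something we have deployed: demote everything under
// io.temporal.internal.worker to WARN at worker startup so these task failures stop
// paging us. Downside: it would also hide genuinely interesting worker errors, so we
// would prefer a more targeted option if one exists.
public final class TemporalLogDemotion {

  private TemporalLogDemotion() {}

  public static void demoteWorkerLogger() {
    // Only valid when Logback is the SLF4J binding; the cast fails otherwise.
    Logger workerLogger = (Logger) LoggerFactory.getLogger("io.temporal.internal.worker");
    workerLogger.setLevel(Level.WARN);
  }
}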
This happens frequently, so we suspect there might be a bug somewhere. How do we figure out what the failed workflow task was trying to do? How do we map it back to a line of our code?
Looking at it from the dashboard, all we can do is guess that the workflow task was about to close the workflow but a signal came in before the close was accepted. We are really not sure, though.
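To make that guess concrete, our workflow has roughly the shape sketched below (names are illustrative, not our real code): a workflow method that eventually returns while a signal handler is still registered, so a signal can race the completion.

import java.time.Duration;
import io.temporal.workflow.SignalMethod;
import io.temporal.workflow.Workflow;
import io.temporal.workflow.WorkflowInterface;
import io.temporal.workflow.WorkflowMethod;

// Illustrative only: a workflow whose completion can race an incoming signal.
@WorkflowInterface
interface OrderWorkflow {

  @WorkflowMethod
  void processOrder(String orderId);

  @SignalMethod
  void cancel(String reason);
}

class OrderWorkflowImpl implements OrderWorkflow {
  private boolean cancelled = false;

  @Override
  public void processOrder(String orderId) {
    // Stand-in for the real work (activities, timers, etc.).
    Workflow.sleep(Duration.ofMinutes(5));
    // Returning here makes the worker send a "complete workflow execution" command.
    // Our theory: if a cancel() signal reaches the server after this workflow task
    // started but before the completion is accepted, the server rejects the reply
    // with INVALID_ARGUMENT: UnhandledCommand (the error above) and re-dispatches
    // the task, which would explain why the workflow recovers on its own.
  }

  @Override
  public void cancel(String reason) {
    cancelled = true;
  }
}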
Thank you very much!