I’ve noticed multiple CancelledFailure exceptions being thrown when canceling a workflow execution in our production pods. The Temporal documentation says:
CancelledFailure is thrown in a Workflow when a cancellation scope or the entire Workflow has been cancelled or set as the cause for when a child Workflow or Activity has been cancelled.
Trying to better understand, is CancelledFailure exception technically an exception?
===================================================================================
An example workflow execution having CancelledFailure exception just in case.
Trying to better understand, is CancelledFailure exception technically an exception?
Yes, see the HelloCancellationScope example in the Java samples repo. In the case of activity cancellation it is the cause of an ActivityFailure; in the case of child workflow cancellation it is the cause of a ChildWorkflowFailure.
Depending on what your workflow code does when a workflow cancel is requested, you could see this thrown more than once, for example if you try/catch it and then try to send more commands to the server for this execution. To avoid that, you can run “cleanup” code in a detached cancellation scope, as shown in this sample.
I doubt the CancelledFailure in our case is caused by an ActivityFailure. Instead, I think the CancelledFailure occurs at the moment our client service sends a cancel request. Is it expected that we will see it every time we cancel a workflow if we don’t run “cleanup” code?
As for the solution of running “cleanup” code, can you elaborate a bit on what this “cleanup” code does? (Let me know if there is any doc/video I missed.) Also, if we apply this “cleanup” code, should we still throw the exception after the cleanup, as this example does?
It depends on when the cancellation request was received and what your workflow code is doing at that point, so you would probably need to prepare for all possible cases and do error handling where needed.
Cancellation cancels the workflow context, so you can wrap the whole workflow method in a try/catch and catch CanceledFailure.
Any activities you want to invoke in the catch block (typically called “cleanup”), for example to update a database or notify some service about the cancellation, need to run in a detached cancellation scope.
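As a rough sketch of that pattern (not your actual code): the interfaces `OrderWorkflow` and `CleanupActivities`, the method `notifyCancelled`, and the timeouts below are all made up for illustration, and `Workflow.sleep` stands in for the real workflow logic. Note that a cancelled activity call would surface as an ActivityFailure with CanceledFailure as its cause, so a real workflow may need to catch that as well.

```java
import io.temporal.activity.ActivityInterface;
import io.temporal.activity.ActivityOptions;
import io.temporal.failure.CanceledFailure;
import io.temporal.workflow.CancellationScope;
import io.temporal.workflow.Workflow;
import io.temporal.workflow.WorkflowInterface;
import io.temporal.workflow.WorkflowMethod;
import java.time.Duration;

public class CancellationCleanupSketch {

  // Hypothetical activity interface, used only for illustration.
  @ActivityInterface
  public interface CleanupActivities {
    void notifyCancelled(String orderId);
  }

  // Hypothetical workflow interface.
  @WorkflowInterface
  public interface OrderWorkflow {
    @WorkflowMethod
    void run(String orderId);
  }

  public static class OrderWorkflowImpl implements OrderWorkflow {
    private final CleanupActivities activities =
        Workflow.newActivityStub(
            CleanupActivities.class,
            ActivityOptions.newBuilder()
                .setStartToCloseTimeout(Duration.ofSeconds(30))
                .build());

    @Override
    public void run(String orderId) {
      try {
        // Stand-in for the real workflow logic: a timer that is interrupted
        // with CanceledFailure when the workflow is cancelled.
        Workflow.sleep(Duration.ofHours(1));
      } catch (CanceledFailure e) {
        // The workflow's own scope is already cancelled at this point, so an
        // activity scheduled directly here would be cancelled as well.
        // A detached scope is not affected by that cancellation.
        CancellationScope cleanup =
            Workflow.newDetachedCancellationScope(
                () -> activities.notifyCancelled(orderId));
        cleanup.run();
        // Rethrow so the cancellation still propagates out of the workflow.
        throw e;
      }
    }
  }
}
```

Rethrowing after the cleanup, as the sample does, should let the execution close as cancelled; swallowing the exception would let the workflow method return normally instead.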
For activities that might be executing when the cancel request is received, how cancellation is handled depends on ActivityOptions->setCancellationType (see ActivityCancellationType for more info).
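For reference, setting the cancellation type could look roughly like this. The class and method names and the timeout values are placeholders I picked for the example; only the `ActivityOptions` builder calls and the `ActivityCancellationType` enum come from the SDK.

```java
import io.temporal.activity.ActivityCancellationType;
import io.temporal.activity.ActivityOptions;
import java.time.Duration;

public class ActivityCancellationOptionsSketch {

  // Builds options for an activity whose cancellation the workflow waits on.
  // The timeout values are arbitrary placeholders.
  public static ActivityOptions waitForCancellation() {
    return ActivityOptions.newBuilder()
        .setStartToCloseTimeout(Duration.ofMinutes(10))
        // A heartbeat timeout is set because the activity is expected to
        // heartbeat in order to learn about the cancellation request.
        .setHeartbeatTimeout(Duration.ofSeconds(10))
        // WAIT_CANCELLATION_COMPLETED: wait until the activity reports that
        // it has finished handling the cancellation.
        // TRY_CANCEL: request cancellation and report the activity as
        // cancelled to the workflow without waiting for it.
        // ABANDON: do not request cancellation, just stop waiting.
        .setCancellationType(ActivityCancellationType.WAIT_CANCELLATION_COMPLETED)
        .build();
  }
}
```

You would pass the result to `Workflow.newActivityStub` when creating the activity stub in the workflow.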
With WAIT_CANCELLATION_COMPLETED and TRY_CANCEL, your activities need to heartbeat to receive the cancellation request, as shown here. An activity can then handle the request and re-throw the error, but it can also choose to ignore the cancellation and keep running for a long time.
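A minimal sketch of an activity that heartbeats and reacts to cancellation might look like the following. `BatchActivities` and `processBatch` are hypothetical names, and the loop body is a stand-in for real work; this applies to regular activities that heartbeat.

```java
import io.temporal.activity.Activity;
import io.temporal.activity.ActivityExecutionContext;
import io.temporal.activity.ActivityInterface;
import io.temporal.client.ActivityCanceledException;

public class HeartbeatingActivitySketch {

  // Hypothetical activity interface, used only for illustration.
  @ActivityInterface
  public interface BatchActivities {
    int processBatch(int itemCount);
  }

  public static class BatchActivitiesImpl implements BatchActivities {
    @Override
    public int processBatch(int itemCount) {
      ActivityExecutionContext context = Activity.getExecutionContext();
      int processed = 0;
      for (int i = 0; i < itemCount; i++) {
        try {
          // Heartbeating is how the activity learns about a cancellation
          // request; the call throws once cancellation has been requested.
          context.heartbeat(i);
        } catch (ActivityCanceledException e) {
          // Release any resources here, then rethrow so the activity is
          // reported as cancelled. Swallowing the exception instead would
          // let the activity keep running.
          throw e;
        }
        processed++; // stand-in for the real per-item work
      }
      return processed;
    }
  }
}
```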
For child workflows with WAIT_CANCELLATION_COMPLETED or WAIT_CANCELLATION_REQUESTED, when you cancel the parent workflow you should be able to catch CanceledFailure in the child and perform some actions there, again in a detached scope if needed.
In your workflow code you can use your typical error handling for activities and child workflows and check for CanceledFailure as well.
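Something along these lines, as a sketch only: all interface and method names (`ReportActivities`, `ChildWorkflow`, `ParentWorkflow`, `causedByCancellation`) are invented for the example, and whether you rethrow after handling is up to your application.

```java
import io.temporal.activity.ActivityInterface;
import io.temporal.activity.ActivityOptions;
import io.temporal.failure.ActivityFailure;
import io.temporal.failure.CanceledFailure;
import io.temporal.failure.ChildWorkflowFailure;
import io.temporal.workflow.ChildWorkflowCancellationType;
import io.temporal.workflow.ChildWorkflowOptions;
import io.temporal.workflow.Workflow;
import io.temporal.workflow.WorkflowInterface;
import io.temporal.workflow.WorkflowMethod;
import java.time.Duration;
import org.slf4j.Logger;

public class CancellationHandlingSketch {

  // Hypothetical interfaces, used only for illustration.
  @ActivityInterface
  public interface ReportActivities {
    void buildReport(String id);
  }

  @WorkflowInterface
  public interface ChildWorkflow {
    @WorkflowMethod
    void runChild(String id);
  }

  @WorkflowInterface
  public interface ParentWorkflow {
    @WorkflowMethod
    void runParent(String id);
  }

  // Hypothetical helper: true when the direct cause is a cancellation.
  public static boolean causedByCancellation(Throwable failure) {
    return failure.getCause() instanceof CanceledFailure;
  }

  public static class ParentWorkflowImpl implements ParentWorkflow {
    private static final Logger log = Workflow.getLogger(ParentWorkflowImpl.class);

    private final ReportActivities activities =
        Workflow.newActivityStub(
            ReportActivities.class,
            ActivityOptions.newBuilder()
                .setStartToCloseTimeout(Duration.ofMinutes(5))
                .build());

    private final ChildWorkflow child =
        Workflow.newChildWorkflowStub(
            ChildWorkflow.class,
            ChildWorkflowOptions.newBuilder()
                .setCancellationType(
                    ChildWorkflowCancellationType.WAIT_CANCELLATION_COMPLETED)
                .build());

    @Override
    public void runParent(String id) {
      try {
        activities.buildReport(id);
      } catch (ActivityFailure e) {
        if (causedByCancellation(e)) {
          log.info("Activity was cancelled for {}", id);
        }
        throw e; // propagate; a real workflow might compensate instead
      }
      try {
        child.runChild(id);
      } catch (ChildWorkflowFailure e) {
        if (causedByCancellation(e)) {
          log.info("Child workflow was cancelled for {}", id);
        }
        throw e;
      }
    }
  }
}
```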
Thank you @tihomir for the detailed explanation! This is really helpful! We are actually using local activities. Do you recommend the same typical error handling for local activities as you pasted above?