Greetings. I’ve been experimenting, trying to figure out the interaction between (i) cancellation scopes and (ii) external cancellation of a workflow (i.e. with untypedWorkflowStub(…).cancel()). I’ve come across something odd. The following is with server 1.3.2 and Java SDK 1.0.3.
So I have a parent workflow that creates a child workflow. The child workflow is coded to never complete (i.e. await(() -> false)) and simply waits for a cancellation, just so I can see what happens.
Once the parent and child workflows have been started, I externally issue a cancel, based on the workflowId. If I look at the history of the parent workflow instance, I see the following sequence of events:
WorkflowExecutionCancelRequested
WorkflowTaskScheduled
WorkflowTaskStarted
WorkflowTaskCompleted
Except that the parent workflow's entry point is never invoked (as observed from a debugger), which means the workflow is not actually cancelled; it stays running. I did notice that the cancellation event is sent to a sticky task queue.
So have a parent-workflow that creates a child-workflow. The child workflow is coded to not complete (i.e. await(() → false
Do you mean literally with the code await(() -> false)? If so, there is no way the child will ever be notified. At least for our Go SDK, the cancellation comes through the ctx object: ctx.Done() returns a channel, and receiving from it blocks until a cancellation is received.
Hey, I actually noticed another issue, which may not be causing this problem but definitely could cause one. I see that you are generating a UUID for the child you create in the cancellation scope. The language's native UUID generator is not safe to call within a workflow context, because it is not deterministic: it will yield different results when the workflow code is run multiple times (for example, on replay).
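To illustrate the determinism point, here is a minimal plain-Java sketch (no Temporal dependency). The class name ReplaySafeIds, the method seededUuid and the seed value are hypothetical, and this is not how the SDK generates ids; it only shows why an id derived from recorded state is stable across runs while UUID.randomUUID() is not. In real workflow code, the Java SDK's Workflow.randomUUID() is the intended replacement, assuming it is available in your SDK version.

```java
import java.util.Random;
import java.util.UUID;

// Toy illustration only: this is NOT the Temporal SDK's implementation.
final class ReplaySafeIds {

    // Builds a UUID from a seeded PRNG, so the same seed always yields the same id.
    // (The result is not a standards-compliant version-4 UUID; that is fine for a demo.)
    static UUID seededUuid(long seed) {
        Random rnd = new Random(seed);
        return new UUID(rnd.nextLong(), rnd.nextLong());
    }

    public static void main(String[] args) {
        long seed = 42L; // hypothetical seed, standing in for state recorded in workflow history

        // Same seed on "first run" and on "replay": the ids match.
        System.out.println(seededUuid(seed).equals(seededUuid(seed)));

        // Native generator: two calls practically never produce the same id.
        System.out.println(UUID.randomUUID().equals(UUID.randomUUID()));
    }
}
```

If the child workflow id differs between the original execution and a replay, the replayed code no longer matches the recorded history, which is why the id has to come from a deterministic source.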
I’ve debugged this issue for a while and I think I understand what’s going on.
When we create a child workflow in a cancellation scope, the promise that is supposed to handle cancellation requests gets attached to that new scope. That is okay, BUT when the child scope's run method completes, we also remove the child scope and its cancellation promise from the chain, thus removing the cancellation handler along with it. As a result, all further cancellation requests are ignored. We need to discuss internally what the fix should be; for now I can confirm that this is a bug in the SDK.
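The mechanism described above can be sketched with a toy model in plain Java. Everything here (ToyScope, LostCancellationDemo, cancelDelivered) is hypothetical and heavily simplified; it is not the SDK's actual code, only an illustration of how a handler attached to a child scope is lost once that scope is removed from the chain.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: names and structure are hypothetical, not the SDK's real classes.
final class ToyScope {
    private final ToyScope parent;
    private final List<ToyScope> children = new ArrayList<>();
    private final List<Runnable> cancelHandlers = new ArrayList<>();

    ToyScope(ToyScope parent) {
        this.parent = parent;
        if (parent != null) {
            parent.children.add(this);
        }
    }

    void onCancel(Runnable handler) {
        cancelHandlers.add(handler);
    }

    // Propagates a cancellation request down the chain of scopes still attached.
    void cancel() {
        for (Runnable handler : new ArrayList<>(cancelHandlers)) {
            handler.run();
        }
        for (ToyScope child : new ArrayList<>(children)) {
            child.cancel();
        }
    }

    // Called when the scope's run method returns: the scope is detached
    // from its parent, and its handlers become unreachable from the root.
    void complete() {
        if (parent != null) {
            parent.children.remove(this);
        }
    }
}

final class LostCancellationDemo {

    // Returns true if an external cancel request reaches the handler.
    static boolean cancelDelivered(boolean childScopeCompletedFirst) {
        boolean[] delivered = {false};
        ToyScope root = new ToyScope(null);
        ToyScope child = new ToyScope(root);

        // The handler that should react to workflow cancellation
        // ends up attached to the child scope instead of the root.
        child.onCancel(() -> delivered[0] = true);

        if (childScopeCompletedFirst) {
            child.complete();
        }
        root.cancel(); // the external cancellation request arrives
        return delivered[0];
    }

    public static void main(String[] args) {
        System.out.println(cancelDelivered(false)); // true
        System.out.println(cancelDelivered(true));  // false
    }
}
```

In this model the request is silently dropped in the second case, which matches the symptom in the history: the workflow task completes, but nothing in the workflow reacts to the cancellation.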