Best way to kill and restart a workflow

I have a use case where, if a workflow with the given ID is open, I want to kill it and start a new one. This is what I am doing right now:

try {
    workflowHandler.startWorkflow(request);
} catch (final WorkflowAlreadyStartedException ex) {
    workflowHandler.terminateWorkflow(workflowId, "restart required");
    workflowHandler.startWorkflow(request);
}

workflowHandler is my wrapper class that calls newWorkflowStub() and terminate() respectively.

The problem is that when I terminate and immediately try to start a workflow with the same ID, I get WorkflowAlreadyStartedException. I assumed that the terminate API call is strongly consistent, but it doesn't seem to be. Is this expected?

Also, is this the best way to do it if I need to kill a workflow and immediately start a new one with the same ID? Another option I see is to reset the open workflow (ResetWorkflowExecutionRequest), but I am guessing this adds to the existing workflow history, which I want to avoid.

I personally wouldn't use terminate for this. I would use signalWithStart to start the workflow if it is not running, or to signal it if it is already running. Upon receiving such a signal, the running workflow would call continueAsNew to start the new one.

I believe terminating a workflow is fully consistent. But I still think that your solution has potential race conditions if the code in your sample is executed by two processes at the same time.
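
To make the race concrete, here is a minimal plain-Java sketch that replays one unlucky interleaving of two callers, A and B, each running the start / terminate / start sequence from the question. FakeWorkflowService is a hypothetical in-memory stand-in written only for this illustration; it is not the Temporal API, and a false return from tryStart stands in for WorkflowAlreadyStartedException.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory stand-in for the workflow service (NOT the Temporal API).
class FakeWorkflowService {
    // workflowId -> label of whoever started the currently open run
    private final Map<String, String> openRuns = new HashMap<>();
    // labels of the runs that were terminated, in order
    final List<String> terminated = new ArrayList<>();

    // Returns false if a run with this id is already open
    // (stands in for WorkflowAlreadyStartedException).
    boolean tryStart(String workflowId, String owner) {
        if (openRuns.containsKey(workflowId)) {
            return false;
        }
        openRuns.put(workflowId, owner);
        return true;
    }

    // Terminates whatever run is open for the id, no matter who started it.
    void terminate(String workflowId) {
        String owner = openRuns.remove(workflowId);
        if (owner != null) {
            terminated.add(owner);
        }
    }

    String openRunOwner(String workflowId) {
        return openRuns.get(workflowId);
    }
}

class RestartRaceDemo {
    // Replays one possible interleaving of callers A and B.
    static FakeWorkflowService replay() {
        FakeWorkflowService service = new FakeWorkflowService();
        String id = "order-42";               // hypothetical workflow id
        service.tryStart(id, "old");          // a run is already open

        service.tryStart(id, "A");            // A: start fails, run is open
        service.tryStart(id, "B");            // B: start fails, run is open
        service.terminate(id);                // A: terminates the old run
        service.tryStart(id, "A");            // A: starts its new run
        service.terminate(id);                // B: terminates A's brand-new run
        service.tryStart(id, "B");            // B: starts its own run
        return service;
    }

    public static void main(String[] args) {
        FakeWorkflowService service = replay();
        System.out.println("terminated runs: " + service.terminated);
        System.out.println("open run owner: " + service.openRunOwner("order-42"));
    }
}
```

In this interleaving A's freshly started run is terminated by B, even though each caller's code looks correct in isolation. Letting the running workflow react to a signal avoids this, because only one party decides when the restart happens.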

Ah nice, I didn't realize I could call continueAsNew inside the signal method. However, as I understand it, continueAsNew starts a new workflow and continues from the existing workflow state, right? Is it possible to make it start from the beginning of the workflow?

continueAsNew continues the current workflow execution as a new run, so it's always a new execution. You can pass the current execution's workflow state as parameters to the next execution if needed.

Hi @tihomir, wondering if “continueAsNew” will terminate all the existing running activities, just like what “cancel” will do?

Is your scenario an async activity invocation and in your workflow code you explicitly call continueAsNew via Workflow.newContinueAsNewStub?
This will not terminate your running activity (it will complete, but its results will not be recorded in workflow history). This is the same as if, instead of calling continueAsNew, you just completed your workflow execution while the async activity was still running.

If you want to give your running activity a chance to complete gracefully, invoke it inside a cancellation scope and call scope.cancel() before you call continueAsNew in your workflow code. Note that your activities must heartbeat in this case to receive ActivityCanceledException.

So, for example, when calling continueAsNew inside a signal method as mentioned in this thread:

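// activityCancellationScope, activityPromise and continueAsNewStub are
// assumed to be fields set up elsewhere in the workflow implementation.
public void mySignalMethod(String input) {
    // request cancellation of the activity running inside the scope
    activityCancellationScope.cancel();
    // allow activity to cancel gracefully
    // (get() may throw if the activity reports cancellation; handle as needed)
    activityPromise.get();
    // continue as a new run, passing the signal input along
    continueAsNewStub.execute(input);
}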

will terminate all the existing running activities, just like what “cancel” will do?

Cancelling a workflow (from client untyped stub for example) does not stop running activities (neither does terminate). Their results are not recorded in workflow history.
An activity can choose to ignore a cancellation request and complete execution. For an async invoked activity, ActivityCanceledException is delivered on the next heartbeat (which can happen after the workflow is already canceled). If you set
.setCancellationType(ActivityCancellationType.WAIT_CANCELLATION_COMPLETED)
in your activity options, ActivityCompletionException is delivered via heartbeat.

@tihomir Thanks for your reply!

Is your scenario an async activity invocation and in your workflow code you explicitly call continueAsNew via Workflow.newContinueAsNewStub?

Yes, this is my scenario. I have an async activity invocation and am looking for a way to stop all async activities first and then start a new workflow. One minor thing: I am using Workflow.continueAsNew(args) instead of Workflow.newContinueAsNewStub.

So, with that, are you recommending:
(1) set the activity option as .setCancellationType(ActivityCancellationType.WAIT_CANCELLATION_COMPLETED)
(2) invoke it inside cancellation scope, and call scope.cancel() before you call continueAsNew in my workflow code
?

Thanks!

  1. If you have control over all your activities' code and they heartbeat, then WAIT_CANCELLATION_COMPLETED is OK to use. Make sure you handle ActivityCompletionException so you can perform cleanup work before completing the activity execution if necessary.
    If your activities do not heartbeat then they will not receive cancellation notification.
  2. Yes if 1 is true.
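
As a rough illustration of why heartbeating matters here, the following plain-Java model shows cancellation surfacing only at a heartbeat call. Everything in it (HeartbeatCancellationModel, CanceledException, runActivity) is hypothetical and written for this sketch; it does not use the Temporal SDK, it only mimics the behavior described above.

```java
// Plain-Java model (NOT the Temporal SDK) of cancellation being delivered
// to an activity only when it heartbeats.
class HeartbeatCancellationModel {
    // Hypothetical stand-in for the exception an activity sees on cancellation.
    static class CanceledException extends RuntimeException {}

    private boolean cancelRequested = false;

    void requestCancel() {
        cancelRequested = true;
    }

    // Stand-in for a heartbeat call: the only place cancellation surfaces.
    void heartbeat() {
        if (cancelRequested) {
            throw new CanceledException();
        }
    }

    // Runs up to `steps` units of work. The cancel request arrives just
    // before step `cancelAt`. Returns how many steps were completed.
    int runActivity(int steps, int cancelAt, boolean heartbeats) {
        int done = 0;
        try {
            for (int i = 0; i < steps; i++) {
                if (i == cancelAt) {
                    requestCancel();
                }
                if (heartbeats) {
                    heartbeat();
                }
                done++;
            }
        } catch (CanceledException e) {
            // cleanup work would go here before the activity finishes
        }
        return done;
    }

    public static void main(String[] args) {
        // heartbeating activity stops once the cancel request is seen
        System.out.println(new HeartbeatCancellationModel().runActivity(5, 2, true));  // 2
        // non-heartbeating activity never notices and runs to the end
        System.out.println(new HeartbeatCancellationModel().runActivity(5, 2, false)); // 5
    }
}
```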

Sorry for the newbie question. When you say "if your activities do heartbeat", do you mean calling activity.RecordHeartbeat in my workflow code?

For an activity to heartbeat, you should:

  • Call activity.RecordHeartbeat from the activity code. Code from the activity package must not be called from workflow code.
  • Set the heartbeat timeout (ActivityOptions.HeartbeatTimeout) to a value smaller than the activity's start-to-close timeout. This is needed because heartbeat calls are throttled by the SDKs to 4/5 of the heartbeat timeout, so if it is not set, no heartbeat is sent to the service.
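
To put a number on the throttling mentioned above, here is a small arithmetic sketch in plain Java. The 4/5 factor is taken from this thread; HeartbeatThrottle and maxInterval are hypothetical names for illustration, and real SDKs may apply additional limits.

```java
import java.time.Duration;

class HeartbeatThrottle {
    // Assumption from the thread: heartbeats reach the service at most once
    // per 4/5 of the configured heartbeat timeout.
    static Duration maxInterval(Duration heartbeatTimeout) {
        return heartbeatTimeout.multipliedBy(4).dividedBy(5);
    }

    public static void main(String[] args) {
        // With a 10 second heartbeat timeout the throttle window is 8 seconds.
        System.out.println(maxInterval(Duration.ofSeconds(10))); // PT8S
    }
}
```

So with a 10 second heartbeat timeout, an activity that calls heartbeat in a tight loop would still only report to the service about every 8 seconds, which is also roughly how long a cancellation request could take to reach it.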

I failed to find activity.RecordHeartbeat. Is that something only for Go?

@Maggie all SDKs support heartbeating. Which one do you use?

We are using the Java SDK. Should we use Activity.getExecutionContext().heartbeat()?

Yes, see sample here.

gotcha, thanks!