Problems cancelling a running activity from parent workflow

I have a workflow which, based on an incoming signal, needs to stop any currently running activity (i.e. on a cancel event, we need to stop whatever we are doing and go to cleanup).

First, I create an activityCtx with a cancel function:

activityCtx, cancelActivity := workflow.WithCancel(ctx)

Then launch my activity using that activityCtx.

When I receive that signal, I invoke cancelActivity().

I can see the activity is cancelled in the workflow, because I get a CanceledError when calling Get on the future returned by ExecuteActivity:

activityErr = workflow.ExecuteActivity(activityCtx, SomeActivity, payload).Get(activityCtx, nil)
  • this returns CanceledError

Questions:

  • In SomeActivity, I select on ctx.Done(), but that channel never fires, so my activity keeps running. I send heartbeats inside the activity with activity.RecordHeartbeat(ctx, 100), where ctx is the context passed into the activity method as its first argument. Is my expectation correct, i.e. should ctx.Done() fire inside the activity when the cancel function is called?
  • The Go SDK docs contain this statement: "Cancellation is only delivered to Activities that call RecordActivityHeartbeat". Is it really that specific function, or is activity.RecordHeartbeat enough?

Also attached are the options I use to start the activity:

options := workflow.ActivityOptions{
	StartToCloseTimeout: time.Hour,
	HeartbeatTimeout:    time.Minute,
	// Optionally provide a customized RetryPolicy.
	// Temporal retries failures by default, this is just an example.
	RetryPolicy: &temporal.RetryPolicy{
		InitialInterval:    time.Second,
		BackoffCoefficient: 1.6,
		MaximumInterval:    5 * time.Minute,
		MaximumAttempts:    1,
	},
}

Does the activity call heartbeat periodically? Calling it once is not enough.

Yes, we have a ticker which calls heartbeat periodically. We have tried various intervals: 1 sec, 5 sec, 0.5 sec…

I can see that even after cancelFunc() is called in the parent workflow, the activity keeps going with those heartbeats (I am logging the attempts).

To give a bit more context, I launch a goroutine inside the activity to wait for a successful REST call (I believe you can launch a goroutine inside an activity).

Then I have a ticker channel and a select that either receives from ctx.Done(), receives the result the goroutine sends back on a channel, or logs a heartbeat on each tick.

I have tried both logging a heartbeat once and not doing so before launching the goroutine. It does not make a difference.

Do you see the activity cancellation in the workflow history? It is of ActivityTaskCancelRequested event type.

Oh, I think the cancel event triggered the workflow to be closed too soon?

I see my workflow is completed, but with a pending activity:

activityId: 39
activityType.name: SomeActivity
state: CancelRequested
heartbeatDetails: [100]
lastHeartbeatTime: May 18, 2021 5:09 PM
lastStartedTime: 2021-05-19T00:09:50.000Z

I do see ActivityTaskCancelRequested

I suppose I need to wait until all activities are closed before shutting down the workflow? How do you synchronize? Or do we need to?

Activity is going to get its context cancelled on the next heartbeat if a workflow is closed. So no need to wait for it to cancel unless you need to execute some logic after that.

  1. Is there any way to make the activity stop right away? How long does it take after ActivityTaskCancelRequested shows up before the activity is shut down?
  2. Even after the workflow is closed, I still see the heartbeat logging (meaning the activity is still running).
  3. There might be another activity which needs to clean up after the cancel, so we need to make sure the current activity is indeed done first.

I do notice if I adjust the heartbeat timeout to be shorter when starting an activity, it will close sooner.

I am still confused: given that I see ActivityTaskCancelRequested, should I expect ctx.Done() in my activity to be triggered?

To avoid excessive calls to the service, heartbeats are not sent to the service until up to 80% of the heartbeat timeout has elapsed. So if you want to speed up the notification of the activity about the cancellation/workflow completion, reduce the heartbeat timeout (HeartbeatTimeout) from 1 minute to some lower value.

  3. There might be another activity which needs to clean up after the cancel, so we need to make sure the current activity is indeed done first

Set ActivityOptions.WaitForCancellation to true to wait (by blocking on the activity Future) for an activity cancellation.

Will try that for sure.

Going back to my previous question: I should get the ctx.Done() signal inside my activity, correct?

Indeed, when I set WaitForCancellation to true, the function never returns… but again, if ctx.Done() never fires in my activity, I cannot exit.

I think I understand now how it works… I just want to confirm my understanding.

I have a heartbeat timeout of 20 seconds, so 80% is 16 seconds.
I heartbeat every 2 seconds.

Cancel is called… so at around the 16-second mark I will get ctx.Done(), which seems to be what I am observing… Is that correct?

Yes, it is correct. We have plans to separate cancellation from the heartbeat and deliver it immediately. But for now, this is the way it works.

Got it thanks… Thanks so much for your help