Handling user feedback, retries, and timeouts

mrsaints · July 14, 2020, 5:01pm

Originally on Slack.

I have a couple of points I want to verify, and questions. They are mostly in the context of the Go client.

1. On the topic of human interaction, let’s say I need to capture a user’s details (e.g. through a signal), act on them (e.g. through an activity), and give a response to the user. Would the simplest option here be a signal and a query? And is it generally acceptable to query for structs rather than simple strings?
2. If I have a somewhat long-running activity that fails midway through because the VM running the worker shuts down, I’m guessing the activity will eventually fail, assuming it has a timeout? And I can handle this case by setting a retry policy?
3. What’s the best way to handle a workflow timeout? For example, I want to carry out some compensation action if a workflow times out.

maxim · July 14, 2020, 5:29pm

1. The simplest is a signal followed by a query. Temporal guarantees read-after-write consistency for this interaction. Yes, structs are supported as query results. This answer gives more options for implementing synchronous workflow updates.
2. Correct, the simplest way to retry it is by setting a retry policy.
3. Don’t rely on the workflow timeout for any business logic, as it is essentially a “kill -9” on a timer. Use timers inside the workflow for any business-related timeouts. In your case, set a timer that performs the cleanup when it fires.

mrsaints · July 14, 2020, 6:53pm

I’m giving this a try now.

Assuming I want a workflow-wide timeout with a cleanup activity, would the idea be to use a selector and a timer, like so:

selector := workflow.NewSelector(ctx)
timerFuture := workflow.NewTimer(ctx, workflowWideTimeout)
selector.AddFuture(timerFuture, func(f workflow.Future) {
    if ShouldCleanup {
        _ = workflow.ExecuteActivity(ctx, CleanupActivity).Get(ctx, nil)
    }
})

And for every activity:

f := workflow.ExecuteActivity(ctx, DoSomethingActivity)
selector.AddFuture(f, func(f workflow.Future) {
    _ = f.Get(ctx, &doSomethingResult)
})

Followed by a:

selector.Select(ctx)

So that either the timeout kicks in or the activity completes.

If so, I suppose my next question is, what about error handling within the future? How does that get propagated back to the workflow? Would that be a matter of mutating some err, and checking for it?

maxim · July 14, 2020, 7:10pm

I think your approach is unnecessarily complicated. I would start a separate goroutine to execute the main workflow logic.

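func MyWorkflow(ctx workflow.Context) error {
    cancellableCtx, cancelHandler := workflow.WithCancel(ctx)
    // Run the main workflow logic in its own workflow goroutine.
    workflow.Go(cancellableCtx, func(ctx workflow.Context) {
        MyWorkflowImp(ctx)
    })
    // Wait for the business-level timeout, then cancel the main logic.
    if err := workflow.Sleep(ctx, CleanupInterval); err != nil {
        return err
    }
    cancelHandler()
    // Run the cleanup on the original context, which has not been cancelled.
    return workflow.ExecuteActivity(ctx, CleanupActivity).Get(ctx, nil)
}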

mrsaints · July 14, 2020, 8:23pm

I see! That’s very helpful.

I’m guessing the workflow will continue to run even after the goroutine finishes? And I can make it exit when the goroutine finishes by using a timer and a channel:

selector := workflow.NewSelector(ctx)
selector.AddFuture(workflow.NewTimer(ctx, CleanupTimeout), func(f workflow.Future) {})

cancellableCtx, cancelHandler := workflow.WithCancel(ctx)
doneCh := workflow.NewChannel(ctx)
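selector.AddReceive(doneCh, func(c workflow.ReceiveChannel, more bool) {
    // Consume the message, as AddReceive callbacks are expected to do.
    var done string
    c.Receive(ctx, &done)
})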

workflow.Go(cancellableCtx, func(ctx workflow.Context) {
    MyWorkflowImp(ctx)
    doneCh.Send(ctx, "done")
})

selector.Select(ctx)

cancelHandler()

workflow.ExecuteActivity(ctx, CleanupActivity).Get(ctx, nil)

maxim · July 14, 2020, 8:50pm

This would work. Instead of doneCh you could also use a Future, created through workflow.NewFuture.

blynch · March 6, 2023, 7:15pm

Is there a similar example with the Java SDK? Specifically I’m wondering if there’s a way to create a newCancellationScope where the containing activities do not need to be invoked explicitly async.

maxim · March 6, 2023, 7:49pm

Invoke CancellationScope.run through Async.procedure.

blynch · March 6, 2023, 7:54pm

I see, thanks. Does this look correct?

val cancellationScope = Workflow.newCancellationScope(Runnable { syncBusinessLogic() })
val procedure = Async.procedure { cancellationScope.run() }
try {
    procedure.get(7, TimeUnit.DAYS)
} catch (e: TimeoutException) {
    cancellationScope.cancel()
    cleanup()
}

maxim · March 6, 2023, 7:58pm

Yes. You might also wait for the completion of the cancellation of syncBusinessLogic.

blynch · March 6, 2023, 7:59pm

Thanks. How do I do that? The return type of cancel() is void.

maxim · March 6, 2023, 8:01pm

It depends on your business logic. For example, you could pass a SettablePromise into syncBusinessLogic as a parameter and then wait on it during cleanup.

blynch · March 6, 2023, 8:03pm

Wouldn’t calling cancel() ensure that no further activities defined within the cancellationScope are run?

maxim · March 6, 2023, 8:08pm

You might wait for the cancellation of already running activities, for example. It is also possible to run compensations in a disconnected context.

blynch · March 6, 2023, 8:11pm

I see. I had hoped that CancellationScope.cancel would propagate the cancellation to any containing activities. Is there no way to achieve that behavior?

maxim · March 6, 2023, 8:13pm

It does propagate. But an activity can take 10 hours to acknowledge cancellation and you can decide to wait for it.

blynch · March 6, 2023, 8:16pm

Oh I see, for long-running activities? I.e. similar to a Java thread checking its interrupted state, the activity would need to check if it’s been signaled for cancellation?

Would an activity in a waiting-to-retry state be automatically canceled via propagation?

maxim · March 6, 2023, 8:22pm

“activity would need to check if it’s been signaled for cancellation?”

An activity must heartbeat to learn about cancellation.

“Would an activity in a waiting-to-retry state be automatically canceled via propagation?”

Yes, an activity waiting for retry will be canceled immediately.
