Valid use of a Heartbeat timer?

jpalof · November 11, 2024, 7:17pm

We are looking at migrating a long running job from a legacy Java application to a Temporal workflow. The job can be thought of as a large batch update. It takes a list of items as input from the user and may take a few minutes to a few hours to complete depending on the input.

It has the following phases of processing:

1. The application determines additional impacted items through analysis of dependencies of the user specified items
2. The application then processes the user specified and impacted items, updating them in-memory
3. Assuming all items have been successfully updated, only then are the changes committed to the legacy DB, otherwise the in-memory changes are cleared
4. The application builds a report of the results
5. The application sends a completion notification

Phases 1, 4, and 5 would make sense as distinct activities. Due to the in-memory processing of 2, both 2 & 3 would need to be in a single activity based on my Temporal understanding.

My dilemma lies in the fact that activity cancellation is dependent on Heartbeats and we want the cancellation during 2 to be responsive. I can sprinkle heartbeats into 2 so they occur as frequently as desired. However, the heartbeat timeout of the activity would need to be no less than the longest anticipated DB commit. Thus, the responsiveness of the cancellation would roughly be no better than the configured timeout due to Heartbeat throttling.

It seems to me that if I want timely cancellations, 2 & 3 really need to be distinct activities, implying that 2 would need to be redesigned to store the intermediate changes vs the in-memory approach. That would require more work than we can afford on the legacy system.

One alternative I found in a forum post (Best practices for long-running activities) was to create a component that simply issues the heartbeat on a timer. That way I could have a faster heartbeat for 2 and simply issue timer based heartbeats during the commit. Would that be a reasonable approach?
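A minimal sketch of such a timer-driven heartbeat component, using only the JDK. The names `TimedHeartbeater` and `runWithHeartbeat` are hypothetical, and the `heartbeat` callback stands in for whatever call the activity uses to report a heartbeat (in the Java SDK this would presumably be `heartbeat(...)` on an `ActivityExecutionContext` captured on the activity thread). Exceptions thrown by the callback are recorded rather than rethrown on the timer thread, so the activity can inspect them afterwards.

```java
import java.time.Duration;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical helper: invokes a heartbeat callback at a fixed interval until closed. */
final class TimedHeartbeater implements AutoCloseable {
    private final ScheduledExecutorService scheduler =
        Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "heartbeat-timer");
            t.setDaemon(true); // do not keep the worker JVM alive just for heartbeats
            return t;
        });
    private final AtomicReference<RuntimeException> firstFailure = new AtomicReference<>();

    TimedHeartbeater(Runnable heartbeat, Duration interval) {
        scheduler.scheduleAtFixedRate(() -> {
            try {
                heartbeat.run();
            } catch (RuntimeException e) {
                // An uncaught exception would silently stop the schedule, so record it instead.
                firstFailure.compareAndSet(null, e);
            }
        }, 0, interval.toMillis(), TimeUnit.MILLISECONDS);
    }

    /** First exception thrown by the heartbeat callback, or null if there was none. */
    RuntimeException failure() {
        return firstFailure.get();
    }

    @Override
    public void close() {
        scheduler.shutdownNow();
    }

    /** Runs a blocking step (for example the DB commit) with heartbeats in the background. */
    static void runWithHeartbeat(Runnable heartbeat, Duration interval, Runnable blockingStep) {
        try (TimedHeartbeater ignored = new TimedHeartbeater(heartbeat, interval)) {
            blockingStep.run();
        }
    }
}
```

The interval would need to be comfortably shorter than the activity's heartbeat timeout. Note that a timer like this only proves the worker process is alive; it does not show that the commit itself is making progress.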

I am new to Temporal, but I have worked my way through the documentation, some of the samples, and courses and apologize if I am missing something basic.

maxim · November 12, 2024, 12:57am

There are a couple of approaches to help here.

1. You can route activities to hosts or even specific processes. So you can ensure that the activities access the same in-memory cache if needed.
2. You can implement your own “cancel” activity that would be invoked at the same host. This way heartbeating wouldn’t be required to deliver cancellation. It still will be required to detect worker process failure.

jpalof · November 12, 2024, 11:52pm

Thank you for the suggested approaches! I have a few follow-up questions on the first approach and am still digesting the second.

“… route activities to hosts or even specific processes.”

1. To confirm my understanding of this approach, the key is to have both 2 & 3 use the same worker. In an ideal world, this would happen due to the sticky worker preference of Temporal. But given timeouts, worker eviction, etc., a sticky worker can’t be assumed and thus needs to be guaranteed through task queues.
2. Where I am confused is determinism. If activity 2 (in-memory) completes, doesn’t it need to reflect the in-memory changes into the event history so that the workflow is deterministic if activity 3 fails? Is there a means to make sure that 2 would be re-run if 3 fails?

Thanks

maxim · November 13, 2024, 12:09am

1. Routing activities to specific processes always uses process-specific task queues. See the fileprocessing sample: https://github.com/temporalio/samples-java/tree/main/core/src/main/java/io/temporal/samples/fileprocessing
2. Activities don’t need to be deterministic, only workflow code does. In the case of caching, if the process holding the cache fails, the whole sequence of activities needs to be re-executed on a different process. This is also shown in the fileprocessing sample.

jpalof · November 15, 2024, 3:13pm

Regarding the second approach

To confirm my understanding of the approach, with the assumption of routing to the same host, I would add a “cancel” Temporal activity to my workflow. This activity would set a flag that the other activities would inspect, at key points in their logic, to abort processing. If I have that right, I have two more follow-up questions:

1. What exception should my activities throw when responding to the cancel to inform the workflow properly? Would it be ActivityCompletionException?
2. I am thinking that I will use a signal on the workflow to call the “cancel” activity, since my workflow implementation will already have the activity stub available to directly call it.
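The flag described in this post could be sketched roughly as follows, assuming the "cancel" activity and the processing activity run in the same worker process. The class `CancelFlags` and the `jobId` key (for example the workflow ID) are hypothetical names, not part of Temporal.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical process-wide registry of cancel requests, keyed by a job identifier. */
final class CancelFlags {
    // Thread-safe set: the "cancel" activity and the processing activity run on different threads.
    private static final Set<String> REQUESTED = ConcurrentHashMap.newKeySet();

    private CancelFlags() {}

    /** Called by the "cancel" activity. */
    static void request(String jobId) {
        REQUESTED.add(jobId);
    }

    /** Polled by the processing activity at key points in its logic. */
    static boolean isRequested(String jobId) {
        return REQUESTED.contains(jobId);
    }

    /** Called when the job ends, so stale flags do not accumulate in the process. */
    static void clear(String jobId) {
        REQUESTED.remove(jobId);
    }
}
```

Because the registry lives in process memory, it only works when the "cancel" activity is routed to the same process as the activity it is meant to stop.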

maxim · November 15, 2024, 4:25pm

1. As this is not a built-in Temporal cancellation, use an ApplicationFailure or complete the activity successfully and add this information to the result.
2. Yes, this makes sense.
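The second option (completing the activity successfully and reporting the cancellation in its result) might look like the sketch below. All names here (`BatchStatus`, `BatchOutcome`, `BatchStep`) are hypothetical, the upper-casing is only a placeholder for the real in-memory update, and the `cancelRequested` supplier stands in for whatever flag the "cancel" activity sets.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.function.BooleanSupplier;
import java.util.function.Consumer;

enum BatchStatus { COMMITTED, CANCELED }

/** Hypothetical activity result that tells the workflow how the step ended. */
final class BatchOutcome {
    final BatchStatus status;
    final int itemsProcessed;

    BatchOutcome(BatchStatus status, int itemsProcessed) {
        this.status = status;
        this.itemsProcessed = itemsProcessed;
    }
}

final class BatchStep {
    private BatchStep() {}

    /** Updates items in memory, checking the cancel flag per item; commits only if all succeed. */
    static BatchOutcome updateAndCommit(
            List<String> items, BooleanSupplier cancelRequested, Consumer<List<String>> commit) {
        List<String> staged = new ArrayList<>();
        for (String item : items) {
            if (cancelRequested.getAsBoolean()) {
                // Staged changes are simply dropped; nothing has been committed yet.
                return new BatchOutcome(BatchStatus.CANCELED, staged.size());
            }
            staged.add(item.toUpperCase(Locale.ROOT)); // placeholder for the real update
        }
        commit.accept(staged);
        return new BatchOutcome(BatchStatus.COMMITTED, staged.size());
    }
}
```

The workflow can then branch on `status` and skip the report or notification phases as appropriate, without treating the cancellation as an activity failure.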

jpalof · November 15, 2024, 5:11pm

My apologies for the continued questions… Using ApplicationFailure surfaces this as a Failed workflow rather than a Canceled workflow. I would like the workflow to be reflected as canceled. Is that possible?

maxim · November 15, 2024, 7:01pm

I thought you were asking about the exception thrown from an activity, not the workflow.
