I triggered a long-running activity that processes, say, 10 records. If the 3rd record takes too long and the heartbeat times out, the whole activity is retried from the 1st record, which results in duplicate processing. From the YouTube video, I understand this is the expected behaviour. However, it also results in duplicate downstream activities: instead of launching 10 child workflows for 10 records, it launches 3 (first attempt) + 10 (on retry) = 13 child workflows. How can I avoid this and launch only as many child workflows as there are records (10 in this example)?
There are two separate issues here:
Activity Losing Progress After Heartbeat Timeout
You can pass the id of the last processed record as the heartbeat details. The latest recorded value is available to the activity when it is retried, so the new attempt can resume from that record instead of starting over. See the heartbeatingactivity sample.
Note that heartbeat calls are throttled. It is OK to call heartbeat on every record, but the frequency of actual calls to the service depends on the heartbeat timeout. The most recent details may therefore not have been recorded when the attempt fails, so it is still possible to process the same record multiple times.
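The resume logic can be sketched without any SDK dependency. This is a minimal model, not Temporal code: `process_records`, `save_heartbeat` and `fail_on_third` are hypothetical names, and the simulation assumes every heartbeat is recorded, which throttling does not guarantee. In the Python SDK the two injected pieces would correspond roughly to `activity.heartbeat(...)` and `activity.info().heartbeat_details`.

```python
from typing import Callable, Sequence


def process_records(
    records: Sequence[str],
    heartbeat_details: Sequence[int],
    heartbeat: Callable[[int], None],
    handle: Callable[[str], None],
) -> int:
    """Process records, resuming after the last index reported via heartbeat.

    heartbeat_details holds the details left by the previous attempt
    (empty on the first attempt). Returns how many records this attempt handled.
    """
    # Resume one past the last index that was reported as completed.
    start = heartbeat_details[0] + 1 if heartbeat_details else 0
    handled = 0
    for index in range(start, len(records)):
        handle(records[index])
        heartbeat(index)  # report progress only after the record is done
        handled += 1
    return handled


# --- Simulated first attempt and retry (hypothetical stand-ins) ---
records = [f"record-{i}" for i in range(10)]
processed = []      # records that were actually handled
saved_details = []  # stands in for the details stored by the service


def save_heartbeat(index: int) -> None:
    saved_details[:] = [index]


def fail_on_third(record: str) -> None:
    # Simulate the 3rd record causing the attempt to fail.
    if record == "record-2":
        raise TimeoutError("simulated heartbeat timeout")
    processed.append(record)


try:
    process_records(records, [], save_heartbeat, fail_on_third)
except TimeoutError:
    pass  # the attempt failed; a retry would be scheduled

# The retry receives the last recorded details and skips finished records.
retried = process_records(records, list(saved_details), save_heartbeat, processed.append)
```

In this run the retry starts at the 3rd record, so each of the 10 records is handled once. If the last heartbeat had been throttled away, the retry would resume slightly earlier and repeat a record or two, which is why the workflow id approach below is still needed.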
Starting Duplicated Workflows
This should be solved by assigning a business-level id, for example one derived from the record id, as the WorkflowId. Temporal guarantees the uniqueness of running workflow executions by id. So if a duplicate start is issued for a WorkflowId that is already running, the call fails with the AlreadyStarted error, which the caller can treat as "already launched".
Also, familiarize yourself with WorkflowIdReusePolicy, which defines the start behavior when a previously started workflow with the same id has already completed.
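The effect of a business-level id can be illustrated with a small in-memory model. Everything here is hypothetical and only mimics the behavior described above: `FakeWorkflowService`, `AlreadyStartedError`, `child_workflow_id` and `start_once` are not Temporal APIs, and the fake service remembers every id forever, which corresponds to a reuse policy that rejects duplicates even after completion.

```python
class AlreadyStartedError(Exception):
    """Hypothetical stand-in for the SDK's already-started error."""


class FakeWorkflowService:
    """In-memory model: at most one workflow per id (assumption, see lead-in)."""

    def __init__(self) -> None:
        self.started = {}

    def start(self, workflow_id: str, record_id: int) -> None:
        if workflow_id in self.started:
            raise AlreadyStartedError(workflow_id)
        self.started[workflow_id] = record_id


def child_workflow_id(batch_id: str, record_id: int) -> str:
    # Deterministic id: the same record always maps to the same workflow id.
    return f"{batch_id}/record-{record_id}"


def start_once(service: FakeWorkflowService, batch_id: str, record_id: int) -> bool:
    """Start the child workflow; return False if it was already started."""
    try:
        service.start(child_workflow_id(batch_id, record_id), record_id)
        return True
    except AlreadyStartedError:
        return False  # a duplicate start is safe to ignore


service = FakeWorkflowService()

# First attempt launches 3 children before the activity fails.
first_attempt = [start_once(service, "batch-42", r) for r in range(3)]

# The retry walks all 10 records again; the first 3 starts are rejected.
retry = [start_once(service, "batch-42", r) for r in range(10)]
```

After both attempts the model holds 10 workflows rather than 13, because the three repeated starts map to ids that already exist.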