Temporal Activity vs LocalActivity behavior on Workflow replay

Hi Temporal team,

Would love to understand how activity and local-activity replays work on workflow re-execution.
Let us assume the following workflow execution sequence for the same workflow.

WE1 to WE4 - Workflow Executions 1 to 4
LA1, LA2, LA3 - Local Activities
A1, A2, A3 - Activities
S, F - Success, Failure

Activity

Let us assume the following sequence with A3 failing because it reached maxActivityRetry attempts.
WE1 - A1 (S) → A2 (S) → A3 (F)

Local Activity

Let us assume the following sequence with LA3 failing because it reached maxActivityRetry attempts.
WE3 - LA1 (S) → LA2 (S) → LA3 (F).

Questions

  1. In WE2 [Retry of WE1], I am assuming only A3 would get re-executed. Is my assumption correct? So the sequence would be
    WE2 → A1 [No Op] → A2 [No Op] → A3 [Retry]

  2. In WE4 [Retry of WE3], I am assuming only LA3 would get re-executed. Is my assumption correct?
    WE4 → LA1 [No Op] → LA2 [No Op] → LA3 [Retry]

  3. The event log in the Temporal UI shows different date/time stamps for A1 [Original execution] and A1 [No-op]. Is that intended?

  4. Would the same behavior apply for activities run in parallel as well?

Workflow retry, which re-executes the workflow from the beginning, is entirely unrelated to workflow replay, which is used to recover a workflow to its current state.

Why are you even talking about workflow retry? We don’t recommend retrying workflows in 99% of use cases. That’s why workflows, unlike activities, don’t have retry options by default.
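To make the replay vs. retry distinction concrete, here is a toy model in plain Python. It is only a sketch: `ToyWorkflowRun`, `run_step`, and `run_workflow` are hypothetical names invented for illustration, not Temporal SDK APIs, and the real event history is far richer than a dictionary. The idea it shows is that a replay reuses recorded results for steps that already completed, while a retry starts a new execution with empty history.

```python
# Toy model only: not the Temporal SDK and not how the server is implemented.
class ToyWorkflowRun:
    """Runs named steps and records each completed result in a history."""

    def __init__(self, history=None):
        # history maps step name -> recorded result of a completed step
        self.history = dict(history or {})
        self.executed = []  # steps whose function was actually invoked

    def run_step(self, name, fn):
        if name in self.history:
            # Replay path: return the recorded result, do not invoke fn.
            return self.history[name]
        result = fn()  # may raise; a failed step is not recorded
        self.executed.append(name)
        self.history[name] = result
        return result


calls = []  # every real invocation of a step function, across all runs


def make_step(name):
    def step():
        calls.append(name)
        return f"{name}-result"
    return step


def run_workflow(run, steps):
    return [run.run_step(name, fn) for name, fn in steps]


STEPS = [(n, make_step(n)) for n in ("A1", "A2", "A3")]

# Original pass: A1 and A2 complete, then the worker is lost before A3 runs.
first = ToyWorkflowRun()
run_workflow(first, STEPS[:2])

# Replay: same execution, recorded history is reused, only A3 is invoked.
replayed = ToyWorkflowRun(history=first.history)
run_workflow(replayed, STEPS)

# Retry: a new execution with empty history, so every step is invoked again.
retried = ToyWorkflowRun()
run_workflow(retried, STEPS)
```

In this model `replayed.executed` contains only A3, while `retried.executed` contains all three steps.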

@maxim: Apologies for mixing up the terminology. If maximumAttempts is exceeded on an activity, would the workflow be replayed?

No. It will fail. Why do you want to set maxAttempts?

Got it. I want to use maxAttempts to retry after a few minutes in case of transient failures. I was originally thinking of using the workflow retry to retry after a few minutes.

Here is my scenario

  1. There are 1000 items in a batch that I need to process. Each batch would start as a workflow, and the items would be processed inside a function that calls a low-latency (~100ms) internal system to process each item.
  2. There is no guarantee that all 1000 items in a batch will succeed, as there might be transient issues.
  3. I would like to reprocess only the failed items after a few minutes, as these are transient errors.
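The scenario above can be sketched in plain Python as a loop that keeps track of failed items and re-runs only those after a delay. This is a rough illustration, not Temporal code: `process_batch`, `process_item`, `max_rounds`, and `delay_seconds` are hypothetical names, and inside a real workflow the wait would have to go through the SDK's durable timer rather than a blocking sleep.

```python
import time


def process_batch(items, process_item, max_rounds=3, delay_seconds=180.0,
                  sleep=time.sleep):
    """Process items, then re-run only the failed ones, up to max_rounds passes.

    Returns (succeeded, still_failed). `sleep` is injectable so the wait can
    be replaced (for example in tests).
    """
    succeeded = []
    pending = list(items)
    for round_number in range(1, max_rounds + 1):
        failed = []
        for item in pending:
            try:
                process_item(item)
                succeeded.append(item)
            except Exception:
                # Assumed transient: keep the item for the next round.
                failed.append(item)
        pending = failed
        if not pending:
            break
        if round_number < max_rounds:
            sleep(delay_seconds)  # wait a few minutes before the next round
    return succeeded, pending


# Usage: items 3 and 7 fail once, then succeed on the second round.
attempts = {}


def flaky(item):
    attempts[item] = attempts.get(item, 0) + 1
    if item in (3, 7) and attempts[item] == 1:
        raise RuntimeError("transient failure")


waits = []  # record requested delays instead of really sleeping
ok, still_failed = process_batch(range(10), flaky, sleep=waits.append)
```

Successful items are attempted exactly once here; only the two failed items are attempted a second time.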

There are two options that I can choose from.

  1. Using regular activities
    - This causes heavy load on the system, so we want to avoid them if local activities work.
  2. Using local activities
    - I would love to understand the behavior better before using them. With local activities, how do I make sure that 1) all items are processed successfully and reliably, and 2) successful items are not processed again?

Thanks again!
Kumar

Is processing of an item idempotent? In some failure cases all local activities are re-executed, even the completed ones.
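As a rough illustration of what idempotent processing could look like, the sketch below keys each item by an ID and skips the side effect when that ID has already been handled. `IdempotentProcessor` and its in-memory dictionary are hypothetical; in a real system the record of processed IDs would need to live in a durable store, ideally updated together with the side effect itself.

```python
class IdempotentProcessor:
    """Applies a side effect at most once per item ID (in-memory sketch)."""

    def __init__(self, side_effect):
        self._side_effect = side_effect
        self._results = {}  # item_id -> result; stand-in for a durable store

    def process(self, item_id, payload):
        if item_id in self._results:
            # Already processed: return the stored result, no second side effect.
            return self._results[item_id]
        result = self._side_effect(payload)
        self._results[item_id] = result
        return result


# Usage: processing the same item twice triggers the side effect only once.
side_effect_calls = []


def charge(payload):
    side_effect_calls.append(payload)
    return f"done:{payload}"


processor = IdempotentProcessor(charge)
first_result = processor.process("item-1", "payload-1")
second_result = processor.process("item-1", "payload-1")  # re-execution
```

With processing like this, a re-executed local activity for an already completed item becomes a harmless no-op.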