What is missing in my understanding of determinism and event History of a Workflow receiving Signals

I have this understanding:

Temporal workflows must be deterministic, meaning that on replay, they should generate the exact same event history as the original execution.

However, I am confused about how the event history is structured when a workflow receives a signal.

  • Suppose, during the first execution of a workflow, a signal is received while the workflow is processing activity-1.
  • The machine crashes, and the workflow replay happens.
  • Based on my understanding of determinism, the signal should arrive at the exact same time during replay for it to be placed in the same position in the event history. If the signal comes at a different time, I would assume from the above that it causes non-determinism, since the event history might not match the original execution.

However, I have tested this in a PoC, and I don’t get a Non-Deterministic Execution (NDE) error because of this. So what am I missing in my understanding?

Also, I have observed that I don’t need to send the signal again during the second execution (after replay). Why is that? Is Temporal storing and replaying signals as part of the event history to maintain determinism? If yes, how? In what order?

Would love to get some clarification on how signal processing works within Temporal’s deterministic execution model.

Based on my understanding of determinism, the signal should arrive at the exact same time during replay for it to be placed in the same position in the event history.

Your understanding is based on the wrong premise that the history is recreated on replay. Replay does not create any history. Signals are first recorded into the event history on the service side, and only then is the updated history delivered to the workflow, even when the new signal events are being processed for the first time. This way, from the SDK point of view, the original processing and replay are the same. The only difference is that the commands generated during replay are ignored (after being checked against the commands recorded in the history for determinism).
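To make the "checked, then ignored" part concrete, here is a toy Python model. It does not use the Temporal SDK; `workflow_commands`, `replay`, and `NonDeterminismError` are hypothetical names invented for this sketch, and real SDKs compare commands in a more fine-grained way.

```python
class NonDeterminismError(Exception):
    """Raised in this toy model when replay produces different commands."""


def workflow_commands(events):
    # Hypothetical workflow logic: schedule one activity per signal,
    # strictly in the order the signals appear in the recorded history.
    return [
        f"ScheduleActivity:{payload}"
        for kind, payload in events
        if kind == "WorkflowExecutionSignaled"
    ]


def replay(events, recorded_commands):
    # Replay feeds the same recorded events to the same code.
    produced = workflow_commands(events)
    if produced != recorded_commands:
        raise NonDeterminismError(
            f"expected {recorded_commands}, got {produced}"
        )
    # The commands match, so they are dropped instead of being sent again.
    return produced


events = [
    ("WorkflowExecutionStarted", None),
    ("WorkflowExecutionSignaled", "order-1"),
]
recorded = workflow_commands(events)  # first-time processing
replayed = replay(events, recorded)   # replay after a crash
```

The signal is never re-sent by the client in this model: it is just another entry in `events`, so the code sees it at the same point on every run.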

Let’s look at an example. A workflow that waits for a signal at the beginning is going to have the following history if a signal has not been received yet.

1. WorkflowExecutionStarted
2. WorkflowTaskScheduled
3. WorkflowTaskStarted
4. WorkflowTaskCompleted

This history indicates that the workflow has started and didn’t produce any new commands as it is blocked waiting for a signal.

Then a signal is sent. The service appends the following events to the history:

5. WorkflowExecutionSignaled
6. WorkflowTaskScheduled - a workflow task is created to process the new signaled event.
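The service-side ordering can be sketched with a small Python stand-in. `ToyService` and its methods are assumptions made for illustration, not a Temporal API; the point is only that the signal gets its event ID when the service appends it, before any worker is involved.

```python
from dataclasses import dataclass, field


@dataclass
class ToyService:
    """Hypothetical stand-in for the service that owns the event history."""

    history: list = field(default_factory=list)

    def append(self, event_type):
        # Event IDs are assigned by the service, in append order.
        event_id = len(self.history) + 1
        self.history.append((event_id, event_type))
        return event_id

    def signal(self):
        # Record the signal first, then schedule a task to process it.
        signaled_id = self.append("WorkflowExecutionSignaled")
        self.append("WorkflowTaskScheduled")
        return signaled_id


service = ToyService()
for event_type in (
    "WorkflowExecutionStarted",
    "WorkflowTaskScheduled",
    "WorkflowTaskStarted",
    "WorkflowTaskCompleted",
):
    service.append(event_type)

signal_id = service.signal()
```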

Then, when a workflow worker picks up the workflow task, it will see these three new events:

5. WorkflowExecutionSignaled
6. WorkflowTaskScheduled
7. WorkflowTaskStarted

Then it will take all new events from the history (in this case, only the signaled event) and apply them to the workflow code. The signal handler will fire and might produce new commands, and the workflow will continue.
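A rough Python sketch of that step, under the simplifying assumption that workflow task bookkeeping events are not handed to the workflow code. The helper names and the `TASK_EVENTS` set are hypothetical and only model the example history above.

```python
history = [
    (1, "WorkflowExecutionStarted"),
    (2, "WorkflowTaskScheduled"),
    (3, "WorkflowTaskStarted"),
    (4, "WorkflowTaskCompleted"),
    (5, "WorkflowExecutionSignaled"),
    (6, "WorkflowTaskScheduled"),
    (7, "WorkflowTaskStarted"),
]

# Assumption for this toy model: these events describe the workflow task
# itself and are not applied to the workflow code.
TASK_EVENTS = {
    "WorkflowTaskScheduled",
    "WorkflowTaskStarted",
    "WorkflowTaskCompleted",
}


def new_events(history, last_processed_id):
    # Everything recorded after the last event the worker already handled.
    return [event for event in history if event[0] > last_processed_id]


def events_for_workflow_code(events):
    # Keep only the events the workflow code reacts to.
    return [event for event in events if event[1] not in TASK_EVENTS]


fresh = new_events(history, last_processed_id=4)
to_apply = events_for_workflow_code(fresh)
```

Here `fresh` holds events 5 to 7, and `to_apply` contains only the signaled event, which is what triggers the signal handler.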

Imagine that the worker crashes while processing the workflow task started at (7). Another worker will then see the following history when it gets the workflow task that was generated as a retry:

1. WorkflowExecutionStarted
2. WorkflowTaskScheduled
3. WorkflowTaskStarted
4. WorkflowTaskCompleted
5. WorkflowExecutionSignaled
6. WorkflowTaskScheduled
7. WorkflowTaskStarted
8. WorkflowTaskTimedOut // Due to the worker crash, the task fails with WorkflowTaskTimeout
9. WorkflowTaskScheduled // schedule workflow task retry
10. WorkflowTaskStarted // The retried task is picked up by a new worker.

Note that when this history is replayed, the WorkflowExecutionSignaled event is at exactly the same position (5) as when it was processed for the first time. So there is no problem with determinism.
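The same observation as a small Python check. This is only a model of the two histories listed in this answer (event types as plain strings, with a hypothetical `event_id` helper): the retry appends events after the crash point and never rewrites what came before.

```python
first_attempt = [
    "WorkflowExecutionStarted",
    "WorkflowTaskScheduled",
    "WorkflowTaskStarted",
    "WorkflowTaskCompleted",
    "WorkflowExecutionSignaled",
    "WorkflowTaskScheduled",
    "WorkflowTaskStarted",
]

# Events appended because the worker crashed and the task was retried.
retry_events = [
    "WorkflowTaskTimedOut",
    "WorkflowTaskScheduled",
    "WorkflowTaskStarted",
]

after_crash = first_attempt + retry_events


def event_id(history, event_type):
    # Event IDs are 1-based positions; returns the first match.
    return history.index(event_type) + 1


# The history is append-only, so the original events form an unchanged prefix.
prefix_unchanged = after_crash[: len(first_attempt)] == first_attempt
signal_id_before = event_id(first_attempt, "WorkflowExecutionSignaled")
signal_id_after = event_id(after_crash, "WorkflowExecutionSignaled")
```

Both `signal_id_before` and `signal_id_after` are 5, regardless of when the replay happens in wall-clock time.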
