Questions
- What is the technical constraint in Temporal that prevents workflows from containing non-deterministic code?
- How does wrapping non-deterministic code in an activity and invoking that activity from the workflow overcome the constraint?
For example, the original execution of the workflow can execute activity A. But during replay, it might execute activity B which is not going to match the execution history.
// Non-deterministic: the branch taken can differ between the original run and a replay.
if (random() > 0.5) {
    ExecuteActivity("A");
} else {
    ExecuteActivity("B");
}
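To make the mismatch concrete, here is a minimal Python sketch of the idea. It is a toy model and not the Temporal SDK: `ToyContext`, `execute_activity`, and `NonDeterminismError` are hypothetical names invented for this illustration, and the random value is passed in as a parameter so the demo is repeatable. The first run records each scheduled activity; a replay checks every scheduled activity against that record.

```python
class NonDeterminismError(Exception):
    """Raised when replayed code schedules something the history does not contain."""


class ToyContext:
    """Toy stand-in for a workflow context (hypothetical, not the Temporal API)."""

    def __init__(self, recorded=None):
        self.recorded = list(recorded or [])  # activity names from earlier runs
        self.position = 0                     # next history entry to check

    def execute_activity(self, name):
        if self.position < len(self.recorded):
            # Replay: the code must schedule exactly what was recorded.
            expected = self.recorded[self.position]
            if expected != name:
                raise NonDeterminismError(
                    f"history has {expected!r}, but code scheduled {name!r}"
                )
        else:
            # New progress: record the scheduled activity.
            self.recorded.append(name)
        self.position += 1


def workflow(ctx, random_value):
    # random_value stands in for random(); it is NOT stored in the history.
    if random_value > 0.5:
        ctx.execute_activity("A")
    else:
        ctx.execute_activity("B")


first = ToyContext()
workflow(first, 0.9)  # original execution schedules "A"

replay = ToyContext(recorded=first.recorded)
try:
    workflow(replay, 0.1)  # replay takes the other branch and schedules "B"
    mismatch = False
except NonDeterminismError:
    mismatch = True
```

In this sketch the replay fails because the history says "A" while the code asks for "B". If the random value were itself produced by a recorded step, the replay would see the same value and take the same branch.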
Thanks Maxim.
In the video below, it says “In case of a failure, Temporal resumes the workflow to the exact line it was running before the crash.”
Isn’t what she said more conceptual than practical?
Temporal doesn’t really store the line number where it crashed and resume directly from the point of failure, does it?
Instead, as you said, it re-executes everything (pulling the recorded responses of the Temporal API calls)?
Is it right to say that what is said in this video is conceptually correct, but that’s not really how it happens behind the scenes?
It’s conceptually correct, yes. Event history replay brings your workflow state to the exact point where the failure happened and can continue its execution. From the developer point of view, it can be considered “the exact line of code before the crash”, in most cases.
For example, if you have a for loop that just adds 10 numbers, and your service crashes during the for loop, then during replay that for loop will be re-executed from the beginning.
If instead the loop body performs operations that write to the workflow event history, such as invoking activities or workflow.sleep, then yes, after a crash inside the loop the workflow will be restored to the exact point where it stopped.
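The difference between the two loops can be sketched with a small Python simulation. This is a toy model under stated assumptions, not the Temporal SDK: `ToyContext`, `execute_activity`, `WorkerCrash`, and `square` are hypothetical names. The workflow code always restarts from the top, but results already in the history are returned without running the activity again.

```python
class WorkerCrash(Exception):
    """Simulates the worker process dying mid-workflow."""


class ToyContext:
    """Toy replay model (hypothetical, not the Temporal API)."""

    def __init__(self, history=None):
        self.history = list(history or [])  # recorded activity results
        self.position = 0                   # next history entry to consume
        self.executed = []                  # inputs of activities actually run now

    def execute_activity(self, fn, arg):
        if self.position < len(self.history):
            # Replay: reuse the recorded result, do not run the activity.
            result = self.history[self.position]
        else:
            # New progress: run the activity and record its result.
            result = fn(arg)
            self.history.append(result)
            self.executed.append(arg)
        self.position += 1
        return result


def square(n):
    return n * n


def workflow(ctx, crash_at=None):
    total = 0
    for i in range(5):
        if i == crash_at:
            raise WorkerCrash(f"simulated crash before iteration {i}")
        total += ctx.execute_activity(square, i)
    return total


first = ToyContext()
try:
    workflow(first, crash_at=3)  # iterations 0, 1, 2 complete, then the crash
except WorkerCrash:
    pass

second = ToyContext(history=first.history)
result = workflow(second)  # loop restarts at i = 0, but 0..2 come from history
```

Here the second attempt only runs the activity for iterations 3 and 4. From the developer's point of view the loop "resumed" at iteration 3, even though the code was re-run from the first line.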
The cool thing is that resuming works across programming languages, as we showed in our recent workshop demo - temporal-java-workshop/README.md at main · tsurdilo/temporal-java-workshop · GitHub. In the demo we showed how a workflow that was started in PHP and failed during a for loop execution can be resumed by a Java workflow. Hope this helps.
Thanks @tihomir.
Side Question
When we invoke a query method on a completed workflow, are the same steps followed?
Is the event history replayed to reach the completed state, and then the query method executed?
Yes, the event history is replayed, setting the workflow state to the exact point when the query is evaluated. In case of closed workflows, it’s the entire workflow history.
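As a rough illustration of "replay, then evaluate the query", here is a small Python sketch. It is a toy model, not the Temporal SDK: `PaymentWorkflow`, `query_steps_done`, `replay_then_query`, and `HistoryExhausted` are hypothetical names, and the history is reduced to a plain list of recorded activity results.

```python
class HistoryExhausted(Exception):
    """Signals that replay has consumed every recorded result."""


class PaymentWorkflow:
    """Hypothetical workflow whose in-memory state is rebuilt purely by replay."""

    def __init__(self):
        self.steps_done = []

    def run(self, execute_activity):
        for name in ("check_balance", "debit"):
            execute_activity(name)
            self.steps_done.append(name)

    def query_steps_done(self):
        return list(self.steps_done)


def replay_then_query(history):
    """Rebuild workflow state from recorded results, then evaluate the query."""
    recorded = iter(history)

    def execute_activity(name):
        try:
            return next(recorded)
        except StopIteration:
            # Replay has caught up with the history; state is now current.
            raise HistoryExhausted(name) from None

    wf = PaymentWorkflow()
    try:
        wf.run(execute_activity)
    except HistoryExhausted:
        pass
    return wf.query_steps_done()


# Still-running workflow: only the first activity result is in the history.
open_result = replay_then_query(["balance=100"])
# Closed workflow: the entire history is replayed before the query runs.
closed_result = replay_then_query(["balance=100", "debited"])
```

For the closed workflow the whole history is consumed, so the query sees the final state.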
Hi @tihomir, thanks for the clarification. Being a newbie in the Temporal world, I have a similar question on determinism. Say I have the activities below,
Now, the “balance check” activity invokes some third-party API to fetch the balance, and my workflow crashes during the “debit amount” step. Since the “balance check” step has already completed, will it be executed again during the default retry operation? If not, how can I ensure that both the “balance check” and “debit amount” steps get executed during a retry?
Also, what is meant by “operations which write to the workflow event history”? Does it mean that only the 2 types of invocations below write to the event history,
I think there may be some confusion between workflow replay and retries.
For your use case, if the check balance and debit activities need to be retried together on any failure of debit, you could use Workflow.retry
(sample here if it helps). Just note that each retry adds to the event history, so it’s not meant for a very large number of retries (the event history has a 50K event limit, and it is recommended not to go over 10K).
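The "retry both steps as a unit" pattern can be sketched generically in Python. This is not the Workflow.retry API itself, only an illustration of the idea: `retry_together`, `check_balance`, `debit`, and `check_then_debit` are hypothetical names, and the first debit attempt is made to fail artificially.

```python
def retry_together(block, max_attempts):
    """Re-run the whole block until it succeeds or attempts run out."""
    last_error = None
    for _ in range(max_attempts):
        try:
            return block()
        except Exception as err:  # a real policy would filter retryable errors
            last_error = err
    raise last_error


calls = []
debit_should_fail = [True, False]  # first debit attempt fails, second succeeds


def check_balance():
    calls.append("check_balance")
    return 100


def debit(amount):
    calls.append("debit")
    if debit_should_fail.pop(0):
        raise RuntimeError("simulated debit failure")
    return f"debited {amount}"


def check_then_debit():
    # Both steps live inside the retried block, so a debit failure
    # causes the balance check to run again as well.
    check_balance()
    return debit(40)


outcome = retry_together(check_then_debit, max_attempts=3)
```

In this sketch the balance check runs twice because it sits inside the retried block. With only the debit activity's own retry policy, the completed balance check would not be run again.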
Alternatively, you could put the two activities in a child workflow and, on failure of the debit activity, call continue-as-new on the child workflow to execute it again. A child can continue-as-new as many times as needed; to the parent it will look like a single child execution.
what is meant by “operations which write to the workflow event history”?
I would suggest watching the videos here and here for more info, if it helps.
Thanks for the reply and references. I will go through them and get back if needed.
Following on from this example.
Assume check balance is a time-sensitive operation, that it is wrapped inside an activity, and that the worker crashes after executing 2.check balance (a non-deterministic activity).
Now, if the worker recovers a day later and the balance has changed (say it has gone into overdraft, so the subsequent debit should be rejected), the workflow replay feature will use the old result (from the previous day in my example) and issue a debit, which is not the desired result.
What is the most idiomatic way to tackle this scenario? In summary, what is the guidance when activity execution results are time-sensitive?
The GetBalance, followed by the Debit operation, always contains a race condition. The balance could have changed between these operations. Temporal ensures that both will be called in the presence of various infra failures, potentially increasing the interval between them and making the race condition more probable.
There is no real solution to this problem. You can call GetBalance again if the result is stale (by recording the time it was returned), but that only reduces the race condition window; it does not eliminate it.
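A minimal Python sketch of the staleness check follows. The names `BalanceReading`, `fresh_balance`, and `max_age_seconds` are assumptions made for this illustration, and the clock value and the re-fetch are passed in as plain arguments. In a real workflow the current time would have to come from the SDK's deterministic time source and the re-fetch would be another activity call.

```python
from dataclasses import dataclass


@dataclass
class BalanceReading:
    amount: int
    fetched_at: float  # seconds, on the same clock used for the staleness check


def fresh_balance(reading, now, max_age_seconds, fetch_balance):
    """Return the reading if it is recent enough, otherwise fetch a new one."""
    if now - reading.fetched_at > max_age_seconds:
        return BalanceReading(amount=fetch_balance(), fetched_at=now)
    return reading


ONE_DAY = 24 * 60 * 60

# Reading recorded before the crash; the worker recovers a day later.
old = BalanceReading(amount=100, fetched_at=0.0)
after_recovery = fresh_balance(
    old, now=float(ONE_DAY), max_age_seconds=60, fetch_balance=lambda: -20
)

# A reading that is only 5 seconds old is reused as-is.
recent = fresh_balance(old, now=5.0, max_age_seconds=60, fetch_balance=lambda: -20)
```

The re-check narrows the window between reading the balance and debiting, but the balance can still change after the fresh read, so the race remains.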