I understand workflow threads operate using cooperative multithreading, so the thread is not pre-empted until it yields control. I also understand that await conditions are re-evaluated after a thread makes progress, but I am curious about the following potential race condition.
Let’s say update1 is waiting on a condition. The workflow thread sets the condition, which will break update1’s wait. Meanwhile, another update, update2, has come in and is waiting to execute. Is it guaranteed that update1 will be executed before update2, since update1 was unblocked by the progress of the workflow thread?
Steps 1 through 4 are a single workflow task, and you should move 6 to 4.5 there and phrase it as “update1() await condition is reevaluated (like all wait conditions every event loop tick) so update1()’s coroutine is immediately completed in that same task”.
Your code would have update1 completed in the same task it was execute-update-with-start’ed in. But yes, depending on the situation, it is possible for a var to get set to one thing and then set to another before a wait condition is evaluated, if that’s what you are asking.
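To make the "set to one thing and then another" case concrete, here is a minimal sketch in plain Java. It is a toy model, not the Temporal SDK: the class name, the `status` field, and the `waiterSawReady` helper are all made up for illustration, and the only assumption it encodes is that a wait condition is checked after the running code yields, not between individual statements.

```java
import java.util.function.BooleanSupplier;

// Toy model (hypothetical names, not Temporal SDK code).
final class MissedValueSketch {
    static String status = "INIT";

    // Returns whether a waiter for status == "READY" would observe it, assuming
    // conditions are only checked once the running code has yielded.
    static boolean waiterSawReady() {
        status = "INIT";
        BooleanSupplier waitCondition = () -> status.equals("READY");

        // One uninterrupted stretch of workflow code: no yield between the writes.
        status = "READY";
        status = "DONE";

        // Only now, after the code would have yielded, is the condition evaluated.
        return waitCondition.getAsBoolean();
    }

    public static void main(String[] args) {
        System.out.println(waiterSawReady()); // prints "false"
    }
}
```

If the waiter needs to react to every intermediate value, one option is to wait on something that does not get overwritten, such as a counter or a queue, instead of a single field.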
Ok so if I understand correctly, in this case step 6 (the await condition re-evaluation) will always happen after step 4, since step 4 is going to finish the workflow task. This makes sense and I think this was answered to me in a previous question I asked.
You are also saying the update1() coroutine will immediately be completed, but we are using Java, so my understanding is that it has a different thread for each update/signal/workflow, which it will schedule once a thread makes progress and yields control. Will it be the same in Java? After the await is re-evaluated, is this update thread guaranteed to be scheduled next, even if a signal comes in?
I guess more generally, let’s say there are multiple update threads which are awaiting a condition and other threads which are not. The workflow thread updates that condition which would cause the awaits to break. Are the threads awaiting the condition guaranteed to be scheduled before the threads which are not awaiting the condition? And if so, is there a guaranteed ordering among those which are awaiting?
Yes, this should be the same for Java. There is nothing that “comes in” while the workflow task is being processed, though technically the server may reject certain workflow completion situations if SDK tries to complete a task with a workflow completion and something “comes in” server-side while task was processing.
A workflow task for the most part is a CPU bound quick set of work to run the event loop until everything is waiting on external stimulus and no wait conditions evaluate to true.
While the ordering is guaranteed and deterministic, it’s not necessarily predictable (usually coroutines are processed in awaited order). But wait conditions are evaluated on each event loop after all coroutines are processed, and therefore we will not complete the workflow task if any wait conditions are satisfied. So it will always be satisfied after the field is set in that same workflow task (even if it may have also been evaluated before the field was set and was therefore false).
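As a rough illustration of that rule, here is a toy event loop in plain Java. It is not how the SDK is implemented; the `Waiter` class, the `ready` flag, and `runTask` are hypothetical names, and the order in which this toy runs update2 relative to update1 is an artifact of the sketch, not a guarantee. The one thing it is meant to show is that the task does not complete while a wait condition evaluates to true.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.BooleanSupplier;

// Toy model (hypothetical names, not Temporal SDK code).
final class WaitConditionLoopSketch {
    // A blocked coroutine: a wait condition plus the code to run once it holds.
    static final class Waiter {
        final BooleanSupplier condition;
        final Runnable continuation;
        Waiter(BooleanSupplier condition, Runnable continuation) {
            this.condition = condition;
            this.continuation = continuation;
        }
    }

    static boolean ready = false;

    // Runs one toy "workflow task" and returns the order in which steps executed.
    static List<String> runTask() {
        ready = false;
        List<String> log = new ArrayList<>();
        List<Waiter> waiters = new ArrayList<>();
        List<Runnable> runnable = new ArrayList<>();

        // update1 is already blocked on the condition.
        waiters.add(new Waiter(() -> ready, () -> log.add("update1 resumed")));
        // The main workflow coroutine sets the condition.
        runnable.add(() -> { ready = true; log.add("workflow set ready"); });
        // update2 does not wait on anything.
        runnable.add(() -> log.add("update2 ran"));

        // Keep looping until nothing is runnable and no wait condition holds.
        while (!runnable.isEmpty()) {
            List<Runnable> batch = new ArrayList<>(runnable);
            runnable.clear();
            batch.forEach(Runnable::run);
            // After the coroutines have run, re-evaluate every wait condition.
            Iterator<Waiter> it = waiters.iterator();
            while (it.hasNext()) {
                Waiter w = it.next();
                if (w.condition.getAsBoolean()) {
                    it.remove();
                    runnable.add(w.continuation);
                }
            }
        }
        log.add("task complete");
        return log;
    }

    public static void main(String[] args) {
        System.out.println(runTask());
    }
}
```

In this sketch "update1 resumed" always appears after "workflow set ready" and before "task complete", which is the part that matches the description above.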
Hey thanks Chad, I think this makes sense. My understanding was that the SDK could receive an incoming signal/update and create a thread for the execution while the workflow task was executing, it just wouldn’t be executed until the next event loop tick. Maybe my understanding of how the signals get received by the server and made available to the workers is incorrect. If a signal arrives at temporal server in the middle of a workflow task or activity, is it only made available on the queue once the task/activity is completed?
While I am curious about the thread scheduling, the only thing I really needed to know was that if we had an update awaiting a condition and the workflow task changed that condition, would that be guaranteed to be run next before any signal/update that is not currently awaiting that condition.
This is not correct. The task completes and then the signal comes on a successive task (except for instances where the task completion has a workflow completion, which will be rejected by server, causing the task to be retried with the signal this time at least at first, rejecting future signals from user side if they keep coming).
This statement is mixing up wholly unrelated activity tasks with workflow tasks. Signals come in workflow tasks (as do activity completions, timer firings, etc), it makes no difference whether activities are running.
Only if the next signal/update is in a successive workflow task. If multiple signals/updates are being processed in the same task, we may not explicitly guarantee order of wait condition vs other coroutine scheduling within a task.
But you should not author your workflow as if it matters whether an update arrives on a successive task or the same task. In your original code, technically a workflow can receive both update1 and update2 updates before workflow is called.
Thanks for this information Chad. I’m still not fully understanding how/when signals can be received by server and worker. Is there any documentation or learning you know of that goes in depth on this topic?
From the latter link, the video for the Java course section “Signals in Your Event History” may be particularly enlightening: https://www.youtube.com/watch?v=k7SVLvAsP-Y (it is about workflows passing signals to each other, but it still helps with understanding signal eventing and workflow tasks towards the end of the video, where it talks about the recipient workflow).
I am not sure there are any low-level docs about how signals and workflow tasks interact (signals are not much different than any other event).
Is a signal always accompanied by a workflow task when looking at the history? And there can be multiple signals with a single workflow task that will execute all of the signals?
Ok I think I’m understanding where I was confused. Let me make sure.
A signal can come to temporal server at any time during a workflow execution, but it only gets picked up by the worker during the first step of the event loop tick. So if a signal arrives at the temporal server after ‘Workflow Task Started’ and before ‘Workflow Task Complete’, it only shows up in the worker (and in the history) once the workflow task completes and the event loop goes back to step 1 (Process Signals and Updates until blocked).
Now at step 1 of the event loop the worker receives any pending signals/updates from the server and processes them first as part of a workflow task. As a part of the same workflow task, it will also try and progress the workflow (step 2 of event loop).
After any of these threads make progress, all await conditions will be re-evaluated. If re-evaluation causes any of these threads to unblock, they will be progressed immediately. Even if it is a signal that is unblocked, it will not wait until step 1 of the event loop to process it since it is an existing signal. It will immediately continue the coroutine until it is blocked again.
Mostly correct, though wouldn’t think of it as “picked up during first step of the event loop tick”. Rather, the event loop only runs as the result of a task from the server, and it runs until all yielded (including wait conditions).
Ok understood. The event loop step doesn’t actually pull tasks from the server; it gets invoked as a result of tasks from the server.
I just want to be doubly sure on 2 things:
Once the event loop is invoked with ‘N’ signals/updates, any new signals/updates will not run until next event loop tick.
Any signal/update whose await condition changes during the current event loop tick will be continued before the end of the current event loop tick and before the new signals/updates start to process.
I would rephrase this as “once the task is received with N signals/updates, any new signals/updates will not run until the next task (except in certain workflow completion situations where it may fail-then-retry the final task)”. The event loop has nothing to do with this part, it’s just an implementation detail of task processing.
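A small sketch of that rephrasing, again as a toy model in plain Java and not the real server or SDK: `TaskBatchingSketch`, `signal`, and `nextTask` are hypothetical names, and the workflow completion exception mentioned above is deliberately left out. It only shows that whatever arrives after a task has been handed to the worker ends up in the next task.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Toy model (hypothetical names, not Temporal server or SDK code).
final class TaskBatchingSketch {
    // Signals/updates the toy "server" has recorded but not yet delivered in a task.
    private final Deque<String> pending = new ArrayDeque<>();

    void signal(String name) {
        pending.add(name);
    }

    // Hands the worker everything pending right now as one task.
    List<String> nextTask() {
        List<String> task = new ArrayList<>(pending);
        pending.clear();
        return task;
    }

    static List<List<String>> demo() {
        TaskBatchingSketch server = new TaskBatchingSketch();
        server.signal("update1");
        server.signal("update2");
        List<String> first = server.nextTask();  // worker starts processing this task
        server.signal("signal3");                // arrives while the first task is processing
        List<String> second = server.nextTask(); // delivered on the next task
        return List.of(first, second);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [[update1, update2], [signal3]]
    }
}
```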
Kinda. A simpler way of saying it is “a workflow task will not complete successfully while any wait conditions evaluate to true”.