I have 2 Docker containers, each of which has its own (Python) Worker. Both of them are identical.
A workflow is created from the (Java) application by calling newUntypedWorkflowStub.
Then, after some time, I want to know the status of the running workflow by calling the query method on the workflowStub (Java).
I am facing the following problem:
The workflow can be picked up by either (Python) Worker. But when query is called, the Worker that is not executing the workflow can pick up the query request and return incorrect information.
Is there a way to force a single Worker to be responsible for all queries that come to this workflow?
A workflow's state is the result of replaying its event history in a deterministic way. Querying a workflow is simply asking for some state at the end of the current history. To that end, it shouldn't matter which worker replays the workflow event history to handle the query. If it does, your workflow tasks are not deterministic.
Said another way, workflow query results should not be tied to currently running activities; they should be computed based on all completed activities.
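For illustration, here is a minimal (Python) sketch of that idea, where the query answer is derived only from workflow state set by completed activities; the process_input activity, workflow name, and timeout are placeholders of mine, not code from this thread:

```python
from datetime import timedelta
from temporalio import activity, workflow

@activity.defn
async def process_input(data: str) -> str:
    # Placeholder for the long-running (Python) logic.
    return f"processed: {data}"

@workflow.defn
class StatusWorkflow:
    def __init__(self) -> None:
        self._status = "started"

    @workflow.run
    async def run(self, data: str) -> str:
        self._status = "processing"
        result = await workflow.execute_activity(
            process_input,
            data,
            start_to_close_timeout=timedelta(hours=1),
        )
        # State changes only when the activity *completes*, so replaying
        # the history on any worker yields the same query answer.
        self._status = "done"
        return result

    @workflow.query
    def status(self) -> str:
        return self._status
```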
In that case, what can you suggest for the following?
Depending on input from the Java client, I execute (Python) logic that, during runtime (a single activity), writes to a file. This execution can take quite a while. During execution, I want to see the content of that file from the Java client. How can I do that?
Also, on this point:
Interestingly, I observe that when a (Python) Worker is assigned to a workflow and I execute a query from the Java client, the query is ‘assigned’ to a ‘random’ Worker. But after it is assigned, all subsequent query requests from the Java client go to the same Worker that was assigned at the beginning.
Meaning that if I query the workflow from the Java client and receive status updates, I will keep receiving them; if I didn't receive them the first time, I will keep not receiving them.
The question is: if a query is not connected to any Worker and is simply asking for state, why is it asking for the state from only a single Worker?
For this reason, I assumed that if queries go to a single Worker anyway, I could assign that Worker at the beginning.
I suspect (but am not an expert) that there is some kind of pinning/favoring that occurs to optimise performance: Temporal keeps a sticky task queue per worker so that workflow state can be cached instead of replayed from scratch, and queries are dispatched to that sticky queue first, falling back to another worker if it becomes unavailable.
As for your use case: if you need updates on the state mid-activity, the ways I can think of to do that would be:
1. Use a state store (such as a database) to mark progress, and read from that state store either directly or in the Query. (Edit: I’m actually not sure if reading a state store on a Query violates the rules of determinism in workflow code…)
2. Chunk the logic into smaller activities that can update workflow state (sketched below).
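A minimal sketch of option 2, assuming the long-running work can be split into chunks; process_chunk and the timeout are placeholders of mine:

```python
from datetime import timedelta
from temporalio import activity, workflow

@activity.defn
async def process_chunk(chunk: str) -> None:
    # Placeholder for one slice of the long-running logic.
    ...

@workflow.defn
class ChunkedWorkflow:
    def __init__(self) -> None:
        self._completed_chunks = 0

    @workflow.run
    async def run(self, chunks: list[str]) -> None:
        for chunk in chunks:
            await workflow.execute_activity(
                process_chunk,
                chunk,
                start_to_close_timeout=timedelta(minutes=10),
            )
            # Progress advances only on completed activities, so the
            # query below is deterministic on any worker.
            self._completed_chunks += 1

    @workflow.query
    def progress(self) -> int:
        return self._completed_chunks
```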
This setup was chosen so that the Python container does not interact with anything but the Temporal client.
There is only 1 activity that does all the Python logic.
I tried this approach, but was not able to pass information from Python to Java. I don’t remember the exact reason now; I believe it was because the Java client cannot read from the heartbeat. (I might be wrong here.)
It should not matter where a workflow runs. Workflows are deterministic and should have nothing different about them based on where they run. For activities that interact with the filesystem or other external things it can matter, but not for workflows. What is the code doing that could return different data for a workflow query on different workers?
The code, while running, does some operations whose results are written to a file. The problem arises because the file is specific to the Docker container in which the Worker is running. The result file would be the same for any container the code runs in, but during execution of a workflow, only 1 Docker container has this file…
Yes, the results are written by an activity; however, they are written to the local storage of the Docker container, and the Worker in the other Docker container has no access to it. Thus, when a query is executed on a Worker, it reads the file from local storage, and depending on which Worker is assigned, the result is different: either the correct result or ‘no file found’.
So it sounds like you just need to make sure that multiple activities go to the same worker so they can share disk contents. This is a normal use case we call “worker-specific task queues”, and we have a sample for it. Basically, if you need multiple activities to share the same worker, you need a task queue specific to that worker: you ask any worker on the general task queue for a task queue name specific to it, then you use that worker-specific task queue for each activity in your set. This ensures the activities are sent to the same worker.
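A minimal sketch of that pattern in Python, following the idea of the worker-specific-task-queues sample rather than quoting it; the activity names, file path, and timeouts are placeholders of mine, and the wiring that starts a second Worker on the unique queue is not shown:

```python
import uuid
from datetime import timedelta
from temporalio import activity, workflow

# Generated once per worker process; the same process must also run a
# Worker polling this queue (assumed, not shown here).
WORKER_SPECIFIC_QUEUE = f"worker-{uuid.uuid4()}"

@activity.defn
async def get_worker_task_queue() -> str:
    # Runs on the general task queue; whichever worker picks it up
    # answers with its own unique queue name.
    return WORKER_SPECIFIC_QUEUE

@activity.defn
async def write_result(content: str) -> None:
    # Placeholder for the long-running logic that writes locally.
    with open("/tmp/result.txt", "w") as f:
        f.write(content)

@activity.defn
async def read_result() -> str:
    with open("/tmp/result.txt") as f:
        return f.read()

@workflow.defn
class FileWorkflow:
    @workflow.run
    async def run(self, data: str) -> str:
        # Ask any worker on the general task queue for its personal queue...
        queue = await workflow.execute_activity(
            get_worker_task_queue,
            start_to_close_timeout=timedelta(seconds=10),
        )
        # ...then pin every file-touching activity to that same worker.
        await workflow.execute_activity(
            write_result,
            data,
            start_to_close_timeout=timedelta(hours=1),
            task_queue=queue,
        )
        return await workflow.execute_activity(
            read_result,
            start_to_close_timeout=timedelta(seconds=10),
            task_queue=queue,
        )
```

Note that queries can still be handled by any worker, so the file contents should be surfaced through an activity on the worker-specific queue (or a shared store) and recorded in workflow state, not read from disk inside the query handler.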