Hello Team,
For a running workflow, I have a query method that returns the (custom) state of the workflow, which is held in a property inside the workflow.
When I click the Query tab in the Web UI for the selected method, I see many instances of the workflow getting created, and they never stop.
In the workflow I do have a loop, but it contains a Workflow.await(() -> { return ... }) without any timeout, waiting for signals. Is it that when the query is called from the UI, the events are replayed into a new workflow instance and the result is then returned?
If your execution has been evicted from the worker cache (or the worker was restarted before you made the query request), the worker has to replay the execution internally in order to deliver the query result to the caller.
I am seeing many instances of the workflow are getting created
Could you share your workflow code please? I am not sure how query / replay would cause workflow executions to be started.
I am not sure how query / replay would cause workflow executions to be started.
Sorry, I did not put my point correctly.
What I meant was that the workflow constructor is getting called many times during replay of the events, and yes, new workflow executions are not getting created in the UI. I can see many log statements of the form Instance created TestWorkflowImpl@xxxx.
Below is a snippet of the workflow. My actual workflow follows the same pattern: it polls new signals from the queue in a loop, and exits the loop and the workflow once a certain end state is reached.
import io.temporal.activity.ActivityOptions;
import io.temporal.workflow.Workflow;
import java.time.Duration;
import java.util.ArrayDeque;
import java.util.Queue;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class TestWorkflowImpl implements TestWorkflow {
    private String currentState;
    private boolean exitWorkflow = false;
    ActivityOptions options = ActivityOptions.newBuilder()
            .setStartToCloseTimeout(Duration.ofSeconds(30))
            .build();
    private Queue<String> signalQueue = new ArrayDeque<>();
    // Plain SLF4J logger: it also logs during replay
    // (Workflow.getLogger(...) would omit replayed log lines by default).
    private Logger LOG = LoggerFactory.getLogger(TestWorkflowImpl.class);

    public TestWorkflowImpl() {
        // Runs every time the worker (re)creates the instance, including for replay.
        LOG.info("Instance created {}", this);
    }

    @Override
    public String execute() {
        do {
            // Block (with no timeout) until at least one signal is queued.
            Workflow.await(() -> {
                return signalQueue.size() > 0;
            });
            String signal = signalQueue.peek();
            if (signal != null) {
                LOG.info("[{}] Received signal [{}]", this, signal);
                // perform some state machine logic by calling activities and updating new state
                // to currentState property
            }
            signalQueue.poll();
            LOG.info("[{}] Signal processed successfully [ signal = {}] ", this, signal);
        } while (!exitWorkflow);
        return currentState;
    }

    @Override
    public void captureSignal(String name) {
        this.signalQueue.add(name);
    }

    @Override
    public String getCurrentState() {
        return this.currentState;
    }
}
I tried the same with this snippet and triggered workflow creation using signalWithStart, but when calling the query in the UI I could see the constructor getting called only a couple of times (not infinitely). Even if my actual workflow impl is incorrect, it should not call the constructor again and again, and I do not see any exceptions propagated that would trigger creation of the workflow instance again and again.
calling query in UI I could see only couple of times the constructor is getting called
This is, I think, expected behavior and has to do with worker replay. Your worker caches the workflow executions that it's processing. If the cache is full, the worker evicts some execution from its cache in order to allow other executions to make progress (you can configure the worker cache; in the Java SDK the cache size is set through WorkerFactoryOptions).
If the execution is no longer in the worker cache when you query it (either it got evicted or you restarted your worker, which clears its in-memory cache), the impl needs to be re-created, and the worker then does an internal replay, which means running your workflow code again from the top and comparing it to the event history recorded so far.
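To make the eviction-and-replay behavior above more concrete, here is a minimal toy model in plain Java. It does not use the Temporal SDK at all: ToyWorkflow, ToyWorker, the "AFTER_" state naming and the cache size of 1 are all hypothetical names and values chosen for illustration, and the real worker is considerably more involved. It only shows why a cache miss leads to another constructor call followed by a replay of the recorded history.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class ReplayDemo {

    /** Toy stand-in for a workflow implementation (not a Temporal class). */
    static class ToyWorkflow {
        private String currentState = "NEW";

        // Deterministic transition: the same signals always give the same state.
        void applySignal(String signal) {
            currentState = "AFTER_" + signal;
        }

        String getCurrentState() {
            return currentState;
        }
    }

    /** Toy worker: durable signal history plus a bounded LRU cache of live instances. */
    static class ToyWorker {
        private final Map<String, List<String>> histories = new HashMap<>();
        private final LinkedHashMap<String, ToyWorkflow> cache;
        private int instancesCreated = 0;

        ToyWorker(final int cacheSize) {
            // Access-ordered map that drops the least recently used entry when full.
            this.cache = new LinkedHashMap<String, ToyWorkflow>(16, 0.75f, true) {
                @Override
                protected boolean removeEldestEntry(Map.Entry<String, ToyWorkflow> eldest) {
                    return size() > cacheSize;
                }
            };
        }

        // Returns the cached instance, or re-creates it and replays its history.
        private ToyWorkflow resolve(String workflowId) {
            ToyWorkflow wf = cache.get(workflowId);
            if (wf == null) {
                wf = new ToyWorkflow();   // constructor runs again on every cache miss
                instancesCreated++;
                for (String past : histories.getOrDefault(workflowId, new ArrayList<>())) {
                    wf.applySignal(past); // replay recorded signals to rebuild state
                }
                cache.put(workflowId, wf);
            }
            return wf;
        }

        void signal(String workflowId, String signal) {
            ToyWorkflow wf = resolve(workflowId);
            histories.computeIfAbsent(workflowId, id -> new ArrayList<>()).add(signal);
            wf.applySignal(signal);
        }

        String query(String workflowId) {
            return resolve(workflowId).getCurrentState();
        }

        int instancesCreated() {
            return instancesCreated;
        }
    }

    public static void main(String[] args) {
        ToyWorker worker = new ToyWorker(1);      // cache holds a single execution
        worker.signal("wf-A", "START");           // first instance of wf-A
        worker.query("wf-A");                     // cache hit, no new instance
        worker.signal("wf-B", "START");           // wf-B created, wf-A evicted
        System.out.println(worker.query("wf-A")); // cache miss: re-create and replay
        System.out.println(worker.instancesCreated());
    }
}
```

In this sketch a query that hits the cache creates nothing new, while a query after eviction creates a fresh instance and rebuilds its state from history. A constructor log line that appears a few times around queries fits this pattern; one that repeats without pause would point to something else.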
TL;DR: during a workflow execution your workflow code can run multiple times, which is why it's important that workflow code is deterministic.
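As a rough illustration of why determinism matters for replay, here is a small plain-Java sketch (again not Temporal SDK code; DeterminismDemo, the "ScheduleActivity:" command strings and globalCounter are hypothetical names made up for this example). It runs the same "workflow logic" twice, once as the original run and once as the replay, and checks whether the replay produces the same commands as were recorded.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class DeterminismDemo {
    // Hypothetical mutable state living outside the workflow; it survives across runs.
    static int globalCounter = 0;

    // Deterministic: the produced commands depend only on the input signals.
    static List<String> deterministicRun(List<String> signals) {
        List<String> commands = new ArrayList<>();
        for (String s : signals) {
            commands.add("ScheduleActivity:" + s);
        }
        return commands;
    }

    // Non-deterministic: the commands also depend on external mutable state.
    static List<String> nonDeterministicRun(List<String> signals) {
        List<String> commands = new ArrayList<>();
        for (String s : signals) {
            commands.add("ScheduleActivity:" + s + "#" + (++globalCounter));
        }
        return commands;
    }

    // Replay check: the re-run must produce exactly what was recorded.
    static boolean replayMatches(List<String> recorded, List<String> replayed) {
        return recorded.equals(replayed);
    }

    public static void main(String[] args) {
        List<String> signals = Arrays.asList("A", "B");
        // Same commands on both runs, so the replay lines up with the record.
        System.out.println(replayMatches(deterministicRun(signals), deterministicRun(signals)));
        // The counter has moved on by the second run, so the replay diverges.
        System.out.println(replayMatches(nonDeterministicRun(signals), nonDeterministicRun(signals)));
    }
}
```

The second case diverges because the re-run reads state that changed since the first run; the same kind of mismatch can arise from wall-clock time, random numbers, or other external state read directly inside workflow code.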
Also note that querying a completed execution triggers a worker replay each time, so you would see this happen every time you query that completed execution. That's why query latencies for completed executions are typically higher than for running ones.
TLDR, during a workflow execution your workflow code can run multiple times, thats why its important that workflow code is deterministic
The instance creation did not stop for a few minutes (after that I exited the application, suspecting an issue in my workflow code). Also, I do not see anything blocked in the workflow, and the query result comes back immediately, though with many constructor calls.
Let me try that again and check for how many minutes the workflow constructor keeps getting called. If it's continuous, then I might have done something wrong in the workflow.