Child-workflows + Signals

Greetings. We have a question re. child-workflows. Imagine we have a parent workflow that launches N child-workflows. After launch, we want to send each child workflow at least one signal.

The instruction(s) to launch the child-workflows go back in the workflow-task-response, but I don’t think the signals do. Ergo, if you don’t manage the sequencing, you can get into a situation where Temporal attempts to process the signal before the child workflow has been created. It doesn’t happen all the time, but it does happen.

This seems logical and expected behavior. The question is: how do we deal with it? How do we know the child-workflow requests were sent to Temporal before sending the signals?

I think the solution is to force another execution of the parent-workflow. That way, you know the child-workflow requests went back to Temporal and the result has round-tripped back to the workflow, so now you can send the signals (from within an activity, obviously). A simple approach might be, for example: create the child workflows, then add a timer (forces a new workflow task?), then send the signals.

Are there any other ways this could be done?

Note: we had asked a similar question 1-2 weeks back, re. startWithSignal(..). That isn’t a viable solution here, as we can’t find an async form of it.

Related: from the parent-workflow, is there a concrete way of knowing that the child-workflows were successfully created, i.e. executed at least one workflow-task? I mean, we can put a timer on the parent, then check all the child-workflow promises to see if they errored or not.

Many thanks!

Sean

Child workflow start is async, and you should wait for the child to start before signaling it.
See this post on how to wait for it to start: Best way to create an async child workflow - #2 by maxim
and also Java SDK sample: samples-java/src/main/java/io/temporal/samples/asyncchild at master · temporalio/samples-java · GitHub

Note that calling get on the promise (childExecution.get() as in the sample) waits for the child workflow to start. You should be able to signal it after that returns successfully.
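To make that concrete, here is a minimal sketch of the pattern for N children, assuming the Temporal Java SDK is on the classpath. The ChildWorkflow and ParentWorkflow interfaces, the proceed signal and the "input-" / "go" values are hypothetical names made up for illustration; they are not taken from the sample. It only shows the ordering: start each child asynchronously, wait on its start promise, then signal.

```java
import io.temporal.api.common.v1.WorkflowExecution;
import io.temporal.workflow.Async;
import io.temporal.workflow.Promise;
import io.temporal.workflow.SignalMethod;
import io.temporal.workflow.Workflow;
import io.temporal.workflow.WorkflowInterface;
import io.temporal.workflow.WorkflowMethod;
import java.util.ArrayList;
import java.util.List;

public class ParentChildSignalSketch {

  // Hypothetical child contract, for illustration only.
  @WorkflowInterface
  public interface ChildWorkflow {
    @WorkflowMethod
    String run(String input);

    @SignalMethod
    void proceed(String message);
  }

  // Hypothetical parent contract, for illustration only.
  @WorkflowInterface
  public interface ParentWorkflow {
    @WorkflowMethod
    void run(int childCount);
  }

  public static class ParentWorkflowImpl implements ParentWorkflow {
    @Override
    public void run(int childCount) {
      List<ChildWorkflow> children = new ArrayList<>();
      List<Promise<WorkflowExecution>> starts = new ArrayList<>();
      List<Promise<String>> results = new ArrayList<>();

      for (int i = 0; i < childCount; i++) {
        ChildWorkflow child = Workflow.newChildWorkflowStub(ChildWorkflow.class);
        // Result promise: completes when the child workflow finishes.
        results.add(Async.function(child::run, "input-" + i));
        // Start promise: completes once the child has been started.
        starts.add(Workflow.getWorkflowExecution(child));
        children.add(child);
      }

      for (int i = 0; i < childCount; i++) {
        // Block until this child is started; this throws if the start failed.
        starts.get(i).get();
        // Only now send the signal through the child stub.
        children.get(i).proceed("go");
      }

      // Optionally wait for every child to complete.
      Promise.allOf(results).get();
    }
  }
}
```

The point of keeping two lists of promises is that they answer different questions: the one from Workflow.getWorkflowExecution tells you the child was started, the one from Async.function tells you it finished. Waiting on the first before signaling should avoid the race without needing a timer.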

Thanks. Though, unfortunate. That’s exactly what we are doing at the moment. We’ll re-file as a bug – with samples.

Greetings again. About that sample and childExecution.get(): our understanding is that the promise should only complete when the child workflow has exited. In fact, that’s the behavior we see at runtime. The sample is not conclusive.

Checking myself. We’re seeing a problem with 100s of child-workflows. Trying to build you a repeatable case. We’re seeing workflow-tasks time out after they have started, and suspect a performance problem in the replay logic.

Our understanding is that the promise should only complete when the child workflow has exited.

This should depend on the set ChildWorkflowOptions.ParentClosePolicy. If you set it to ABANDON the get returns when the child workflow is started, not completed.

Your understanding is incorrect. This promise is complete as soon as the child workflow has started.

The ParentClosePolicy applies only when a parent workflow completes. So it doesn’t really affect how the child workflow is started.