Custom ScheduleToStartTimeout for a ChildWorkflow

Hello,

We have a use case where a parent workflow starts a child workflow that runs on a different worker (i.e., it uses a separate task queue).

I need to ensure that if the child workflow does not start within X minutes after being scheduled, the system should terminate or close the child workflow.

What is the recommended way to implement this ScheduleToStartTimeout behavior for a child workflow in Temporal?

I’m using the Go SDK.

Any advice or example would be greatly appreciated!

Thanks in advance!

I need to ensure that if the child workflow does not start within X minutes after being scheduled, the system should terminate or close the child workflow.

There is a bit of nuance here, depending on what you mean by “started”. If your workflow.ExecuteChildWorkflow returns an error (in the Go SDK this surfaces when you wait on the future from GetChildWorkflowExecution()), the server was never able to start the child workflow, potentially because of the workflow ID reuse policy or things like intermittent server/DB issues. You can then attempt to schedule the child workflow again from workflow code if you want.

What I think you mean is that the first workflow task for the child execution is not dispatched to a worker within X minutes. That happens when no workers are polling workflow tasks on the child task queue, when those workers are having intermittent issues themselves, or when they are not provisioned to handle burst load (workflow task schedule-to-start latencies exceeding X minutes).

For this latter case I think you have two options. If the duration X you mention is well above the typical execution time of this child workflow, you could set WorkflowRunTimeout in ChildWorkflowOptions. If it's not, and your ExecuteChildWorkflow returns no error (so the service started the child execution), you could have the parent wait in workflow code for a signal back from the child (workflow.AwaitWithTimeout) with a duration of X; if the child does not signal the parent within X, run an activity (it can be a local activity) that terminates the child. Cancellation won't work in this case: if your child workflow task queue workers are really down, the cancellation request can never be delivered to them.
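
Here is a rough Go sketch of the second option, in case it helps. The signal name, task queue name, the Activities struct and the 5-minute value are placeholders I made up. It assumes the child signals its parent as the very first thing it does, and that the worker running the activity has a client.Client available for TerminateWorkflow. Treat it as a starting point, not tested against your setup.

```go
package sample

import (
	"context"
	"time"

	"go.temporal.io/sdk/client"
	"go.temporal.io/sdk/workflow"
)

const (
	childTaskQueue     = "child-task-queue" // hypothetical name
	childStartedSignal = "child-started"    // hypothetical name
	startDeadline      = 5 * time.Minute    // the "X minutes" from the question
)

// Activities holds the client used to terminate a stuck child (assumed to be
// set when the activity worker is created).
type Activities struct{ Client client.Client }

// TerminateChild terminates the child. Termination is handled by the server,
// so it works even when no worker polls the child's task queue.
func (a *Activities) TerminateChild(ctx context.Context, workflowID, runID string) error {
	return a.Client.TerminateWorkflow(ctx, workflowID, runID, "child did not start within deadline")
}

// ChildWorkflow tells its parent that its first workflow task has run.
func ChildWorkflow(ctx workflow.Context) error {
	if p := workflow.GetInfo(ctx).ParentWorkflowExecution; p != nil {
		err := workflow.SignalExternalWorkflow(ctx, p.ID, p.RunID, childStartedSignal, nil).Get(ctx, nil)
		if err != nil {
			return err
		}
	}
	// ... the child's real work goes here ...
	return nil
}

func ParentWorkflow(ctx workflow.Context) error {
	childCtx := workflow.WithChildOptions(ctx, workflow.ChildWorkflowOptions{
		TaskQueue: childTaskQueue,
		// Option 1 instead: WorkflowRunTimeout: startDeadline,
	})
	child := workflow.ExecuteChildWorkflow(childCtx, ChildWorkflow)

	// An error here means the server could not start the child at all.
	var exec workflow.Execution
	if err := child.GetChildWorkflowExecution().Get(ctx, &exec); err != nil {
		return err
	}

	// Record the "started" signal from the child in the background.
	started := false
	workflow.Go(ctx, func(gctx workflow.Context) {
		workflow.GetSignalChannel(gctx, childStartedSignal).Receive(gctx, nil)
		started = true
	})

	// Wait up to X for the signal (or for the child to finish outright).
	ok, err := workflow.AwaitWithTimeout(ctx, startDeadline, func() bool {
		return started || child.IsReady()
	})
	if err != nil {
		return err
	}
	if !ok {
		// Timed out: terminate the child from a local activity.
		lactx := workflow.WithLocalActivityOptions(ctx, workflow.LocalActivityOptions{
			StartToCloseTimeout: 10 * time.Second,
		})
		var a *Activities
		if err := workflow.ExecuteLocalActivity(lactx, a.TerminateChild, exec.ID, exec.RunID).Get(ctx, nil); err != nil {
			return err
		}
	}
	// Returns the child's result, or an error if it was terminated.
	return child.Get(ctx, nil)
}
```

The child's start signal is the only thing the parent can observe here, so the child needs to send it before doing anything else.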

The only other potential option that I could think of is a “background workflow” that periodically checks running executions of the child workflow type on the child task queue, tests whether “current time - workflow execution start time” > X, and if so terminates those child workflows. The termination would then be delivered to their parent executions as a child workflow failure.

What's the exact use case for this scenario? Can you give more info?