The problem of parent workflow waiting for child workflow


I have a virtual machine scale out process. I will call the virtual machine platform to create the machine, and then call initialization process.

But the virtual machine platform returns one by one. I will create an initialization process for each machine, but I want to wait for the initialization process of all machines to end in the scale out process. What’s the best practice?

I’m not sure I fully understand the question. All SDKs support async invocation of child workflows and activities and waiting for all the futures/promises to complete. Which SDK are you planning to use?

What is the reason in your diagram that you use activity to create a machine and then start “Server Init” workflows on callback? To me the more natural flow would be starting a child workflow per server and let the child workflow handle all the activities related to that server including “create machine” one.

func workflow(ctx workflow.Context, ...) error {
	
	selector := workflow.NewSelector(ctx)
	errors := []error{}

	createInitActivity := func(index int) {
		activityCtx := workflow.WithActivityOptions(ctx, newActivityOptions())

		future := workflow.ExecuteActivity(activityCtx, createInitActivity)
		selector.AddFuture(future, func(f workflow.Future) {
			errors[index] = f.Get(activityCtx, nil)
		})
	}

	for index := 0; index < 100; index++ {
		createInitActivity(index)
	}
	for _, err := range errors {
		if err != nil {
			workflow.GetLogger(ctx).Info("error countered", zap.Error(err))
			return err
		}
	}

	... other logic here 
}

func createInitActivity(ctx context.Context, ...) error {
	// create the machine
	// initialization
}

This is my current process. If creating a machine fails, how does temporary save the state

func ScaleOut(ctx workflow.Context, count int) (ScaleResult, error) {
	var (
		err         error
		workflowCtx = workflow.WithActivityOptions(ctx, defaultTaskOptions)
		logger      = workflow.GetLogger(workflowCtx)
	)
	var result = ScaleResult{
		Success: make([]string, 0),
		Failure: make([]string, 0),
		Errors:  make([]string, 0),
	}

	// batch creation of virtual machines
    // response : {"success": true, servers: ["1.1.1.1", "1.1.1.2"]}
	var response components.CreateServerResponse
	if err = workflow.ExecuteActivity(workflowCtx, CreateVMServers, count).Get(workflowCtx, &response); err != nil {
		result.Errors = append(result.Errors, err.Error())
		logger.Error("scale out error", "error", err)
		return result, err
	}

	if !response.Success {
		result.Errors = append(result.Errors, fmt.Sprintf("create vm failed [%s]", response.RequestID))
		return result, nil
	}

	selector := workflow.NewSelector(workflowCtx)
	childCtx, _ := workflow.WithCancel(ctx)
	for _, server := range response.servers {
        // create server init flow for each server
		f := workflow.ExecuteChildWorkflow(childCtx, ServerInit, server)
		selector.AddFuture(f, func(f workflow.Future) {
			var resp ScaleServerResult
			err := f.Get(childCtx, &resp)
			if err != nil {
				logger.Error("linux server init error", "error", err.Error())
				return
			}
			if resp.Success {
				result.Success = append(result.Success, resp.IP)
				return
			}
			result.Failure = append(result.Failure, resp.ServiceTag)
			if resp.Message != "" {
				result.Errors = append(result.Errors, resp.Message)
			}
		})
	}
    // wait all server finished
	for i := 0; i < response.Count(); i++ {
		selector.Select(childCtx)
	}
	return result, nil
}

If creating a machine fails, how does temporary save the state

All the state including local variables and blocking calls is automatically preserved in the DB Temporal service relies on. So you don’t need to do any state preservation in your application explicitly.

I don’t understand the meaning of the following code:

	childCtx, _ := workflow.WithCancel(ctx)

dead code

Failure in the process of creating a machine, such as expecting 100, only creating to 50