Below is a somewhat contrived example “mega workflow.” This workflow contains multiple child workflows, some of which (A and B, D and E) can run in parallel. Some workflows (C and F) depend on upstream workflows having completed. Data is generated during child workflows and needs to be passed downstream. As another wrinkle, child workflow F should be started automatically after D finishes, and after workflow E completes, a user may choose whether to run F one or more additional times.
Each child workflow is its own well-defined unit which may depend on data having been generated upstream.
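To make the shape concrete, here is a rough sketch of the dependencies in plain Python `asyncio`, not in any workflow engine's API. The exact edges are my assumption (C waits on A and B, D and E both wait on C), and `child`, `mega_workflow`, and `rerun_f_times` are made-up names; `rerun_f_times` just stands in for the user's decision to run F again after E.

```python
import asyncio


async def child(name: str, inputs: dict) -> dict:
    """Stand-in for one child workflow: consumes upstream data, returns its own."""
    await asyncio.sleep(0)  # placeholder for the real work
    return {"name": name, "inputs": sorted(inputs)}


async def mega_workflow(rerun_f_times: int = 0) -> dict:
    results: dict = {}

    # A and B have no upstream dependencies, so they run in parallel.
    results["A"], results["B"] = await asyncio.gather(
        child("A", {}), child("B", {})
    )

    # Assumed edge: C needs the data produced by both A and B.
    results["C"] = await child("C", {"A": results["A"], "B": results["B"]})

    async def d_then_f():
        # F starts automatically as soon as D finishes, without waiting for E.
        d = await child("D", {"C": results["C"]})
        f = await child("F", {"D": d})
        return d, f

    # D (followed by F) and E run in parallel; both are assumed to depend on C.
    (results["D"], first_f), results["E"] = await asyncio.gather(
        d_then_f(), child("E", {"C": results["C"]})
    )

    # After E completes, the user may ask for F to run again, possibly several times.
    f_runs = [first_f]
    for _ in range(rerun_f_times):
        f_runs.append(await child("F", {"D": results["D"], "E": results["E"]}))
    results["F"] = f_runs
    return results


if __name__ == "__main__":
    out = asyncio.run(mega_workflow(rerun_f_times=1))
    print(sorted(out))
```

In this sketch the parent passes every piece of data downstream itself, which is only one of the options I'm asking about below.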
I’m trying to understand the following:
- when should a workflow contain multiple child workflows vs. all of these workflows being independent and something else determining when to kick things off?
- activities may call an endpoint A which generates and returns data. Some of this data may be required in downstream tasks/child workflows/other independent workflows. When should the workflow take data that is returned from an activity (is this possible?) and shuffle it around to other tasks/workflows, and when should downstream tasks/workflows have activities for getting that data another way (i.e. an activity which makes a separate API call to get data which was generated from endpoint A)?
- how do you draw the line between business logic in the workflow code vs. in external systems? For example, let’s say multiple conditions must be met before starting child workflow A. When should that logic be in the workflow vs. an external system that knows when the conditions are met and signals A to start?
Please let me know if these questions do not make any sense, and thank you for the help!