Design for coordinator workflow with potentially large history

Also, what are the implications of having more than 50000 history events? I didn’t see it being configurable, so there must be some restrictions on your side.

The recovery time of a workflow grows with its history size. In some situations, the frontends can also run out of memory if the history is too large (this should eventually be fixed).

I recommend the following workaround. Do not rely on the childFuture to get notified about a child’s completion (until issue #680 is implemented), as it doesn’t play well with continue-as-new. Instead, each child can send a signal to the parent, addressed by the parent’s WorkflowID, to report its completion. The parent can then call continue-as-new and still wait for all of its children to complete, receiving the completions as signals. Make sure to start the children asynchronously so that they keep executing after the parent’s continue-as-new call.
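To make the bookkeeping concrete, here is a minimal sketch in plain Go of the state the parent would need to carry across continue-as-new. It deliberately uses no SDK calls: `ParentState`, `NewParentState`, `HandleCompletion`, `Snapshot`, and `maxSignalsPerRun` are all hypothetical names for illustration, and the loop in `main` only simulates signal delivery and the continue-as-new boundary. In a real workflow the signals would arrive on a signal channel and the snapshot would be passed as the input of the next run.

```go
package main

import (
	"fmt"
	"sort"
)

// maxSignalsPerRun is a hypothetical budget: after handling this many
// completion signals, the parent continues-as-new to keep its history short.
const maxSignalsPerRun = 2

// ParentState is the (hypothetical) state carried from one parent run to the
// next: the IDs of children that have not yet reported completion.
type ParentState struct {
	Pending map[string]bool
}

// NewParentState builds the state from a list of pending child workflow IDs.
func NewParentState(childIDs []string) ParentState {
	pending := make(map[string]bool, len(childIDs))
	for _, id := range childIDs {
		pending[id] = true
	}
	return ParentState{Pending: pending}
}

// HandleCompletion records one completion signal sent by a child.
// Unknown or duplicate IDs are ignored.
func (s ParentState) HandleCompletion(childID string) {
	delete(s.Pending, childID)
}

// Done reports whether every child has signaled completion.
func (s ParentState) Done() bool {
	return len(s.Pending) == 0
}

// Snapshot returns the pending IDs in sorted order, i.e. the argument the
// parent would hand to its next run when it continues-as-new.
func (s ParentState) Snapshot() []string {
	ids := make([]string, 0, len(s.Pending))
	for id := range s.Pending {
		ids = append(ids, id)
	}
	sort.Strings(ids)
	return ids
}

func main() {
	state := NewParentState([]string{"child-1", "child-2", "child-3"})
	// Simulated completion signals, in arrival order.
	signals := []string{"child-2", "child-1", "child-3"}

	runs, handled := 1, 0
	for _, id := range signals {
		if handled == maxSignalsPerRun {
			// Simulated continue-as-new: a fresh run starts from the snapshot.
			state = NewParentState(state.Snapshot())
			runs++
			handled = 0
		}
		state.HandleCompletion(id)
		handled++
	}
	fmt.Println("runs:", runs, "all children done:", state.Done())
}
```

The point of the design is that the pending set is the only thing the new run needs: since completions arrive as signals addressed to the parent’s WorkflowID rather than through a future tied to one run, the parent can resume waiting from the snapshot alone.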
