Batch Processing vs Multiple Workflows

We have a simple workflow (shown in a diagram in the original post) that processes a file through a pipeline of Activities ending with "Save" and "Update Status" steps.

The workflow starts out with a file that has potentially millions of records in it that need to be processed. I can see a few options for processing this:

  1. Create a single workflow with an Activity at each stage of processing. Each Activity takes a reference to a file, processes the file, and returns a reference to the newly created processed file, which is then passed on to the next Activity, and so on. Each Activity can use heartbeat details to resume from a particular point if there are any issues during processing (see the sketch after this list).
  2. The original file is opened and each object is processed in a loop within the workflow. Each Activity would then only process a single object, so the object could probably be passed through as an argument rather than referenced in a file.
  3. The original file is opened and a separate workflow is spawned for each object in the file. Each Activity would then only process a single object, so the object could probably be passed through as an argument rather than referenced in a file. Each child workflow would need to be tracked so that the parent workflow completes once all children are complete.
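
For illustration, here is a minimal Go sketch of Option 1, assuming hypothetical activity names ("TransformFile", "SaveFile", "UpdateStatus") registered on a worker, where each Activity takes a file reference and returns a reference to the file it produced:

```go
package batch

import (
	"time"

	"go.temporal.io/sdk/workflow"
)

// Option 1 sketch: a single workflow chains file-level activities.
// Each activity receives a reference to a file (e.g. an S3 key) and
// returns a reference to the file it produced. The activity names are
// hypothetical placeholders.
func FilePipelineWorkflow(ctx workflow.Context, inputFileRef string) error {
	ao := workflow.ActivityOptions{
		StartToCloseTimeout: 2 * time.Hour, // long enough to process a large file
		HeartbeatTimeout:    time.Minute,   // detect a stuck or crashed worker quickly
	}
	ctx = workflow.WithActivityOptions(ctx, ao)

	var transformedRef string
	if err := workflow.ExecuteActivity(ctx, "TransformFile", inputFileRef).Get(ctx, &transformedRef); err != nil {
		return err
	}

	var savedRef string
	if err := workflow.ExecuteActivity(ctx, "SaveFile", transformedRef).Get(ctx, &savedRef); err != nil {
		return err
	}

	return workflow.ExecuteActivity(ctx, "UpdateStatus", savedRef).Get(ctx, nil)
}
```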

Option 2 has the potential to create millions of events within the single workflow, so I think the internal limit of 100k events per workflow would render this option unviable.

Option 1 & Option 3 look like they could both work.

Option 1 Pros & Cons

  1. Pro: Does not require triggering multiple workflows
  2. Con: Cannot process multiple objects in parallel

Option 3 Pros & Cons

  1. Pro: Can process multiple objects in parallel
  2. Con: Have to create a lot of workflows.
  3. Con: Cannot take advantage of any bulk operations, e.g. bulk saving of objects.

In this type of scenario, is there any advice on which approach should be taken, or pitfalls that one approach may have compared to the other?

Option 1 Pros & Cons

  1. Con: Cannot process multiple objects in parallel

You can run multiple parallel activities for different parts of the file. For example, for a large file stored in S3 you could download parts of it independently to different hosts.
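
One possible sketch of this fan-out, assuming a hypothetical "ProcessRange" activity that downloads and processes a single byte range of the object on whichever worker picks it up:

```go
package batch

import (
	"time"

	"go.temporal.io/sdk/workflow"
)

// Sketch of fanning out parallel activities over parts of a large file.
// "ProcessRange" is a hypothetical activity handling one byte range of the object.
func ParallelPartsWorkflow(ctx workflow.Context, objectKey string, objectSize, partSize int64) error {
	ao := workflow.ActivityOptions{
		StartToCloseTimeout: 30 * time.Minute,
		HeartbeatTimeout:    time.Minute,
	}
	ctx = workflow.WithActivityOptions(ctx, ao)

	var futures []workflow.Future
	for offset := int64(0); offset < objectSize; offset += partSize {
		end := offset + partSize
		if end > objectSize {
			end = objectSize
		}
		futures = append(futures, workflow.ExecuteActivity(ctx, "ProcessRange", objectKey, offset, end))
	}
	// Wait for every part; the first error fails the workflow (subject to activity retries).
	for _, f := range futures {
		if err := f.Get(ctx, nil); err != nil {
			return err
		}
	}
	return nil
}
```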

Option 3 Pros & Cons
3. Con: Cannot take advantage of any bulk operations, e.g. bulk saving of objects.

There are ways to buffer events across multiple activity invocations. For example, accumulate results from many activities on a worker and complete all of these activities asynchronously after a bulk operation is done.
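
A rough sketch of this asynchronous-completion pattern in Go, with hypothetical names (BulkSaver, bulkWrite): each activity invocation buffers its record and returns activity.ErrResultPending, and the buffered activities are completed via client.CompleteActivity once the bulk write finishes:

```go
package batch

import (
	"context"
	"sync"

	"go.temporal.io/sdk/activity"
	"go.temporal.io/sdk/client"
)

// Each SaveRecord invocation buffers its record together with its task token and
// returns activity.ErrResultPending, so the activity stays open. Once enough
// records are buffered, flush performs one bulk write and then completes every
// pending activity. The names and bulkWrite are hypothetical.
type BulkSaver struct {
	mu      sync.Mutex
	client  client.Client
	records []Record
	tokens  [][]byte
}

type Record struct{ ID, Payload string }

func (b *BulkSaver) SaveRecord(ctx context.Context, rec Record) error {
	info := activity.GetInfo(ctx)

	b.mu.Lock()
	b.records = append(b.records, rec)
	b.tokens = append(b.tokens, info.TaskToken)
	full := len(b.records) >= 1000
	b.mu.Unlock()

	if full {
		go b.flush(context.Background()) // run the bulk save outside this invocation
	}
	// Tell the SDK the result will be provided later via CompleteActivity.
	return activity.ErrResultPending
}

func (b *BulkSaver) flush(ctx context.Context) {
	b.mu.Lock()
	records, tokens := b.records, b.tokens
	b.records, b.tokens = nil, nil
	b.mu.Unlock()

	err := bulkWrite(ctx, records) // hypothetical bulk save, e.g. one bulk request to the data store
	for _, token := range tokens {
		// Complete (or fail) every buffered activity with the outcome of the bulk write.
		_ = b.client.CompleteActivity(ctx, token, nil, err)
	}
}

func bulkWrite(ctx context.Context, records []Record) error { return nil } // placeholder
```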

Scenarios

Option 1

This is the simplest approach if processing each record of the file is simple and short-lived. An activity implementation can process multiple records in parallel if it helps to speed things up.
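
For example, the activity could fan record processing out to a bounded pool of goroutines; readRecords and processRecord below are hypothetical placeholders for the real per-record work:

```go
package batch

import (
	"context"
	"sync"

	"go.temporal.io/sdk/activity"
)

// Sketch of an Option 1 activity that processes the records of a file with
// bounded internal concurrency. readRecords and processRecord are hypothetical.
func ProcessFileActivity(ctx context.Context, fileRef string) error {
	records, err := readRecords(fileRef)
	if err != nil {
		return err
	}

	sem := make(chan struct{}, 16) // at most 16 records in flight at once
	var wg sync.WaitGroup
	var mu sync.Mutex
	var firstErr error

	for i, rec := range records {
		sem <- struct{}{}
		wg.Add(1)
		go func(rec string) {
			defer wg.Done()
			defer func() { <-sem }()
			if err := processRecord(ctx, rec); err != nil {
				mu.Lock()
				if firstErr == nil {
					firstErr = err
				}
				mu.Unlock()
			}
		}(rec)

		if i%100 == 0 {
			activity.RecordHeartbeat(ctx, i) // report progress periodically
		}
	}
	wg.Wait()
	return firstErr
}

func readRecords(fileRef string) ([]string, error)        { return nil, nil } // placeholder
func processRecord(ctx context.Context, rec string) error { return nil }      // placeholder
```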

Option 2

This approach is still useful if the size of the file is bounded, as it is the simplest one.

There is a variation of this approach that works with files of unlimited size, which I call the iterator workflow. The idea is to process a part of the file and then call continue-as-new to continue processing. This way, each run of the workflow, which processes a range of records, is bounded in size. This approach also works if each record requires a child workflow for processing.
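
A minimal sketch of the iterator workflow, assuming a hypothetical "ProcessRecords" activity that handles one range of records and reports whether more remain:

```go
package batch

import (
	"time"

	"go.temporal.io/sdk/workflow"
)

const recordsPerRun = 1000 // keep each run's history well within the limits

// Iterator workflow sketch: process one range of records, then continue-as-new
// with the next offset so the history never grows unbounded.
func IteratorWorkflow(ctx workflow.Context, fileRef string, offset int) error {
	ao := workflow.ActivityOptions{
		StartToCloseTimeout: 30 * time.Minute,
		HeartbeatTimeout:    time.Minute,
	}
	ctx = workflow.WithActivityOptions(ctx, ao)

	var hasMore bool
	if err := workflow.ExecuteActivity(ctx, "ProcessRecords", fileRef, offset, recordsPerRun).Get(ctx, &hasMore); err != nil {
		return err
	}
	if !hasMore {
		return nil // reached the end of the file
	}
	// Start a fresh run with an empty history, carrying the new offset forward.
	return workflow.NewContinueAsNewError(ctx, IteratorWorkflow, fileRef, offset+recordsPerRun)
}
```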

Option 3

The child-workflow-per-record option is needed if each record requires independent orchestration which can take an unpredictable amount of time. If the number of records in the file is large, the options are either to use the iterator workflow approach described in Option 2 or to use hierarchical workflows. For example, a parent workflow with 1,000 children, each of which has 1,000 children of its own, allows starting 1 million workflows without hitting any workflow size limits.
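
A sketch of such a two-level hierarchy, with a hypothetical per-record workflow (RecordWorkflow) at the leaves; the parent passes file ranges rather than the records themselves to keep inputs small:

```go
package batch

import "go.temporal.io/sdk/workflow"

// Hierarchical fan-out sketch: the parent starts a child per partition of the
// file, and each partition child starts a workflow per record in its range.
// With ~1,000 children at each level this covers ~1 million records without
// any single workflow history growing too large.
func ParentWorkflow(ctx workflow.Context, fileRef string, totalRecords, perPartition int) error {
	var futures []workflow.ChildWorkflowFuture
	for start := 0; start < totalRecords; start += perPartition {
		end := start + perPartition
		if end > totalRecords {
			end = totalRecords
		}
		futures = append(futures, workflow.ExecuteChildWorkflow(ctx, PartitionWorkflow, fileRef, start, end))
	}
	for _, f := range futures {
		if err := f.Get(ctx, nil); err != nil {
			return err
		}
	}
	return nil
}

func PartitionWorkflow(ctx workflow.Context, fileRef string, start, end int) error {
	var futures []workflow.ChildWorkflowFuture
	for i := start; i < end; i++ {
		futures = append(futures, workflow.ExecuteChildWorkflow(ctx, RecordWorkflow, fileRef, i))
	}
	for _, f := range futures {
		if err := f.Get(ctx, nil); err != nil {
			return err
		}
	}
	return nil
}

// Placeholder for the real per-record orchestration.
func RecordWorkflow(ctx workflow.Context, fileRef string, index int) error { return nil }
```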

Thanks Maxim for the feedback.

In regards to Option 1, what is your definition of short-lived?

In the sample workflow diagram, I think that most of the Activities would complete processing a file within a couple of minutes, with some maybe taking up to 30 minutes. The “Save” Activity would be writing objects to something like Elastic, so it would take longer depending on the size of the file. I assume the “Save” Activity could also process batches in parallel to speed things up.

If we do go with more of a streaming approach, I like the idea of the iterator workflow. I’ve seen the ContinueAsNew option in some of the examples. If we did want to take advantage of bulk saving, would the steps be something similar to the following (a code sketch follows the lists below):

  1. Have a loop which gets a batch of x objects to process, keeping track of the current batch count.
  2. Create x asynchronous child workflows, where each child workflow would be similar to the workflow diagram except with the “Save” & “Update Status” Activities removed.
  3. Wait for all child workflows to complete.
  4. Get back the results from the x child workflows and then pass them to the “Save” Activity.
  5. Based on the number of iterations of the loop, call “Continue As New”.
  6. Once all objects are processed, call the “Update Status” Activity.

This would:

  1. Avoid any issues with event or history limits
  2. Allow objects to be processed in parallel
  3. Allow bulk saving of objects
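
Something like this rough Go sketch of the proposed steps, assuming hypothetical “GetBatch”, “Save” and “UpdateStatus” activities and a hypothetical ProcessObject child workflow (the child mirrors the diagram minus the “Save” and “Update Status” steps):

```go
package batch

import (
	"time"

	"go.temporal.io/sdk/workflow"
)

const batchesPerRun = 10 // continue-as-new after this many batches to bound the history

// Sketch of the proposed steps; all names are hypothetical placeholders.
func BulkIteratorWorkflow(ctx workflow.Context, fileRef string, offset, batchSize int) error {
	ao := workflow.ActivityOptions{
		StartToCloseTimeout: 30 * time.Minute,
		HeartbeatTimeout:    time.Minute,
	}
	ctx = workflow.WithActivityOptions(ctx, ao)

	for i := 0; i < batchesPerRun; i++ {
		// 1. Get the next batch of objects.
		var batch []Object
		if err := workflow.ExecuteActivity(ctx, "GetBatch", fileRef, offset, batchSize).Get(ctx, &batch); err != nil {
			return err
		}
		if len(batch) == 0 {
			// 6. All objects processed: update the status and finish.
			return workflow.ExecuteActivity(ctx, "UpdateStatus", fileRef).Get(ctx, nil)
		}

		// 2. Start one child workflow per object in the batch.
		var futures []workflow.ChildWorkflowFuture
		for _, obj := range batch {
			futures = append(futures, workflow.ExecuteChildWorkflow(ctx, ProcessObject, obj))
		}

		// 3 & 4. Wait for all children and collect their results.
		results := make([]Result, 0, len(futures))
		for _, f := range futures {
			var r Result
			if err := f.Get(ctx, &r); err != nil {
				return err
			}
			results = append(results, r)
		}

		// 4. Bulk-save the whole batch in one activity call (keep the batch a reasonable size).
		if err := workflow.ExecuteActivity(ctx, "Save", results).Get(ctx, nil); err != nil {
			return err
		}
		offset += len(batch)
	}

	// 5. Bound the history by continuing as new with the updated offset.
	return workflow.NewContinueAsNewError(ctx, BulkIteratorWorkflow, fileRef, offset, batchSize)
}

type Object struct{ ID string }
type Result struct{ ID string }

// Placeholder for the per-object child workflow.
func ProcessObject(ctx workflow.Context, obj Object) (Result, error) { return Result{ID: obj.ID}, nil }
```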

In regards to Option 1, what is your definition of short-lived?

By short-lived I meant that the file can be processed sequentially, row by row. If processing each row takes a long time or requires some sort of state machine, a simple scanning solution is not going to work. From the Temporal point of view, an activity can run as long as needed. For such long-running activities, heartbeating is important to detect failures in a timely manner.
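
A sketch of a long-running, heartbeating activity that resumes from its last recorded row on retry; processRow is a hypothetical helper, and the heartbeat timeout itself is set in the workflow’s ActivityOptions:

```go
package batch

import (
	"context"

	"go.temporal.io/sdk/activity"
)

// The HeartbeatTimeout configured by the calling workflow lets the server
// detect a dead worker quickly, and on retry the activity resumes from the
// last heartbeated row instead of starting over.
func ScanFileActivity(ctx context.Context, fileRef string, totalRows int) error {
	startRow := 0
	if activity.HasHeartbeatDetails(ctx) {
		// A previous attempt recorded progress; pick up where it left off.
		if err := activity.GetHeartbeatDetails(ctx, &startRow); err != nil {
			startRow = 0
		}
	}

	for row := startRow; row < totalRows; row++ {
		if err := processRow(ctx, fileRef, row); err != nil {
			return err
		}
		if row%100 == 0 {
			activity.RecordHeartbeat(ctx, row) // record progress and keep the activity alive
		}
	}
	return nil
}

func processRow(ctx context.Context, fileRef string, row int) error { return nil } // placeholder
```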

The proposed design looks good to me. Make sure that the “bulk saving” activity input is of reasonable size.

Thanks Maxim.