Thank you for the very quick reply!
Use case: The workflow will be started with 1 to 50 messages to be processed. Each message will be used as input for 1 to 3 activities (depends on the exact message content) and for signaling a workflow.
Nice to haves:
- Parallel processing of the messages. However, I couldn’t find information in the docs about how Temporal handles replaying events/activities when the order in which they complete is non-deterministic. I was wondering whether “Process results locally, possibly using parallel threads” was related to handling non-deterministic ordering of activity execution. I’ve been looking into Flux (reactor-core 3.4.5) for the parallel processing.
- If an activity for message N fails and the entire workflow must retry, messages 1 to N-1 wouldn’t have to be re-processed.
I’ll try to answer the questions you gave the previous poster, too.
- What is the maximum number of rows in the workflow input?
50
- What is the maximum size of the workflow input?
30kb
- How much processing each row requires?
DB read/write, 0 to 2 API calls, and signaling a workflow
- Does a row processing requires external API calls?
yes
- What is the longest time a single row processing can take?
around 6 seconds, barring downstream dependency failures.
- Is it OK to block processing if some row cannot be processed due to downstream dependency failures?
Yes, but if it is possible to avoid blocking, I’d like to explore that solution.
- How big is the output?
Very small, essentially just a “SUCCESS”
I am considering splitting the N messages into separate child workflows, so if a workflow has to restart due to activity failure, only N/(num child workflows) messages will be re-processed. But, from what I saw in the docs, this does not appear to be the intended use for child workflows.
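To make the splitting idea concrete, here is a rough sketch in plain Java of how the N messages could be divided into batches, one batch per child workflow. It does not use any Temporal API. `MessageBatcher`, `partition`, and the batch count of 5 are names and values I made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java illustration only; no Temporal API is used here.
// MessageBatcher and partition are hypothetical names.
class MessageBatcher {

    // Splits messages into at most batchCount contiguous batches of near-equal size.
    static <T> List<List<T>> partition(List<T> messages, int batchCount) {
        if (batchCount < 1) {
            throw new IllegalArgumentException("batchCount must be at least 1");
        }
        List<List<T>> batches = new ArrayList<>();
        int total = messages.size();
        int base = total / batchCount;      // minimum size of each batch
        int remainder = total % batchCount; // the first `remainder` batches get one extra
        int start = 0;
        for (int i = 0; i < batchCount && start < total; i++) {
            int size = base + (i < remainder ? 1 : 0);
            batches.add(new ArrayList<>(messages.subList(start, start + size)));
            start += size;
        }
        return batches;
    }

    public static void main(String[] args) {
        List<Integer> messages = new ArrayList<>();
        for (int i = 1; i <= 50; i++) {
            messages.add(i);
        }
        // Assumed batch count of 5; each batch would be the input of one child workflow.
        List<List<Integer>> batches = partition(messages, 5);
        for (List<Integer> batch : batches) {
            System.out.println(batch.size() + " messages, first = " + batch.get(0));
        }
    }
}
```

With 50 messages and 5 batches, a failure in one batch would mean re-processing at most 10 messages instead of up to 49.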
I think I may be overly concerned about activity failures causing the workflow to retry. My understanding is that an intermittent downstream issue should never cause this type of failure, because the activity itself is retried until the dependency recovers. That means a failure that reaches the workflow would most likely be caused by buggy code. In the buggy code case, the solution would be to deploy a new version of the Activity code with the bug fixed, right? Assuming that this is done within the allowed retry period, the workflow wouldn’t have to retry, which means I don’t have to put much effort into planning around this type of failure.
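To get a feel for how long “the allowed retry period” might be, here is a small back-of-the-envelope calculation in plain Java. It assumes exponential backoff with a capped interval. The values (1 second initial interval, coefficient 2.0, 100 second cap, 10 attempts) are placeholders, not my actual configuration, and `RetryWindow` and `totalBackoff` are hypothetical names.

```java
import java.time.Duration;

// Plain-Java arithmetic only; no Temporal API is used here.
// RetryWindow and totalBackoff are hypothetical names.
class RetryWindow {

    // Sums the waits between attempts, i.e. the time elapsed in backoff
    // before attempt number `attempts` starts (activity run time excluded).
    static Duration totalBackoff(Duration initial, double coefficient,
                                 Duration maxInterval, int attempts) {
        long totalMillis = 0;
        double intervalMillis = initial.toMillis();
        for (int retry = 1; retry < attempts; retry++) {
            // Each wait grows by the coefficient but never exceeds the cap.
            totalMillis += (long) Math.min(intervalMillis, maxInterval.toMillis());
            intervalMillis *= coefficient;
        }
        return Duration.ofMillis(totalMillis);
    }

    public static void main(String[] args) {
        // Assumed placeholder settings, not taken from a real retry policy.
        Duration window = totalBackoff(
                Duration.ofSeconds(1), 2.0, Duration.ofSeconds(100), 10);
        System.out.println("Backoff before attempt 10: " + window.getSeconds() + "s");
    }
}
```

Under those assumed settings the waits are 1, 2, 4, 8, 16, 32, 64, 100, and 100 seconds, so a fix would have to be deployed within roughly five and a half minutes. A longer window would need more attempts or a larger cap.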