I am developing an API service that will be using Temporal. My setup so far is having one workflow per endpoint, with each workflow containing one or more activities depending on the complexity of the business logic in the endpoint.
One such workflow will have an activity that is essentially a mapreduce function - taking a massive dataset, splitting it into n segments (potentially thousands), running a function on each segment in parallel, and then recombining the results at the end. This will obviously require spawning many activity executions at once.
I wanted to see what the best worker setup is for this architecture.
Currently, I just have a main process that starts up the HTTP server and creates one worker entity - the one that will have all workflows registered to it. It’s my understanding that one worker entity is sufficient for even a large number of workflows as they are lightweight.
Now for the activities. Would it make sense for each workflow to have its own worker entity for its group of activities? And as for the parallel job - can a single worker entity support a large number of concurrent activity executions? Or would it make sense to batch the n executions into x ephemeral worker entities/have one activity per worker entity/etc.? Not sure if it’s good practice to be dynamically spinning up/tearing down worker entities in this manner.
> One such workflow will have an activity that is essentially a mapreduce function
To my knowledge, the server team is working on improved batch execution support, which will be added in a future release. For best practices and ideas on implementing this today, see the forum posts here (look for the iterator pattern) and here (doing the batching inside a long-running activity).
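As a rough illustration of the batching idea, here is a minimal sketch in plain Python asyncio. It does not use the Temporal SDK: `process_segment`, `split`, and `map_reduce` are hypothetical names, and `process_segment` only stands in for one activity execution. The point is that the fan-out happens one bounded batch at a time rather than as thousands of executions at once.

```python
import asyncio
from typing import List, Sequence


async def process_segment(segment: Sequence[int]) -> int:
    """Hypothetical stand-in for one activity execution (the "map" step)."""
    await asyncio.sleep(0)  # yield control, as a real remote call would
    return sum(segment)


def split(dataset: Sequence[int], segment_size: int) -> List[Sequence[int]]:
    """Split the dataset into consecutive segments of at most segment_size items."""
    return [dataset[i:i + segment_size] for i in range(0, len(dataset), segment_size)]


async def map_reduce(dataset: Sequence[int], segment_size: int, batch_size: int) -> int:
    """Process segments batch by batch so at most batch_size run in parallel."""
    segments = split(dataset, segment_size)
    total = 0
    for start in range(0, len(segments), batch_size):
        batch = segments[start:start + batch_size]
        # Fan out one batch in parallel and wait for all of it to finish.
        partials = await asyncio.gather(*(process_segment(s) for s in batch))
        # Fold the batch into the running result (the "reduce" step).
        total += sum(partials)
        # Assumption: in a workflow following the iterator pattern, this is
        # roughly where it would continue-as-new, carrying the next offset
        # and the running total forward.
    return total


if __name__ == "__main__":
    print(asyncio.run(map_reduce(list(range(1000)), segment_size=10, batch_size=25)))
```

The batch size is the knob to tune here: it caps how many executions are in flight at any moment, independent of how many segments the dataset produces.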
> Would it make sense for each workflow to have its own worker entity for its group of activities?
Workflow executions do not depend on a specific worker. Having a worker per workflow type could complicate deployment of your worker processes, since you could end up with a large number of them. For activities, start from your rate limiting requirements: if an activity needs its own rate limit, run it on a dedicated task queue; otherwise it can share a task queue with your other activities (see forum post here for more info).
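One way to picture that split is the small planning helper below. It is only a sketch in plain Python, not Temporal SDK code: `assign_task_queues`, the queue names, and the example activity names are all hypothetical. Activities with a specific rate limit each get their own task queue, and everything else shares a default one.

```python
from typing import Dict, List, Optional

DEFAULT_QUEUE = "default-activities"  # hypothetical shared task queue name


def assign_task_queues(rate_limits: Dict[str, Optional[float]]) -> Dict[str, List[str]]:
    """Group activity names by task queue.

    rate_limits maps an activity name to its required maximum executions per
    second, or None when the activity has no specific limit.
    """
    queues: Dict[str, List[str]] = {DEFAULT_QUEUE: []}
    for activity, limit in rate_limits.items():
        if limit is None:
            # No special requirement: share the default queue.
            queues[DEFAULT_QUEUE].append(activity)
        else:
            # Needs its own limit: give it a dedicated queue.
            queues[f"{activity}-limited"] = [activity]
    return queues


if __name__ == "__main__":
    plan = assign_task_queues({"send_email": 5.0, "resize_image": None, "load_rows": None})
    for queue, activities in plan.items():
        print(queue, activities)
```

Each resulting queue would then be polled by a worker, with the rate limit itself configured on the worker for that queue, so the number of workers follows the number of distinct rate limits rather than the number of workflows.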