A few points are still unclear regarding the overall architecture:
We can have multiple workflows and activities listening to the same task queue. So what benefit do we gain by splitting these workflows into independent task queues and workers? Is it purely based on the scaling needs of individual workflows?
If we have a task queue for each workflow or activity, maintaining that many workers will become complex as we scale. We were therefore thinking of running multiple worker threads within the same process. The trade-off is that this introduces tight coupling when scaling horizontally. Is this considered an anti-pattern?
When should we split different activities into individual task queues? One use case that made sense to us was throttling activities globally when a downstream system is down, and pausing a workflow. Are there any other valid use cases apart from this?
Will moving an activity to a different task queue require workflow versioning?
We can have multiple workflows and activities listening to the same task queue. So what benefit do we gain by splitting these workflows into independent task queues and workers? Is it purely based on the scaling needs of individual workflows?
Think of task queues as service endpoints and of workers as pools of threads that process requests to those endpoints. If you want a single process with a single thread pool to handle your workflows and activities, there is no need for more than one task queue. Task queues are scalable, so there is no need to add queues for scalability reasons.
If you want to scale some activities independently, or you want a different team to own some workflows and activities, use a separate task queue for them.
If we have a task queue for each workflow or activity, maintaining that many workers will become complex as we scale. We were therefore thinking of running multiple worker threads within the same process. The trade-off is that this introduces tight coupling when scaling horizontally. Is this considered an anti-pattern?
Yes, this is an anti-pattern, because having many thread pools in a single process makes optimizing resource utilization very hard.
When should we split different activities into individual task queues? One use case that made sense to us was throttling activities globally when a downstream system is down, and pausing a workflow. Are there any other valid use cases apart from this?
Besides moving activities into a separate pool of processes, rate limiting is a valid use case. Temporal supports both per-worker throttling and global throttling, where the limit applies to the entire task queue regardless of the number of workers.
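As a rough illustration of the two throttling levels, here is a minimal Go SDK sketch of a worker dedicated to one activity task queue. The task queue name "payments-activities", the ChargeCard activity, and the numeric limits are made up for the example; WorkerActivitiesPerSecond and TaskQueueActivitiesPerSecond are the worker.Options fields I would expect to use for the per-worker and task-queue-wide limits.

```go
package main

import (
	"context"
	"log"

	"go.temporal.io/sdk/client"
	"go.temporal.io/sdk/worker"
)

// ChargeCard is a hypothetical activity that calls a downstream system.
func ChargeCard(ctx context.Context, orderID string) error {
	return nil
}

func main() {
	// With empty options the client targets a local server and the default namespace.
	c, err := client.Dial(client.Options{})
	if err != nil {
		log.Fatalln("unable to create Temporal client:", err)
	}
	defer c.Close()

	// "payments-activities" is an assumed name for a dedicated activity task queue.
	w := worker.New(c, "payments-activities", worker.Options{
		// Limit for this worker process only.
		WorkerActivitiesPerSecond: 50,
		// Limit for the whole task queue, however many workers poll it.
		TaskQueueActivitiesPerSecond: 200,
	})
	w.RegisterActivity(ChargeCard)

	if err := w.Run(worker.InterruptCh()); err != nil {
		log.Fatalln("worker exited:", err)
	}
}
```

Lowering the task-queue-wide value is one way to slow all callers of a struggling downstream system without redeploying workflow code.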
Will moving an activity to a different task queue require workflow versioning?
No. ActivityOptions are not checked during replay. So the task queue can be changed without the need for versioning.
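To make that concrete, here is a small Go SDK sketch of where the task queue is chosen for an activity. OrderWorkflow, ChargeCard, and the queue name "payments-activities" are hypothetical names for illustration. Going by the answer above, changing the TaskQueue value here should not by itself cause a non-determinism error for executions that are already running.

```go
package app

import (
	"context"
	"time"

	"go.temporal.io/sdk/workflow"
)

// ChargeCard is a hypothetical activity used only for this illustration.
func ChargeCard(ctx context.Context, orderID string) error {
	return nil
}

// OrderWorkflow is a hypothetical workflow that routes one activity to a
// task queue other than the one the workflow itself runs on.
func OrderWorkflow(ctx workflow.Context, orderID string) error {
	ao := workflow.ActivityOptions{
		// Assumed queue name; leaving it empty would use the workflow's own task queue.
		TaskQueue:           "payments-activities",
		StartToCloseTimeout: time.Minute,
	}
	ctx = workflow.WithActivityOptions(ctx, ao)

	// The activity task is dispatched to workers polling "payments-activities".
	return workflow.ExecuteActivity(ctx, ChargeCard, orderID).Get(ctx, nil)
}
```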
Pardon me if I am wrong, @maxim, but I was going through the latest documentation. Doesn't the example mentioned here contradict this statement?
For example, using an SDK’s “Execute Activity” API generates the ScheduleActivityTask Command. When this API is called upon re-execution, that Command is compared with the Event that is in the same location within the sequence. The Event in the sequence must be an ActivityTaskScheduled Event, where the Activity Name and the Task Queue name are the same as what is in the Command.
If a generated Command doesn’t match what it needs to in the existing Event History, then the Workflow Execution returns a non-deterministic error.
Thanks for catching the discrepancy. The documentation should be fixed. Here is the Go SDK code that performs the comparison. Note that the strict argument is hardcoded to false.
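For readers without the SDK source at hand, the effect of a non-strict comparison can be modelled in a few lines of plain Go. This is only a simplified sketch and not the SDK implementation: the struct types, the commandMatchesEvent function, and the exact set of fields compared are assumptions made for illustration. The only detail taken from this thread is that a strict flag exists and is passed as false.

```go
package main

import "fmt"

// scheduleActivityCommand models the command produced on replay (hypothetical type).
type scheduleActivityCommand struct {
	ActivityID   string
	ActivityType string
	TaskQueue    string
}

// activityScheduledEvent models the event recorded in history (hypothetical type).
type activityScheduledEvent struct {
	ActivityID   string
	ActivityType string
	TaskQueue    string
}

// commandMatchesEvent always compares identity fields, and compares the
// task queue only when strict is true.
func commandMatchesEvent(cmd scheduleActivityCommand, ev activityScheduledEvent, strict bool) bool {
	if cmd.ActivityID != ev.ActivityID || cmd.ActivityType != ev.ActivityType {
		return false
	}
	if strict && cmd.TaskQueue != ev.TaskQueue {
		return false
	}
	return true
}

func main() {
	recorded := activityScheduledEvent{ActivityID: "5", ActivityType: "ChargeCard", TaskQueue: "default"}
	replayed := scheduleActivityCommand{ActivityID: "5", ActivityType: "ChargeCard", TaskQueue: "payments-activities"}

	fmt.Println(commandMatchesEvent(replayed, recorded, false)) // true: task queue is ignored
	fmt.Println(commandMatchesEvent(replayed, recorded, true))  // false: task queue differs
}
```

Under this model, a replay that schedules the same activity on a different task queue still matches the recorded event as long as the strict flag stays false.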