Is it possible or sane to have one worker handle one (or at least a few) workflows at a time?

I’m using the Python SDK, and I’m not sure whether this makes sense, but I’m curious whether there’s a way to have Temporal workers pick up and handle a single workflow from start to finish before taking on another, or some variation of this.

The reason I ask is that we create one workflow per “job”. In our case, a job is about 5 activities that run sequentially, one after the other, to process a call recording file and ultimately insert a bunch of extra data into a database. These jobs can take anywhere from 1 to 20 minutes depending on size and complexity. We’re currently trying to process the past few months’ worth of data, so we load, say, 50k jobs, each as its own workflow. When the workers start picking up activities, they seem to pick up the first activity semi-randomly across all of the workflows, then work through each subsequent activity across the entire set of workflows.

As a result, it takes a very long time before any single workflow completes, while many jobs sit at 30-60% done at the same time. I think I understand why this is happening, but I’m curious whether other strategies would help avoid this outcome. The only strategy I’ve come up with so far is to deliberately put in only 5,000 jobs at a time and wait for them to finish before queueing the rest, but maybe there’s a better way?
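For reference, the batching workaround looks roughly like the sketch below. It is plain asyncio, not our real code: `run_job` is a hypothetical stand-in for starting one workflow and awaiting its result, and `run_in_batches` and `batch_size` are names made up for illustration.

```python
import asyncio


async def run_job(job_id: int) -> int:
    # Hypothetical stand-in for starting one workflow and awaiting its result.
    await asyncio.sleep(0)
    return job_id


async def run_in_batches(job_ids: list[int], batch_size: int) -> list[int]:
    completed: list[int] = []
    for start in range(0, len(job_ids), batch_size):
        batch = job_ids[start:start + batch_size]
        # Wait for the whole batch to finish before submitting the next one.
        completed.extend(await asyncio.gather(*(run_job(j) for j in batch)))
    return completed
```

The downside is that each batch waits on its slowest job before the next batch starts.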

I don’t know the answer to your question, but, conceptually at least, I don’t think it’s generally the best approach to start workflows and then not have the workflow workers run them. If you’re resource constrained and want to limit the number of jobs that you’re running at the same time, consider making that limit part of your workflow logic.

For example, you could have a coordinator workflow that keeps track of how many jobs are running simultaneously. When a job workflow starts, the first thing it does is signal the coordinator workflow and then wait for a response. The coordinator signals back immediately if fewer than 5,000 jobs are currently running; otherwise it queues the request. When a job workflow finishes, it signals the coordinator again, and if the coordinator has any requests queued, it signals one waiting workflow to proceed.