Maintaining stickiness at an activity execution level

I have the following use case.
In one of the activities of my workflow, I run an iteration X times. In each iteration I spawn one workflow that has one activity. This activity retrieves data from a SQL DB via an expensive query and does some computation. The activity completes and the workflow is closed.
In the next iteration I again spawn the workflow and activity, fetch the data, do the computation, and so on.

I am looking for a way to maintain stickiness of my activity execution on a single pod/worker, so that I can re-use the data fetched from the DB in the first iteration across all subsequent iterations.
There can be N workflows + activities spawned in each iteration.

Use a host-specific task queue to route activities to a particular host. See samples-go/worker-specific-task-queues at main · temporalio/samples-go · GitHub
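
Roughly, the pattern in that sample looks like this (a simplified sketch, not the sample's exact code; the activity names `GetWorkerTaskQueue` and `FetchAndCompute` are placeholders):

```go
package sample

import (
	"time"

	"go.temporal.io/sdk/workflow"
)

// SampleWorkflow shows the routing idea: the first activity runs on the
// shared task queue and reports back the unique task queue of whichever
// worker picked it up; later activities are pinned to that queue.
func SampleWorkflow(ctx workflow.Context, input string) error {
	sharedCtx := workflow.WithActivityOptions(ctx, workflow.ActivityOptions{
		StartToCloseTimeout: time.Minute,
	})

	// Placeholder activity: returns the worker-specific task queue name.
	var workerQueue string
	if err := workflow.ExecuteActivity(sharedCtx, "GetWorkerTaskQueue").Get(sharedCtx, &workerQueue); err != nil {
		return err
	}

	// Subsequent activities are routed to that worker's unique queue, so they
	// run on the same host and can share whatever it keeps in memory/on disk.
	hostCtx := workflow.WithActivityOptions(ctx, workflow.ActivityOptions{
		TaskQueue:           workerQueue,
		StartToCloseTimeout: 10 * time.Minute,
	})
	return workflow.ExecuteActivity(hostCtx, "FetchAndCompute", input).Get(hostCtx, nil)
}
```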

I saw the example. What I understood is: the activity to fetch the file data might be running on any one of the workers. Then a new queue with a unique UUID is created and the workflow is started against that queue, so my current worker becomes the only listener on that queue, which causes all related activities to be spawned on the same worker.

In my case, since I have N workflows + activities to be spawned, doing it all on a single worker will lead to memory issues.

What I have done currently is this: on pod startup, I create a unique queue based on the machine's host name. These queue names are saved in Redis. When running an iteration, I map each workflow to one of the registered queue names, so that the same pod is picked up for a given workflow.
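
For reference, the startup side of this looks roughly like the following (a sketch; the Redis key, queue-name format, and addresses are just illustrative):

```go
// Sketch of worker startup: each pod listens on a queue derived from its
// hostname and registers that queue name in Redis, so the scheduling side
// can map workflows onto known queues.
package main

import (
	"context"
	"log"
	"os"

	"github.com/redis/go-redis/v9"
	"go.temporal.io/sdk/client"
	"go.temporal.io/sdk/worker"
)

func main() {
	hostname, err := os.Hostname()
	if err != nil {
		log.Fatal(err)
	}
	taskQueue := "host-queue-" + hostname

	// Register this pod's queue name so the mapping side can discover it.
	rdb := redis.NewClient(&redis.Options{Addr: "redis:6379"})
	if err := rdb.SAdd(context.Background(), "registered-task-queues", taskQueue).Err(); err != nil {
		log.Fatal(err)
	}

	c, err := client.Dial(client.Options{})
	if err != nil {
		log.Fatal(err)
	}
	defer c.Close()

	w := worker.New(c, taskQueue, worker.Options{})
	// ... register workflows/activities here ...
	if err := w.Run(worker.InterruptCh()); err != nil {
		log.Fatal(err)
	}
}
```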

The problem here is that when a pod goes down, I have to rebalance my mapping of workflows to registered queue names.
I am looking for something simpler that gives the same result.

Could you use a cache service such as Memcached? That way you could cache the result of the expensive query, without the complexity of trying to load balance across sticky activities.
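
For example, the activity could check the cache before running the expensive query, something along these lines (a sketch; the key scheme, client library, and types are assumptions, not your actual code):

```go
// Sketch: an activity that consults Memcached before running the expensive
// SQL query. Key scheme, types and expiration are assumptions.
package sample

import (
	"context"
	"encoding/json"

	"github.com/bradfitz/gomemcache/memcache"
)

// Row is a placeholder for whatever the query returns.
type Row struct {
	ID    string
	Value float64
}

var mc = memcache.New("memcached:11211")

func FetchData(ctx context.Context, inputID string) ([]Row, error) {
	key := "query-result:" + inputID

	// Serve from cache if some worker already ran the query for this input.
	if item, err := mc.Get(key); err == nil {
		var rows []Row
		if err := json.Unmarshal(item.Value, &rows); err == nil {
			return rows, nil
		}
	}

	rows, err := runExpensiveQuery(ctx, inputID) // placeholder for the real query
	if err != nil {
		return nil, err
	}

	// Cache for an hour so later iterations can skip the query.
	if b, err := json.Marshal(rows); err == nil {
		_ = mc.Set(&memcache.Item{Key: key, Value: b, Expiration: 3600})
	}
	return rows, nil
}

func runExpensiveQuery(ctx context.Context, inputID string) ([]Row, error) {
	// Stand-in for the actual expensive SQL query.
	return []Row{}, nil
}
```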

In my case, since I have N workflows + activities to be spawned, doing it all on a single worker will lead to memory issues.

Would you explain your use case in more detail? Do you have multiple partitions of data fetched from the DB? Do you have a sequence of activities that needs to iterate over each partition of data? Do you need to guarantee that only one sequence is executed per worker at a time?

This is my use case.

Activity 2 runs an iteration, say up to 45 times. Each iteration spawns another set of up to 100 workflows, which are executed in parallel. Each workflow is fed some unique input IDs, based on which it fetches data from the DB.

Once the workflows/activities have executed, the outputs of these activities are put in a cache so that they can be used in the next iteration. It's a sort of map-reduce use case.

For each iteration, for each of the 100 workflows, I fetch the data from the DB. The data is too large to be saved in Redis/Memcached. So I am trying to figure out whether I can run the workflow at the jth position on a single machine for all 45 iterations, so that the data kept in memory on that machine for that element can be re-used.
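
To make the shape concrete, one iteration's fan-out looks roughly like this (a simplified sketch of what Activity 2 does; the workflow, queue, and type names are placeholders):

```go
// Sketch of one iteration's fan-out: up to 100 workflows are started in
// parallel and their results collected, then cached for the next iteration.
package sample

import (
	"context"
	"fmt"

	"go.temporal.io/sdk/client"
)

// Result is a placeholder for whatever the per-element computation returns.
type Result struct {
	Value float64
}

func RunIteration(ctx context.Context, c client.Client, iteration int, inputIDs []string) ([]Result, error) {
	runs := make([]client.WorkflowRun, 0, len(inputIDs))
	for _, id := range inputIDs {
		run, err := c.ExecuteWorkflow(ctx, client.StartWorkflowOptions{
			ID:        fmt.Sprintf("compute-%d-%s", iteration, id),
			TaskQueue: "compute-queue",
		}, "ComputeWorkflow", id)
		if err != nil {
			return nil, err
		}
		runs = append(runs, run)
	}

	// Wait for all 100 workflows; their outputs feed the next iteration.
	results := make([]Result, 0, len(runs))
	for _, run := range runs {
		var r Result
		if err := run.Get(ctx, &r); err != nil {
			return nil, err
		}
		results = append(results, r)
	}
	return results, nil
}
```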

So, the main issue with the naive "worker-specific task queue" approach is the inability to keep a slot at a given machine for the whole workflow duration. If you just limit the number of activity slots at the process-specific task queue, then all workflows that use that host will have their activities intermingled.
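
By "limit the number of activity slots" I mean something like this (just an illustration; the queue name is a placeholder and registration is elided):

```go
// Illustration of capping activity slots on a worker-specific queue.
package main

import (
	"log"

	"go.temporal.io/sdk/client"
	"go.temporal.io/sdk/worker"
)

func main() {
	c, err := client.Dial(client.Options{})
	if err != nil {
		log.Fatal(err)
	}
	defer c.Close()

	// One activity slot per process: activities from *any* workflow that
	// routes work to this queue compete for that single slot, so different
	// workflows' activities get intermingled on this worker.
	w := worker.New(c, "host-queue-example", worker.Options{
		MaxConcurrentActivityExecutionSize: 1,
	})

	// Activities/workflows would be registered here as usual.
	if err := w.Run(worker.InterruptCh()); err != nil {
		log.Fatal(err)
	}
}
```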

Is my understanding correct, or am I still missing something?