Maintaining stickiness at an activity execution level

I have the following use case.
In one of the activities of my workflow, I run an iteration X times. In each iteration I spawn one workflow that has one activity. This activity retrieves data from a SQL DB via an expensive query and does some computation. The activity completes and the workflow is closed.
In the next iteration I again spawn the workflow and activity, fetch the data, do the computation, and so on.

I am looking for a way to maintain stickiness of my activity execution on a single pod/worker, so that I can re-use the data fetched from the DB in the first iteration across all subsequent iterations.
There can be N workflows + activities spawned in each iteration.

Use a host-specific task queue to route activities to a particular host. See samples-go/worker-specific-task-queues at main · temporalio/samples-go · GitHub
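
Roughly, the pattern in that sample looks like this (a simplified sketch, not the sample's exact code; the activity names `GetWorkerTaskQueue` and `FetchAndCompute` are placeholders):

```go
package sample

import (
	"time"

	"go.temporal.io/sdk/workflow"
)

// SampleWorkflow shows the routing idea: the first activity runs on the
// shared task queue and reports back the unique task queue of whichever
// worker picked it up; later activities are pinned to that queue.
func SampleWorkflow(ctx workflow.Context, input string) error {
	sharedCtx := workflow.WithActivityOptions(ctx, workflow.ActivityOptions{
		StartToCloseTimeout: time.Minute,
	})

	// Placeholder activity: returns the worker-specific task queue name.
	var workerQueue string
	if err := workflow.ExecuteActivity(sharedCtx, "GetWorkerTaskQueue").Get(sharedCtx, &workerQueue); err != nil {
		return err
	}

	// Subsequent activities are routed to that worker's unique queue, so they
	// run on the same host and can share whatever it keeps in memory/on disk.
	hostCtx := workflow.WithActivityOptions(ctx, workflow.ActivityOptions{
		TaskQueue:           workerQueue,
		StartToCloseTimeout: 10 * time.Minute,
	})
	return workflow.ExecuteActivity(hostCtx, "FetchAndCompute", input).Get(hostCtx, nil)
}
```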

I saw the example. What I understood is: the activity to fetch the file data might be running on any one of the workers. Then a new queue with a unique UUID is created and the workflow is started against that queue, so my current worker becomes the only listener on that queue, which causes all related activities to be spawned on the same worker.

In my case, since I have N workflows + activities to be spawned, doing it all on a single worker will lead to memory issues.

What I have done currently is this: on pod startup, I create a unique queue based on the machine's host name. These queue names are saved in Redis. When running an iteration, I map each workflow to one of the registered queue names, so that the same pod is picked up for a given workflow.
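
For reference, the startup side of this looks roughly like the following (a sketch; the Redis key, queue-name format, and addresses are just illustrative):

```go
// Sketch of worker startup: each pod listens on a queue derived from its
// hostname and registers that queue name in Redis, so the scheduling side
// can map workflows onto known queues.
package main

import (
	"context"
	"log"
	"os"

	"github.com/redis/go-redis/v9"
	"go.temporal.io/sdk/client"
	"go.temporal.io/sdk/worker"
)

func main() {
	hostname, err := os.Hostname()
	if err != nil {
		log.Fatal(err)
	}
	taskQueue := "host-queue-" + hostname

	// Register this pod's queue name so the mapping side can discover it.
	rdb := redis.NewClient(&redis.Options{Addr: "redis:6379"})
	if err := rdb.SAdd(context.Background(), "registered-task-queues", taskQueue).Err(); err != nil {
		log.Fatal(err)
	}

	c, err := client.Dial(client.Options{})
	if err != nil {
		log.Fatal(err)
	}
	defer c.Close()

	w := worker.New(c, taskQueue, worker.Options{})
	// ... register workflows/activities here ...
	if err := w.Run(worker.InterruptCh()); err != nil {
		log.Fatal(err)
	}
}
```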

The problem here is that when a pod goes down, I have to rebalance my mapping of workflows to registered queue names.
I am looking for something simpler that gives the same result.

Could you use a cache service such as Memcached? That way you could cache the result of the expensive query, without the complexity of trying to load balance across sticky activities.
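
For example, the activity could check the cache before running the expensive query, something along these lines (a sketch; the key scheme, client library, and types are assumptions, not your actual code):

```go
// Sketch: an activity that consults Memcached before running the expensive
// SQL query. Key scheme, types and expiration are assumptions.
package sample

import (
	"context"
	"encoding/json"

	"github.com/bradfitz/gomemcache/memcache"
)

// Row is a placeholder for whatever the query returns.
type Row struct {
	ID    string
	Value float64
}

var mc = memcache.New("memcached:11211")

func FetchData(ctx context.Context, inputID string) ([]Row, error) {
	key := "query-result:" + inputID

	// Serve from cache if some worker already ran the query for this input.
	if item, err := mc.Get(key); err == nil {
		var rows []Row
		if err := json.Unmarshal(item.Value, &rows); err == nil {
			return rows, nil
		}
	}

	rows, err := runExpensiveQuery(ctx, inputID) // placeholder for the real query
	if err != nil {
		return nil, err
	}

	// Cache for an hour so later iterations can skip the query.
	if b, err := json.Marshal(rows); err == nil {
		_ = mc.Set(&memcache.Item{Key: key, Value: b, Expiration: 3600})
	}
	return rows, nil
}

func runExpensiveQuery(ctx context.Context, inputID string) ([]Row, error) {
	// Stand-in for the actual expensive SQL query.
	return []Row{}, nil
}
```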

In my case, since I have N workflows + activities to be spawned, doing it all on a single worker will lead to memory issues.

Would you explain your use case in more detail? Do you have multiple partitions of data fetched from the DB? Do you have a sequence of activities that needs to iterate over each partition of data? Do you need to guarantee that only one sequence is executed per worker at a time?

This is my use case.

Activity 2 runs an iteration, say up to 45 times. Each iteration spawns another set of up to 100 workflows, which are executed in parallel. Each workflow is fed some unique input IDs, based on which it fetches data from the DB.

Once the workflows/activities have executed, the outputs of these activities are put in a cache so that they can be used in the next iteration. It's a sort of map-reduce use case.

For each iteration, for each of the 100 workflows, I fetch the data from the DB. The data is too large to be saved in Redis/Memcached. So I am trying to figure out whether I can run the workflow at the jth position on a single machine for all 45 iterations, so that the data kept in memory on that machine for that element can be re-used.
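
To make the shape concrete, one iteration's fan-out looks roughly like this (a simplified sketch of what Activity 2 does; the workflow, queue, and type names are placeholders):

```go
// Sketch of one iteration's fan-out: up to 100 workflows are started in
// parallel and their results collected, then cached for the next iteration.
package sample

import (
	"context"
	"fmt"

	"go.temporal.io/sdk/client"
)

// Result is a placeholder for whatever the per-element computation returns.
type Result struct {
	Value float64
}

func RunIteration(ctx context.Context, c client.Client, iteration int, inputIDs []string) ([]Result, error) {
	runs := make([]client.WorkflowRun, 0, len(inputIDs))
	for _, id := range inputIDs {
		run, err := c.ExecuteWorkflow(ctx, client.StartWorkflowOptions{
			ID:        fmt.Sprintf("compute-%d-%s", iteration, id),
			TaskQueue: "compute-queue",
		}, "ComputeWorkflow", id)
		if err != nil {
			return nil, err
		}
		runs = append(runs, run)
	}

	// Wait for all 100 workflows; their outputs feed the next iteration.
	results := make([]Result, 0, len(runs))
	for _, run := range runs {
		var r Result
		if err := run.Get(ctx, &r); err != nil {
			return nil, err
		}
		results = append(results, r)
	}
	return results, nil
}
```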

So, the main issue with the naive "worker-specific task queue" approach is the inability to keep a slot at a given machine for the whole workflow duration. If you just limit the number of activity slots at the process-specific task queue, then all workflows that use that host will have their activities intermingled.
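
By "limit the number of activity slots" I mean something like this (just an illustration; the queue name is a placeholder and registration is elided):

```go
// Illustration of capping activity slots on a worker-specific queue.
package main

import (
	"log"

	"go.temporal.io/sdk/client"
	"go.temporal.io/sdk/worker"
)

func main() {
	c, err := client.Dial(client.Options{})
	if err != nil {
		log.Fatal(err)
	}
	defer c.Close()

	// One activity slot per process: activities from *any* workflow that
	// routes work to this queue compete for that single slot, so different
	// workflows' activities get intermingled on this worker.
	w := worker.New(c, "host-queue-example", worker.Options{
		MaxConcurrentActivityExecutionSize: 1,
	})

	// Activities/workflows would be registered here as usual.
	if err := w.Run(worker.InterruptCh()); err != nil {
		log.Fatal(err)
	}
}
```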

Is my understanding correct, or am I still missing something?