Temporal for computationally intense burst workloads

Hi! I just found out about Temporal and have been pretty blown away. It will likely cause a paradigm shift in development over the coming years. I want to get its benefits at my current company, but I'm not sure Temporal is a good fit. We run highly GPU-intensive video transcoding and 3D modeling jobs at a customer's request (automated and triggered through a web API). Currently we are heavily invested in AWS Step Functions, AWS Lambda, and AWS Batch, and in serverless in general, to keep our costs down while being able to scale horizontally without limit. Provisioning many GPU workers for Temporal likely won't work for us and would likely be wasteful, given we don't need the compute running 24/7. I'm going to assume that for our use case Temporal isn't a good fit. Am I wrong? We have many steps in our Step Functions definitions, and managing the YAML is becoming too much for us, so Temporal would be a great solution. Are there other customers with more bursty, serverless workloads similar to ours using Temporal?

I think Temporal is a perfect fit for your use case.

Provisioning many GPU workers for Temporal likely won't work for us and would likely be wasteful, given we don't need the compute running 24/7.

Temporal supports routing activity tasks to specific pools of workers, or even to individual workers, via dedicated task queues. The way I would solve this problem is by having activities that provision the workers as part of your workflow definitions. Another option is to have a separate workflow that implements autoscaling of GPU workers.
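To make the routing idea concrete, here is a stdlib-only Python sketch of the pattern, not actual Temporal SDK code: every pool gets its own task queue, a small always-on worker runs the provisioning step, and the heavy activity is enqueued on the pool's dedicated queue. All names (`provision_gpu_pool`, `schedule_activity`, the queue names) are hypothetical.

```python
import queue

# Hypothetical in-memory stand-ins for Temporal task queues, one per worker pool.
task_queues: dict[str, queue.Queue] = {
    "control-plane": queue.Queue(),  # served by small, always-on workers
}

def provision_gpu_pool(pool: str) -> str:
    """Activity run on the control plane: start GPU instances (e.g. via your
    cloud API) that each host a worker polling the pool's dedicated queue."""
    task_queues.setdefault(pool, queue.Queue())
    # here you would call out to AWS to launch instances for `pool`
    return pool

def schedule_activity(task_queue: str, activity: str, payload: dict) -> None:
    """Route an activity task to a specific pool by choosing its queue."""
    task_queues[task_queue].put((activity, payload))

# Workflow logic, greatly simplified: provision first, then route the GPU task
# to the queue that only the freshly provisioned workers consume.
pool = provision_gpu_pool("gpu-transcode")
schedule_activity(pool, "transcode_video", {"job_id": "job-123"})
```

In the real SDK the routing decision is just the task-queue name you pass when scheduling the activity; the rest (polling, retries, timeouts) is handled by the service and the worker.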

Thanks for the response. I'm still trying to understand the model. From what I understand, the workers need to be persistent in order to poll the queue. Is the idea that I could use an always-running worker to spin up a GPU worker and add the work to a queue for it to pick up?

Correct. You still need some always-running workers to host the workflow code and the activities that control other worker processes. But I doubt these would be expensive to run.
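If you instead go the autoscaler-workflow route mentioned earlier, the core scaling decision is small enough to sketch. This is an illustrative stdlib-only function (the function name, throughput figure, and caps are all made up), not anything from the Temporal API:

```python
import math

def desired_gpu_workers(backlog: int, jobs_per_worker: int, max_workers: int) -> int:
    """How many GPU workers an autoscaling workflow would request, given the
    current task-queue backlog. Scales all the way to zero when idle, which is
    what keeps a bursty workload cheap."""
    if backlog <= 0:
        return 0
    return min(max_workers, math.ceil(backlog / jobs_per_worker))
```

An autoscaling workflow would periodically read the backlog, call something like this, and run provisioning or teardown activities to converge on the result.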

Got it. And just to be clear, the newly spun-up GPU instances can have Temporal code on them and will be tracked?

You would run so-called Temporal worker processes on these instances. A worker process is essentially a queue consumer: it listens on a task queue hosted inside the Temporal service using long-poll gRPC requests. So Temporal doesn't track those instances directly; they ask for work instead.
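Conceptually, a worker's polling loop is just the following (a stdlib Python sketch with invented names; the real SDK does this over gRPC with much longer poll timeouts, sticky queues, retries, and so on):

```python
import queue
import threading

def worker_loop(task_queue: queue.Queue, handlers: dict, stop: threading.Event) -> list:
    """Long-poll the queue: block up to a timeout waiting for a task, run its
    handler, and poll again. The server never contacts the worker; the worker
    always asks for work, which is why nothing needs to 'track' it."""
    results = []
    while not stop.is_set():
        try:
            name, payload = task_queue.get(timeout=0.1)  # the "long poll"
        except queue.Empty:
            continue  # an empty poll is normal; just poll again
        results.append(handlers[name](payload))
        task_queue.task_done()
    return results

# Demo: enqueue one task, let a worker thread consume it, then shut down.
q: queue.Queue = queue.Queue()
q.put(("transcode", {"job_id": "job-123"}))
stop = threading.Event()
handlers = {"transcode": lambda p: f"done:{p['job_id']}"}

out: list = []
t = threading.Thread(target=lambda: out.extend(worker_loop(q, handlers, stop)))
t.start()
q.join()    # wait until the task has been processed
stop.set()  # tell the worker to exit its poll loop
t.join()
```

This is also why scale-to-zero works: when no worker is polling a queue, tasks simply wait there (subject to their timeouts) until one comes up.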