Hi,
We have a use case where one activity is a long-running GPU-based ML process. We are running Temporal inside a K8s cluster, and the idea is to wrap the worker executing this activity as a K8s Job and use a KEDA ScaledJob for autoscaling. The worker should therefore pick up just one task, process it (potentially for a long time), and then terminate.
How can we achieve this (or something similar) in Temporal (Python SDK)?
Right now, Temporal workers are meant to be always running; scale-to-zero does not work well with Temporal workers at this time. There is no good, reliable way today to know there is a backlog of pending activities across a namespace.
When you schedule the activity from the workflow, you can also orchestrate whatever is needed to start a worker to handle that activity. Most users run an activity on the workflow task queue (not the resource-constrained task queue) that does whatever is needed to start a worker on the resource-constrained task queue, and then schedule the real activity there.
I’m also looking into something like this.
We are evaluating Temporal for running ML pipelines, including agents. We want researchers to be able to launch a pipeline/workflow from their local computer. What I have been struggling with is that the code and dependencies can differ for each workflow, and Temporal doesn’t really have a solution for that. That's where Docker comes into play, since it solves that problem: the local CLI would package the local code and dependencies into a Docker image, and Temporal would run that.
We could have Temporal launch K8s Jobs, but I wonder if there are any gotchas, or better built-in solutions for my use case?
Cheers, Philippe