Difficulty throttling worker, overwhelming up/downstream components

Hi all,

I have a workflow that fans out several subsequent, non-child workflows as it runs. These workflows can number in the tens to hundreds of thousands, depending on the input. The same Spring Boot service contains workers to handle both types.

I am trying to throttle the fanned-out workflows so that they do not all fire immediately, but my worker tries to start them all as fast as possible.

I am running into two issues as a result:

  • I am overwhelming the Temporal server services as they are currently configured and saturating the backing database.
  • I am saturating the CPU available to my K8s node group and causing issues for other services.

I do not have control over the Temporal server setup or the K8s cluster, so I need to throttle my worker, but none of the options below behaves as I expected. I’ve tried them in combinations as well as all at once, yet my worker still fires workflows like crazy. How can I make my worker execute only a few of the fanned-out workflows at a time?

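WorkerOptions.newBuilder()
    .setMaxConcurrentActivityExecutionSize(1)     // activities running at once on this worker
    .setMaxWorkerActivitiesPerSecond(1)           // activity starts per second on this worker
    .setMaxConcurrentWorkflowTaskExecutionSize(1) // workflow tasks (not workflows) running at once
    .setMaxTaskQueueActivitiesPerSecond(3)        // supposed to affect all workers
    .build();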

Limiting parallelism for executing workflows is not currently supported out of the box. I don’t understand your requirements, but such a request frequently arises in the context of batch jobs. See the batch examples from the Java SDK.
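
To illustrate the batching idea in general terms, here is a minimal plain-Java sketch that starts at most a fixed number of tasks at a time and waits for each batch to finish before starting the next. It is not taken from the SDK samples. `BatchedFanOut`, `runInBatches`, and the `start` callback are hypothetical names, and `CompletableFuture` is used only to keep the example self-contained. Inside real workflow code the same loop would have to be written with the SDK's deterministic primitives (for example `Async.function` and `Promise.allOf`) rather than `java.util.concurrent`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;

// Hypothetical helper for illustration only; not part of the Temporal SDK.
class BatchedFanOut {

    /**
     * Starts one task per input, at most batchSize at a time, and waits for
     * the whole batch to complete before starting the next batch.
     * Results are returned in input order.
     */
    static <I, O> List<O> runInBatches(List<I> inputs, int batchSize,
                                       Function<I, CompletableFuture<O>> start) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be >= 1");
        }
        List<O> results = new ArrayList<>(inputs.size());
        for (int from = 0; from < inputs.size(); from += batchSize) {
            int to = Math.min(from + batchSize, inputs.size());
            // Start every task in the current batch.
            List<CompletableFuture<O>> inFlight = new ArrayList<>();
            for (I input : inputs.subList(from, to)) {
                inFlight.add(start.apply(input));
            }
            // Block until the batch is done before moving on.
            for (CompletableFuture<O> future : inFlight) {
                results.add(future.join());
            }
        }
        return results;
    }

    public static void main(String[] args) {
        List<Integer> inputs = List.of(1, 2, 3, 4, 5, 6, 7);
        List<Integer> squares = runInBatches(inputs, 3,
                n -> CompletableFuture.supplyAsync(() -> n * n));
        System.out.println(squares);
    }
}
```

A fixed batch is the simplest variant: the next batch cannot begin until the slowest task in the current one finishes. A sliding window, which starts a new task as soon as any running one completes, keeps throughput steadier at the cost of more bookkeeping.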

Thanks for your response - I may have omitted an important detail. The fanned-out workflows depend on data from another service. I set up the call to that service as a child workflow and limit the number of concurrent activities on the associated worker.

So, the first thing each fanned out workflow does is kick off a child workflow. Those child workflows get created rather fast, which is what leads to the Temporal server getting overwhelmed.

What I’m really trying to do is cap the CPU that my service takes up, since it seems to just consume as much as is available. Maybe that’s something I need to tackle at another level, like limiting what’s available to my pod in K8s.

I would rate limit the child workflow creation.
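
One way to picture this is a loop that starts a fixed number of children and then pauses before starting the next group. The following is a minimal plain-Java sketch of that pacing logic only. `PacedStarter`, `startPaced`, and both callbacks are hypothetical names. In workflow code, the `sleep` callback would presumably be `Workflow.sleep(Duration)` and the `start` callback whatever code launches the child workflow; neither is shown here so that the example stays free of SDK dependencies.

```java
import java.time.Duration;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper for illustration only; not part of the Temporal SDK.
class PacedStarter {

    /**
     * Calls start once per input, pausing for the given interval after every
     * startsPerInterval calls. Returns the number of pauses taken.
     */
    static <I> int startPaced(List<I> inputs, int startsPerInterval, Duration interval,
                              Consumer<I> start, Consumer<Duration> sleep) {
        if (startsPerInterval < 1) {
            throw new IllegalArgumentException("startsPerInterval must be >= 1");
        }
        int pauses = 0;
        for (int i = 0; i < inputs.size(); i++) {
            // Pause before each new group of starts, except the first group.
            if (i > 0 && i % startsPerInterval == 0) {
                sleep.accept(interval);
                pauses++;
            }
            start.accept(inputs.get(i));
        }
        return pauses;
    }

    public static void main(String[] args) {
        List<String> inputs = List.of("a", "b", "c", "d", "e");
        startPaced(inputs, 2, Duration.ofSeconds(1),
                input -> System.out.println("start " + input),
                pause -> System.out.println("sleep " + pause));
    }
}
```

Note that this caps how fast children are created, not how many are running at once. If the children are long-lived, a cap on the number in flight may be needed as well.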