Using signals to build a WorkflowQueue with high throughput

Hi, Temporal community-

I have a “best practices” question about the signals, WorkflowQueue, and continue-as-new pattern that I see suggested often in threads here.

I have a long-running workflow that polls a DB for messages, parses them, and then routes them to external services in batches. I have potentially thousands of instances of this workflow at any given time, each with a deterministic, stable workflow ID of tenant ID + downstream service name. My workflow uses the same continue-as-new pattern as the thread “Workflow.await(), does it release thread?” (reply #2 by maxim): it limits how many messages a single run processes, and then passes any remaining queue items to the next run as a workflow argument. It differs from that example in that it processes all messages in the WorkflowQueue at once as a batch, rather than taking an action per message.
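To make the pattern above concrete, here is a minimal plain-Java sketch of one run: it drains everything queued into a single batch per iteration, stops batching after a fixed number of batches, and returns whatever is left as the carry-over that would be passed to continue-as-new. It deliberately uses no Temporal SDK calls; `BatchingRunSketch`, `runOnce`, `maxBatches`, and the `arrivals` list (messages arriving before each iteration) are hypothetical names for illustration, and an `ArrayDeque` stands in for the WorkflowQueue.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Plain-Java stand-in for the pattern; no Temporal SDK calls are used here.
final class BatchingRunSketch {

    /** Outcome of one simulated run. */
    static final class RunResult {
        final List<List<String>> batches;   // batches routed downstream in this run
        final List<String> carryOver;       // items that would be the continue-as-new argument

        RunResult(List<List<String>> batches, List<String> carryOver) {
            this.batches = batches;
            this.carryOver = carryOver;
        }
    }

    /**
     * Simulates one run. arrivals.get(i) holds the messages that arrive before
     * iteration i. Each iteration drains the whole queue into one batch until
     * maxBatches batches have been sent; later arrivals stay queued.
     */
    static RunResult runOnce(List<String> carriedIn, List<List<String>> arrivals, int maxBatches) {
        Deque<String> queue = new ArrayDeque<>(carriedIn);
        List<List<String>> batches = new ArrayList<>();
        for (List<String> arrived : arrivals) {
            queue.addAll(arrived);
            if (batches.size() < maxBatches && !queue.isEmpty()) {
                batches.add(new ArrayList<>(queue)); // process everything queued as one batch
                queue.clear();
            }
        }
        return new RunResult(batches, new ArrayList<>(queue));
    }

    public static void main(String[] args) {
        RunResult result = runOnce(
                List.of("a"),
                List.of(List.of("b", "c"), List.of("d"), List.of("e", "f")),
                2);
        System.out.println("batches=" + result.batches + " carryOver=" + result.carryOver);
    }
}
```

In a real workflow the loop would block until the queue is non-empty and the carry-over would be handed to the next run; the sketch only shows how the batch limit and the carry-over relate.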

I’m investigating the feasibility of eliminating the DB and having this workflow receive all of its messages via signals. I’m concerned about the signal threading pattern described in “@SignalMethod threading configuration” (Issue #214 in temporalio/sdk-java on GitHub), and I’m not sure I can throttle the growth of the WorkflowQueue. My understanding is that the Java SDK receives signals in separate threads, and therefore concurrently, unlike the Go SDK with its Go channels. Right now I’m able to limit how many messages are in the WorkflowQueue by controlling when I fetch from the DB and how much I fetch.

The messages that I want to signal to the workflow usually come in bursts, where an estimated peak is ~33k/min, and each message payload is ~1.5KB. With a message payload of that size, I would want to keep an iteration count fairly low before calling continue-as-new.
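For a rough sense of scale, the numbers above can be turned into a per-second rate and a payload volume. The sketch below is back-of-the-envelope arithmetic only; `BurstEstimate` and its methods are hypothetical names, and the 10,000 KB budget is an illustrative figure I picked, not a Temporal limit (check the limits configured for your own cluster).

```java
// Back-of-the-envelope arithmetic for the burst described above.
final class BurstEstimate {

    /** Converts a per-minute signal rate to a per-second rate. */
    static double signalsPerSecond(long signalsPerMinute) {
        return signalsPerMinute / 60.0;
    }

    /** Payload volume in KB per minute, ignoring any per-event overhead. */
    static double payloadKbPerMinute(long signalsPerMinute, double payloadKb) {
        return signalsPerMinute * payloadKb;
    }

    /** How many signals fit into a payload budget before continue-as-new is needed. */
    static long signalsWithinBudget(double budgetKb, double payloadKb) {
        return (long) Math.floor(budgetKb / payloadKb);
    }

    public static void main(String[] args) {
        long peakPerMinute = 33_000;   // estimated peak from the post
        double payloadKb = 1.5;        // approximate payload size from the post
        double budgetKb = 10_000;      // ASSUMPTION: illustrative budget, not a Temporal limit

        System.out.println("signals/sec at peak: " + signalsPerSecond(peakPerMinute));
        System.out.println("payload KB/min at peak: " + payloadKbPerMinute(peakPerMinute, payloadKb));
        System.out.println("signals within budget: " + signalsWithinBudget(budgetKb, payloadKb));
    }
}
```

At the stated peak this works out to 550 signals per second and roughly 49,500 KB of payload per minute, which is why the iteration count per run would have to stay small.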

In order to keep the WorkflowQueue size low, is there a way I can control when and how signals get received? Is there a way to “pause” receiving signals if the queue is too large, until after the workflow is continued-as-new? Are signals the right pattern here, or should I stick with my DB?

Thank you so much!

What is the maximum expected signal rate per workflow execution? Temporal scales with the number of workflow executions, but a single workflow execution is not intended for a high processing rate.

I appreciate your fast response, @maxim. I’m sorry I lost track of this thread.

The estimate was ~33k signals/min previously, and after re-reviewing metrics that still holds true. That is the peak rate, though, and it tends to come at certain times of the day such as noon, or at the beginning of each hour. The sustained rate is <500 signals/min.

But based on my understanding of the Java SDK signal lifecycle, I would have to design around the burst, or else the workflow history would exceed the max size.

How many workflow instances will receive these signals? You don’t want to exceed a few signals per second per workflow.
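One way to frame this question is to divide the total signal rate by the number of workflow instances receiving it. The sketch below assumes signals are spread evenly across instances, which may not hold in practice, and it treats “a few signals per second” as 3 purely for illustration; `SignalFanOut` and its methods are hypothetical names.

```java
// Relates total signal rate, instance count, and per-workflow rate.
final class SignalFanOut {

    /** Per-workflow signals per second, ASSUMING an even spread across instances. */
    static double perWorkflowPerSecond(long totalSignalsPerMinute, long instances) {
        return totalSignalsPerMinute / 60.0 / instances;
    }

    /** Minimum instance count that keeps each workflow at or below the given rate. */
    static long minInstances(long totalSignalsPerMinute, double maxPerSecondPerWorkflow) {
        return (long) Math.ceil(totalSignalsPerMinute / 60.0 / maxPerSecondPerWorkflow);
    }

    public static void main(String[] args) {
        double fewPerSecond = 3.0; // ASSUMPTION: stand-in value for "a few signals per second"

        System.out.println("instances needed at peak: " + minInstances(33_000, fewPerSecond));
        System.out.println("instances needed at sustained rate: " + minInstances(500, fewPerSecond));
        System.out.println("per-workflow rate with 1000 instances: "
                + perWorkflowPerSecond(33_000, 1_000));
    }
}
```

If the 33k/min peak lands on a single workflow ID, it is far above this guideline; if it is spread across a couple of hundred instances or more, each one stays within it.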