Hi, Temporal community-
I have a “best practices” question about signals, WorkflowQueue, and the continue-as-new pattern that I see suggested often in threads here.
I have a long-running workflow that polls a DB for messages, parses them, and then routes them to external services in batches. I can have thousands of instances of this workflow running at any given time, each with a deterministic/stable workflow ID of tenant ID + downstream service name. My workflow uses the same continue-as-new pattern as Workflow.await(), does it release thread? - #2 by maxim: it limits how many messages are processed per run, then passes any remaining queue items to the next run as a workflow argument. It differs from that example in that it processes all messages currently in the WorkflowQueue as a single batch, rather than taking an action per message.
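For concreteness, here's the shape of that per-run loop in plain Java (no Temporal dependency; `MAX_MESSAGES_PER_RUN`, `runOnce`, and the string messages are illustrative stand-ins, and continue-as-new is simulated by returning the leftover messages that the real workflow would pass to `Workflow.continueAsNew`):

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

public class BatchLoopSketch {
    // Cap on messages handled per workflow run; with ~1.5KB payloads this
    // also bounds the size of the continue-as-new argument.
    static final int MAX_MESSAGES_PER_RUN = 3;

    // Simulates one workflow run: drain up to the cap as a single batch,
    // "route" it, and return the leftovers that the real workflow would
    // hand to the next run via continue-as-new.
    static List<String> runOnce(Deque<String> queue, List<String> routed) {
        List<String> batch = new ArrayList<>();
        while (batch.size() < MAX_MESSAGES_PER_RUN && !queue.isEmpty()) {
            batch.add(queue.poll());
        }
        routed.addAll(batch); // stand-in for the external-service calls
        return new ArrayList<>(queue);
    }

    public static void main(String[] args) {
        Deque<String> q = new ArrayDeque<>(List.of("m1", "m2", "m3", "m4", "m5"));
        List<String> routed = new ArrayList<>();
        List<String> carryOver = runOnce(q, routed);
        System.out.println(routed);    // [m1, m2, m3]
        System.out.println(carryOver); // [m4, m5]
    }
}
```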
I’m investigating the feasibility of eliminating the DB and having this workflow receive all of its messages via signals. I’m concerned about the signal threading behavior described in @SignalMethod threading configuration · Issue #214 · temporalio/sdk-java · GitHub, and I’m not sure I can throttle the growth of the WorkflowQueue. My understanding is that the Java SDK delivers signals on separate threads, and therefore concurrently, unlike the Go SDK with its channels. Right now I can limit how many messages are in the WorkflowQueue by controlling when I fetch from the DB and how much I fetch.
The messages that I want to signal to the workflow usually come in bursts, with an estimated peak of ~33k/min, and each message payload is ~1.5KB. With payloads of that size, I’d want to keep the iteration count fairly low before calling continue-as-new.
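To put that burst rate in perspective (plain arithmetic from my own numbers, nothing Temporal-specific), the peak works out to roughly 48MB of signal payloads per minute, all of which would land in workflow history as signal events:

```java
public class BurstMath {
    public static void main(String[] args) {
        int msgsPerMin = 33_000;      // estimated peak signal rate
        double kbPerMsg = 1.5;        // approximate payload size
        double mbPerMin = msgsPerMin * kbPerMsg / 1024.0;
        // ~48 MB/min of payload hitting a single workflow's history
        System.out.printf("~%.1f MB/min of signal payloads%n", mbPerMin);
    }
}
```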
In order to keep the WorkflowQueue size low, is there a way I can control when and how signals get received? Is there a way to “pause” receiving signals if the queue is too large, until after the workflow is continued-as-new? Are signals the right pattern here, or should I stick with my DB?
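To make the “pause” behavior I’m asking about concrete, here’s a plain-Java sketch of the gate I wish I could put in front of a signal handler (no Temporal types; `MAX_QUEUED` and the method names are made up, and as far as I know a real signal can’t be rejected once sent, which is exactly my problem):

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class PausableBufferSketch {
    static final int MAX_QUEUED = 4; // threshold where I'd want signals "paused"

    private final Deque<String> queue = new ArrayDeque<>();
    private boolean draining = false; // true while waiting for continue-as-new

    // What I'd like a signal handler to do: stop accepting once the buffer
    // is full, and only resume after the workflow has continued-as-new.
    boolean offer(String msg) {
        if (draining || queue.size() >= MAX_QUEUED) {
            draining = true;
            return false; // hypothetical rejection; Temporal has no such hook
        }
        queue.add(msg);
        return true;
    }

    // Simulates the batch drain plus the fresh run after continue-as-new.
    int drainAndReset() {
        int n = queue.size();
        queue.clear();
        draining = false;
        return n;
    }
}
```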
Thank you so much!