I’m new to Temporal but find it very interesting. We are looking at using it for the next rollout of our platform, scheduled for the end of the year, so I’d like to get an idea of the best way to architect it.
We have ‘spikes’ in demand sometimes - for example an influx of leads for which we need to perform a set of workflow steps. Would it be recommended to put Kafka as an incoming queue so we can handle spikes?
Per tenant (a tenant is a customer - an outside organization or agency), we sometimes need to limit the number of leads flowing through the workflow at a given time. Does this mean I need to create a long-running workflow per tenant and send it leads through signals? Or should we use per-lead workflows, and if so, is there some way to manage them?
For rate-limited tasks like sending emails we are thinking of making an async activity and pushing them onto an outbound Kafka queue. Is there a way to do this directly in Temporal instead?
The architecture we are thinking of looks like this:
(External Workflow Triggers) -> Inbound Kafka Queue
                                        |
                                        V
                                    Temporal  (listens on Kafka and starts a workflow for each trigger)
                                        |
                                        V
                                Outbound Kafka Queue
Any inputs/suggestions would be appreciated. It would help us decide if Temporal is a good fit and how best to use it going forward.
It depends on the maximum rate you want to handle. Given the number of boxes you would need for a Kafka cluster, I wouldn’t be surprised if a Temporal cluster that can sustain the peak workflow start load were simpler and cheaper. What is the maximum workflow start rate you are targeting?
Yes, the simplest implementation would be to have a workflow per tenant.
No need for Kafka here. Use a separate activity task queue; task queues already support a throttling limit.
It is about 200 requests a second. That is a pretty reasonable rate for Temporal to handle, so there is no need for Kafka here.
Temporal guarantees the uniqueness of workflows by their ID. So just use tenantId as workflowId and only one instance per tenant will be created.
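For illustration, a minimal sketch of that idea (TenantWorkflow, TASK_QUEUE, and tenantId are assumed names, not from the samples):

TenantWorkflow workflow = workflowClient.newWorkflowStub(
    TenantWorkflow.class,
    WorkflowOptions.newBuilder()
        .setTaskQueue(TASK_QUEUE)
        // Using the tenant ID as the workflow ID enforces a single running instance per tenant
        .setWorkflowId(tenantId)
        .build());
try {
    WorkflowClient.start(workflow::run, tenantId);
} catch (WorkflowExecutionAlreadyStarted e) {
    // A workflow for this tenant is already running; just signal the existing one
}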
Basically, when you create a worker you pass the task queue name as a parameter to the factory method. Then, when you schedule an activity, you pass that same task queue name in its options.
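For example, a sketch along these lines (“email-task-queue”, the 10-per-second limit, and the EmailActivities interface are assumptions for illustration):

// Worker side: poll a dedicated task queue with a rate limit
WorkerFactory factory = WorkerFactory.newInstance(workflowClient);
Worker emailWorker = factory.newWorker(
    "email-task-queue",
    WorkerOptions.newBuilder()
        // The limit is enforced by the service across all workers polling this task queue
        .setMaxTaskQueueActivitiesPerSecond(10.0)
        .build());
emailWorker.registerActivitiesImplementations(new EmailActivitiesImpl());
factory.start();

And inside the workflow, schedule the email activity on that queue:

EmailActivities email = Workflow.newActivityStub(
    EmailActivities.class,
    ActivityOptions.newBuilder()
        .setTaskQueue("email-task-queue")
        .setStartToCloseTimeout(Duration.ofSeconds(30))
        .build());
email.send(message);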
Consider using SignalWithStart to start the per-tenant workflows. It sends a signal to a workflow, starting it first if it is not already running. This eliminates the need to decide which event has to start the workflow.
WorkflowOptions options = WorkflowOptions.newBuilder()
    .setTaskQueue(TASK_QUEUE)
    // The workflow ID makes the instance unique, e.g. the account (or tenant) ID
    .setWorkflowId(to)
    .build();
AccountTransferWorkflow transferWorkflow =
    workflowClient.newWorkflowStub(AccountTransferWorkflow.class, options);
// Signal with start sends a signal to a workflow, starting it if not yet running
BatchRequest request = workflowClient.newSignalWithStartRequest();
request.add(transferWorkflow::deposit, to, BATCH_SIZE);
request.add(transferWorkflow::withdraw, from, reference, amountCents);
workflowClient.signalWithStart(request);
where AccountTransferWorkflow is defined as:
@WorkflowInterface
public interface AccountTransferWorkflow {
  @WorkflowMethod
  void deposit(String toAccountId, int batchSize);

  @SignalMethod
  void withdraw(String fromAccountId, String referenceId, int amountCents);

  @QueryMethod
  int getBalance();

  @QueryMethod
  int getCount();
}
And the code that handles the event would start/signal this workflow using signalWithStart() as shown in the moneybatch sample. Is that right?
Then we add rate-limiting code to the main tenantWorkflow() method, keeping track of when each child workflow was launched and the number of pending events.
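Concretely, something like this sketch (TenantWorkflow, LeadWorkflow, Lead, and the limit of 10 are our own hypothetical names and values):

@WorkflowInterface
public interface TenantWorkflow {
  @WorkflowMethod
  void run(String tenantId);

  @SignalMethod
  void newLead(Lead lead);
}

public class TenantWorkflowImpl implements TenantWorkflow {
  private static final int MAX_PARALLEL = 10; // assumed per-tenant limit
  private final Queue<Lead> pending = new ArrayDeque<>();
  private int running;

  @Override
  public void run(String tenantId) {
    // A real implementation would call Workflow.continueAsNew(...) periodically
    // to keep the event history from growing without bound.
    while (true) {
      Workflow.await(() -> !pending.isEmpty() && running < MAX_PARALLEL);
      Lead lead = pending.poll();
      running++;
      LeadWorkflow child = Workflow.newChildWorkflowStub(LeadWorkflow.class);
      // Start the child asynchronously; decrement the counter when it completes
      Async.procedure(child::process, lead).thenApply(ignored -> --running);
    }
  }

  @Override
  public void newLead(Lead lead) {
    pending.add(lead);
  }
}

The decrement-in-callback approach caps the number of leads in flight per tenant at MAX_PARALLEL; recording launch times the same way would let us enforce a rate limit rather than just a concurrency cap.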
@maxim In this scenario using SignalWithStart is a best practice, but calling SignalWithStart will restart the workflow (say workflowId equals tenantId 123) even though workflow 123 has already closed.
Let’s assume a specific workflow (workflowId 123) closed because it completed successfully. If the Kafka consumer keeps receiving messages associated with tenantId 123, which equals the workflowId, it will restart workflow 123 due to the SignalWithStart behavior.
What’s the best practice to prevent this? How do we avoid restarting a workflow that has already completed successfully if we want to use SignalWithStart in the Kafka consumer?