Automatically Re-Queue workflow with same id

I have a workflow that does some long processing for an individual tenant. External events can trigger this processing, but it should never run at the same time for the same tenant.

Previously we solved the requirement on the API layer. We create the workflow with the tenantId as workflowId, so an already running workflow throws an exception and we can pass that on to the API as a 409 CONFLICT.

However, we would now like to queue this execution instead. So if the workflow for this tenant is already running, and a new API call comes in, a follow-up workflow should be scheduled but MUST NOT be executed before the current one exits.

One thought was to let the workflow check at the end whether follow-up work is present and rerun itself (as a new workflow, of course). But I think this risks a race condition where the new API call happens just as the workflow is wrapping up, after it has already checked for follow-up work. The workflow creation would then fail as a duplicate (rightly so), but the workflow has just missed its chance to be notified - meaning no follow-up will actually run.

Is there some existing mechanism for such a case?

So if the workflow for this tenant is already running, and a new API call comes in, a follow-up workflow should be scheduled but MUST NOT be executed before the current one exits.

You could have a long-running workflow that receives these API calls via signals (signals are delivered in the order they are received). Your workflow can then store these signals in the order received and, for each one, execute a child workflow synchronously. The workflow ID reuse policy also applies to child workflows.
Not sure which SDK you are using, but here is a small sample for Java that has this signal processing logic.
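
To make the shape of that idea concrete, here is a minimal plain-Java model of it. It is not Temporal SDK code and not the linked sample: `TenantQueueModel`, `signal`, `drain`, and `runChild` are hypothetical names that only stand in for the signal handler, the main workflow loop, and the synchronous child workflow call. The point it illustrates is that a request arriving while another one is being processed is queued rather than rejected, and never overlaps with the current run.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.function.Consumer;

// Plain-Java model; all names here are hypothetical, not Temporal SDK APIs.
class TenantQueueModel {
    private final Queue<String> pending = new ArrayDeque<>(); // requests in arrival order
    private final List<String> log = new ArrayList<>();       // what ran, in order
    private final Consumer<String> work;                      // per-request processing
    private boolean busy = false;

    TenantQueueModel(Consumer<String> work) {
        this.work = work;
    }

    // Stands in for the signal handler: it only records the request.
    void signal(String request) {
        pending.add(request);
    }

    // Stands in for the main workflow loop: one request at a time, in order.
    void drain() {
        String request;
        while ((request = pending.poll()) != null) {
            runChild(request);
        }
    }

    // Stands in for a synchronous child workflow call.
    private void runChild(String request) {
        if (busy) {
            throw new IllegalStateException("overlapping run for the same tenant");
        }
        busy = true;
        log.add("start:" + request);
        work.accept(request); // may call signal(...) while this run is in progress
        log.add("end:" + request);
        busy = false;
    }

    List<String> log() {
        return log;
    }

    public static void main(String[] args) {
        TenantQueueModel[] model = new TenantQueueModel[1];
        // While "A" is being processed, a new request "C" arrives.
        model[0] = new TenantQueueModel(r -> {
            if (r.equals("A")) {
                model[0].signal("C");
            }
        });
        model[0].signal("A");
        model[0].signal("B");
        model[0].drain();
        System.out.println(model[0].log());
        // prints [start:A, end:A, start:B, end:B, start:C, end:C]
    }
}
```

In an actual Java SDK workflow, I would expect `signal` to become a `@SignalMethod`, the loop to block on `Workflow.await` until the queue is non-empty, and `runChild` to invoke a child workflow stub, but treat that mapping as a sketch rather than a tested implementation.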

Ah, great idea! I hadn’t considered having a master-workflow at all. I’ll play around with that a bit. (And I am aware that I need to continueAsNew from time to time.)

Thanks!

@aksdb Some of your use case overlaps with what I was discussing here:

In particular, see “signalWithStart” and the guarantees that Temporal provides to prevent the signal from being lost.
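
As a rough illustration of why an atomic start-or-signal operation closes the race described in the question, here is a small plain-Java model. It is only a toy: `StartOrSignalModel`, `poll`, and `tryComplete` are hypothetical names, the method named `signalWithStart` merely mimics the idea, and none of this claims to show how Temporal implements the guarantee. The assumption being modeled is that "deliver to the running instance, or start a new one" is a single step, and that an instance cannot finish while a delivered request is still unprocessed.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

// Toy model only: all names are hypothetical, not Temporal SDK APIs.
class StartOrSignalModel {
    // One inbox per tenant that currently has a running instance.
    private final Map<String, Queue<String>> running = new HashMap<>();

    // Atomically deliver to the running instance, or start one holding the request.
    // Returns true if a new instance was started.
    synchronized boolean signalWithStart(String tenantId, String request) {
        Queue<String> inbox = running.get(tenantId);
        boolean started = (inbox == null);
        if (started) {
            inbox = new ArrayDeque<>();
            running.put(tenantId, inbox);
        }
        inbox.add(request);
        return started;
    }

    // Next request for the running instance, or null if there is none.
    synchronized String poll(String tenantId) {
        Queue<String> inbox = running.get(tenantId);
        return inbox == null ? null : inbox.poll();
    }

    // The "anything left?" check and the exit happen as one step, so a request
    // cannot slip in between them and be lost.
    synchronized boolean tryComplete(String tenantId) {
        Queue<String> inbox = running.get(tenantId);
        if (inbox != null && !inbox.isEmpty()) {
            return false; // something arrived; keep processing
        }
        running.remove(tenantId);
        return true;
    }
}
```

With this shape, the API layer never needs to handle a duplicate-start failure: every call either lands in the inbox of the running instance or starts a fresh one that already holds the request.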