Handling large amounts of incoming signals in a workflow

Hi,

We have a use case where we would like to use a Temporal workflow to process completion signals from various other workflows throughout our app. We have some questions about whether this is reasonable and about some implementation details.

Each workflow will send at most one signal per run, so the History Limit should not be a problem on the sender side.

We are wondering how to handle this on the receiver side, as it might receive many signals. For each signal, we kick off an activity. The workflow body itself is essentially empty, as it only waits for signals in a loop.

The open questions are:

  1. How should we handle the history limit and replays in case of crashes? We do not want the workflow to reprocess signals (execute activities) it has already seen.
  2. If a limit is reached, what’s the best way to handle it? Or should we let the workflow terminate and have a new instance handle the new signals?

For 2 at least, I guess we can keep a counter of signals received, update it, and once it exceeds X, start as new. Is this reasonable?

Any guidance would really be appreciated, thanks!

How much is the “large amount”, and what is the maximum rate (signals/second) of signals sent to a single workflow instance?

I think it’s not actually that large. We expect a few hundred per hour at most.

Then this should work fine. Answers to your questions:

  1. I don’t understand how history limits are related to crashes. Workflows don’t reprocess activities in case of worker crashes. That is the purpose of recording activity results in the history.

  2. Don’t wait for the limit. Call continue-as-new after a few hundred signals.
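
To make the counting idea concrete, here is a rough plain-Python model of it. It deliberately does not use the Temporal SDK: `run_receiver`, `MAX_SIGNALS_PER_RUN`, the `inbox` queue, and the `processed` list are all hypothetical stand-ins, and the threshold is set to 3 only to keep the example short. Each call to `run_receiver` plays the role of one workflow run, and returning `True` plays the role of continue-as-new.

```python
from collections import deque

# Hypothetical threshold; the advice above is a few hundred signals per run.
MAX_SIGNALS_PER_RUN = 3


def run_receiver(inbox, processed, max_signals=MAX_SIGNALS_PER_RUN):
    """Model one workflow run: handle signals until the threshold is hit.

    Returns True when the run wants to continue as new, and False when the
    inbox is empty (a real workflow would block and wait at that point).
    """
    handled = 0
    while inbox:
        signal = inbox.popleft()
        processed.append(signal)  # stands in for executing the activity
        handled += 1
        if handled >= max_signals:
            # Start a fresh run with a fresh history; signals not yet
            # handled stay in the inbox for the next run.
            return True
    return False


inbox = deque(f"workflow-{i}" for i in range(7))
processed = []
runs = 1
while run_receiver(inbox, processed):
    runs += 1

print(runs, processed)
```

In a real workflow, signals that were already delivered to the current run but not yet handled would need to be drained or passed as input to the next run before calling continue-as-new, which this model sidesteps by keeping the inbox outside the run.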

Thanks for the clarification. My concerns were:

  1. The signals will not be reprocessed in this case, correct? The concern here was what happens if the workflow reaches the limit.
  2. Perfect, sounds good.

  1. Signals are not reprocessed. You have to call continue-as-new after some number of signals to ensure that the workflow doesn’t reach the limit. But this is not linked to failure recovery in any way.
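
As an illustration of why a crash does not cause reprocessing, here is a small plain-Python model of replay. It is only a sketch of the idea that recorded activity results are reused, not Temporal's actual mechanism or API: `send_notification`, `run_workflow`, and the `history` list are hypothetical names invented for this example.

```python
def send_notification(signal):
    """Hypothetical activity; the counter shows how often it really runs."""
    send_notification.calls += 1
    return f"notified:{signal}"


send_notification.calls = 0


def run_workflow(signals, history):
    """Handle each signal, reusing a recorded result when one exists."""
    results = []
    for index, signal in enumerate(signals):
        if index < len(history):
            # Replay: the result is already recorded, so the activity
            # is not executed again.
            results.append(history[index])
        else:
            result = send_notification(signal)
            history.append(result)  # record the result for future replays
            results.append(result)
    return results


history = []
# The first worker handles two signals and then crashes.
run_workflow(["a", "b"], history)
# A new worker replays the same workflow and then handles a third signal.
results = run_workflow(["a", "b", "c"], history)

print(send_notification.calls, results)
```

The activity runs once per signal even though the workflow code ran twice for the first two signals. Continue-as-new only keeps the recorded history from growing too large; it plays no part in this recovery path.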