This ticket mentions a similar problem, but it was fixed. (Edit: it’s ticket 1289; I’m not allowed to have >2 links.)
The doc as it stands today suggests that we do have to drain signals, and that “there must be a period of quiet time to allow the Continue-as-New to happen without Signal loss.” We’re using the Java SDK, and cannot guarantee any quiet time. It seems like the only alternative is to pass the contents of the signal queue to the next execution, but that seems a little crazy.
Have I misunderstood how this works? Is there a better solution?
You don’t have to pass signals to the next execution, as long as you drain the queue properly (assuming you decide to use a queue in your Java workflow at all). The main requirement is that the workflow must not block between checking that the queue is empty and calling continue-as-new.
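To make the "don't block between the empty check and continue-as-new" rule concrete, here is a rough plain-Java model of it. It deliberately uses no Temporal SDK calls: the class name, `signal`, `drainThenContinueAsNew`, and `pendingCount` are all made up for illustration. In a real workflow the buffering method would be a `@SignalMethod` and the final return would be a call to `Workflow.continueAsNew(...)`.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.function.Consumer;

// Plain-Java model of one workflow run. Hypothetical names; not Temporal SDK code.
class DrainBeforeContinueAsNew {
    private final Deque<String> pending = new ArrayDeque<>();
    private final List<String> processed = new ArrayList<>();

    // Stands in for a signal handler: it only buffers the payload.
    public void signal(String payload) {
        pending.addLast(payload);
    }

    // Stands in for the end of a run. `duringProcessing` models work that blocks
    // (for example an activity call), during which more signals may arrive.
    // Returns the state that would be handed to the next run.
    public List<String> drainThenContinueAsNew(Consumer<String> duringProcessing) {
        // Re-check the queue after every item, because processing may have
        // allowed new signals to be buffered.
        while (!pending.isEmpty()) {
            String next = pending.pollFirst();
            duringProcessing.accept(next); // may call signal(...) again
            processed.add(next);
        }
        // The queue was just observed empty and nothing blocks between that
        // check and this return, so no buffered signal is left behind.
        return new ArrayList<>(processed);
    }

    public int pendingCount() {
        return pending.size();
    }
}
```

If a signal arrives while an earlier one is being processed, the loop picks it up before the handoff, so only the workflow's own state has to be passed on, not the signal queue.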
The way it is written is confusing. Quiet time is needed to ensure that the workflow doesn’t hit the limit on allowed signals. After the limit is reached, new signals are rejected until the workflow calls continue-as-new. This is not related to the SDK used.
The basic idea is that a single workflow is not designed to process a high constant rate of signals.
Thank you both. I’ve gone back to the original post as well and improved the wording to be clearer.
For reference - here’s the new wording:
It’s possible for Signals to come in faster than your Signal draining happens because a workflow is not designed to process a high constant rate of signals. In this case if a workflow is not able to call continueAsNew before hitting the max signal limit, new signals will be rejected until continueAsNew is executed.
Thanks! I just want to confirm one thing: the problem you describe can happen even if signals come in slower than your signal draining happens, right? E.g., if you’re draining at 10/sec but new signals are coming in at 5/sec, you can still hit the max signal limit, because it is a cumulative limit.
It doesn’t depend on the rate directly. This can happen when the workflow cannot execute continue-as-new due to new signals constantly arriving while the workflow task that called continue-as-new is still running.