Use case
Temporal currently invokes SignalWithStartWorkflow to signal the system archival workflows, as shown below, with signalTimeout set to 300ms.
signalCtx, cancel := context.WithTimeout(context.Background(), signalTimeout)
defer cancel()
_, err := c.temporalClient.SignalWithStartWorkflow(signalCtx, workflowID, signalName, *request, workflowOptions, archivalWorkflowFnName, nil)
Observed behaviour
Signaling fails with the below error.
"msg":"failed to send signal to archival system workflow" "error":"context deadline exceeded"
However, I can see the signal in the Temporal UI on the system archival workflows.
Question
Can this happen? The workflow received the signal, but the caller did not get confirmation within 300ms that it was sent.
The signal timeout for this call is indeed hard-coded to 300ms; see here and here.
"context deadline exceeded" errors typically mean that a timeout expired, so my best guess is that yes, this timeout is causing the SignalWithStartWorkflow call to fail.
I would open an issue asking to make it configurable via dynamicconfig, or at least to raise it to 400ms so that it matches, for example, the timeout set for parentClosePolicy.
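To illustrate where that message comes from, here is a minimal sketch using only the Go standard library. `slowCall` and `timeoutMessage` are hypothetical helpers, not Temporal code, and the durations are shortened so the example runs quickly. A real client call may wrap or translate the error, so treat this only as a demonstration of the underlying context behaviour.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// slowCall is a hypothetical stand-in for a remote call that needs
// `work` to complete. It returns early if the context expires first.
func slowCall(ctx context.Context, work time.Duration) error {
	select {
	case <-time.After(work):
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

// timeoutMessage runs slowCall with a deadline shorter than the work
// and returns the resulting error text (empty if there was no error).
func timeoutMessage() string {
	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Millisecond)
	defer cancel()
	if err := slowCall(ctx, 200*time.Millisecond); err != nil {
		return err.Error()
	}
	return ""
}

func main() {
	fmt.Println(timeoutMessage())
}
```

The error returned here is `context.DeadlineExceeded`, whose text is the same string that appears in the log line above.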
Can it happen that the recipient workflow got the signal, but the sender did not see it as sent within 300ms?
I'm asking because I'm observing this behaviour. See the bug I opened: https://github.com/temporalio/temporal/issues/2464
Yes, I got confirmation that the call can currently time out while the system workflow still gets the signal. Thanks for raising that issue.
Watch for updates on your issue from the Server team, as they will look into fixing this when possible.
maxim
February 4, 2022, 9:42pm
AFAIK this is a property of any distributed system: an update can go through while the caller still gets a timeout.
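A rough sketch of that situation, assuming a simulated in-process server rather than a real Temporal cluster: `fakeServer`, `send`, and `demo` are all hypothetical names, and the 10ms and 50ms durations are arbitrary. The server keeps processing the request after the caller's deadline has passed, so the caller sees a timeout even though the signal is applied.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// fakeServer is a hypothetical stand-in for a remote service.
type fakeServer struct {
	signals int // number of signals the server has applied
}

// send hands the signal to the server and waits for an acknowledgement
// until ctx expires. The returned channel is closed once the server has
// applied the signal, whether or not the caller was still waiting.
func (s *fakeServer) send(ctx context.Context, delay time.Duration) (<-chan struct{}, error) {
	done := make(chan struct{})
	go func() {
		time.Sleep(delay) // server-side processing continues past the caller's deadline
		s.signals++
		close(done) // closing the channel publishes the write to signals
	}()
	select {
	case <-done:
		return done, nil
	case <-ctx.Done():
		return done, ctx.Err()
	}
}

// demo reports whether the caller timed out and whether the server
// applied the signal anyway.
func demo() (callerTimedOut, signalApplied bool) {
	srv := &fakeServer{}
	ctx, cancel := context.WithTimeout(context.Background(), 10*time.Millisecond)
	defer cancel()

	applied, err := srv.send(ctx, 50*time.Millisecond)
	callerTimedOut = errors.Is(err, context.DeadlineExceeded)

	<-applied // wait until the server has finished its work
	signalApplied = srv.signals == 1
	return callerTimedOut, signalApplied
}

func main() {
	timedOut, applied := demo()
	fmt.Println("caller timed out:", timedOut)
	fmt.Println("signal applied:", applied)
}
```

From the caller's side a timeout therefore means "outcome unknown", not "failed", which is why retries of such calls are usually only safe when the operation is idempotent.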