I’m curious whether there has ever been any discussion about supporting a fallback mode for Temporal clients when the Temporal backend fails or is unavailable.
Say we have a scenario where a new workflow request is initiated in the client, but for whatever reason the gRPC connection is broken, or the backend is unresponsive.
Temporal's main recovery benefits only apply if the workflow was already registered with the backend before the outage, so new workflow requests are effectively hard down during this time.
This leads me to wonder whether the client could be allowed to enter a failure mode in which workflows and activities are still carried out, but entirely on the host receiving the request. In this mode the usual orchestration and communication with the backend service would be bypassed.
This obviously comes with the tradeoff that the single host is now at risk of its own failure with no recovery options, but it would let workflows still be processed and turn the outage from hard down into a functioning but degraded state.
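To make the idea a bit more concrete, here is a rough sketch of the behavior I have in mind. It deliberately does not use the Temporal SDK: `BackendUnavailable`, `run_with_fallback`, `start_remote_down`, and `process_order_locally` are all hypothetical names invented for illustration, and the exception just stands in for whatever error a broken gRPC connection or unresponsive backend would surface.

```python
from typing import Any, Callable, Tuple


class BackendUnavailable(Exception):
    """Hypothetical stand-in for a broken gRPC connection or unresponsive backend."""


def run_with_fallback(
    start_remote: Callable[..., Any],
    run_local: Callable[..., Any],
    *args: Any,
) -> Tuple[str, Any]:
    """Try the normal orchestrated path first; fall back to in-process execution.

    Returns a (mode, result) pair so callers can tell which path was taken.
    """
    try:
        # Normal path: hand the request to the backend for durable orchestration.
        return "orchestrated", start_remote(*args)
    except BackendUnavailable:
        # Degraded path: run the same business logic on this host only.
        # No durability, no retries across hosts, no history is recorded.
        return "local", run_local(*args)


# Hypothetical example: the backend is down, so the remote start always fails.
def start_remote_down(order_id: str) -> str:
    raise BackendUnavailable("backend unreachable")


# Hypothetical local version of the workflow logic.
def process_order_locally(order_id: str) -> str:
    return f"processed {order_id}"


mode, result = run_with_fallback(start_remote_down, process_order_locally, "order-1")
```

Returning the mode alongside the result is one way to keep the degraded path visible, so the caller can log or flag work that ran without the backend's recovery guarantees.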