Hello, everyone. I am looking for advice on how best to structure an application that has a long-running workflow and a gateway that takes requests from the frontend and interacts with that workflow.
In our current implementation, we use a short-lived workflow for each action. For example, the order creation handler sends a request to start a workflow that creates the order and then ends, and the result is delivered to the frontend. The scheme is the same for adding products and other minor changes. However, some actions take longer (around a minute), for instance order processing, because we wait for external systems to complete.
In such cases, we call the reservation handler, which starts the reservation workflow. It doesn’t wait for the result; instead, it immediately returns the Workflow ID, so the frontend is not blocked and shows a loader in the meantime. The frontend then polls: after a certain period, it calls a separate handler that retrieves the state of the workflow by its ID, along with its result. Once that handler reports an error or success, the result goes back to the frontend and we hide the loader.
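To make the current start-then-poll flow concrete, here is a minimal sketch in plain Go. It is not Temporal SDK code: `Gateway`, `Start`, `Poll`, and `Wait` are hypothetical names, and an in-memory map stands in for the Temporal service. In our real gateway, `Start` would roughly correspond to starting the workflow and returning its Workflow ID, and `Poll` to the handler that looks up the workflow state by ID.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// Status is what the frontend needs to decide whether to keep the loader.
type Status string

const (
	Running   Status = "running"
	Completed Status = "completed"
	Failed    Status = "failed"
)

// job holds the state of one started operation.
type job struct {
	status Status
	result string
	err    error
	done   chan struct{}
}

// Gateway is a hypothetical in-memory stand-in for the service that
// starts workflows and reports their state by ID.
type Gateway struct {
	mu   sync.Mutex
	next int
	jobs map[string]*job
}

func NewGateway() *Gateway { return &Gateway{jobs: make(map[string]*job)} }

// Start launches the work in the background and returns an ID immediately.
func (g *Gateway) Start(work func() (string, error)) string {
	g.mu.Lock()
	g.next++
	id := fmt.Sprintf("wf-%d", g.next)
	j := &job{status: Running, done: make(chan struct{})}
	g.jobs[id] = j
	g.mu.Unlock()

	go func() {
		res, err := work()
		g.mu.Lock()
		j.result, j.err = res, err
		if err != nil {
			j.status = Failed
		} else {
			j.status = Completed
		}
		g.mu.Unlock()
		close(j.done)
	}()
	return id
}

// Poll is what the status handler would call; it never blocks on the work.
func (g *Gateway) Poll(id string) (Status, string, error) {
	g.mu.Lock()
	defer g.mu.Unlock()
	j, ok := g.jobs[id]
	if !ok {
		return "", "", errors.New("unknown workflow ID")
	}
	return j.status, j.result, j.err
}

// Wait blocks until the job finishes; used here only to keep the demo deterministic.
func (g *Gateway) Wait(id string) {
	g.mu.Lock()
	j := g.jobs[id]
	g.mu.Unlock()
	if j != nil {
		<-j.done
	}
}

func main() {
	g := NewGateway()
	release := make(chan struct{})
	id := g.Start(func() (string, error) {
		<-release // simulates waiting on an external system
		return "reserved", nil
	})
	st, _, _ := g.Poll(id)
	fmt.Println(id, st) // prints "wf-1 running": the frontend keeps the loader
	close(release)
	g.Wait(id)
	st, res, _ := g.Poll(id)
	fmt.Println(id, st, res) // prints "wf-1 completed reserved": the loader is hidden
}
```

The frontend never blocks on `Start`. It only repeats the cheap `Poll` call until the status is no longer `running`.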
However, I’m unsure how to implement this with a long-running workflow. Say a handler starts the long-running workflow, and the first thing we need is an orderID. We can’t afford to block, but we still need to get the order ID somehow. So right after the activity that creates the order in the database, we set up a Query handler that returns the orderID. This part is fine, but what about later steps, like reservation? We call the handler, it sends a signal to the long-running workflow, and all we get back is confirmation that the signal was sent, so the frontend shows a loader. But what next? Do we need a Query handler that returns the state after this signal, such as isRunning: true and other data? I am worried that the query might not be fast enough, especially since our history can grow, and the frontend would have to keep polling several queries to know whether it should show a loader.
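One shape the state question above could take, sketched in plain Go: a single state value that tracks every signal-triggered step, so that one query could return it instead of several. The names `OrderState`, `OpStatus`, `Begin`, `Finish`, `AnyRunning`, and `ErrExternal` are assumptions for illustration. In a real workflow this value would presumably live in workflow-local variables and be returned from a handler registered with `workflow.SetQueryHandler`.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrExternal is a hypothetical failure reported by an external system.
var ErrExternal = errors.New("external system unavailable")

// OpStatus describes one signal-triggered step, e.g. "reservation".
type OpStatus struct {
	IsRunning bool   `json:"isRunning"`
	Error     string `json:"error,omitempty"`
}

// OrderState is the hypothetical value a single query handler would return.
type OrderState struct {
	OrderID    string              `json:"orderId"`
	Operations map[string]OpStatus `json:"operations"`
}

func NewOrderState() *OrderState {
	return &OrderState{Operations: map[string]OpStatus{}}
}

// Begin marks an operation as in progress (call when the signal arrives).
func (s *OrderState) Begin(op string) {
	s.Operations[op] = OpStatus{IsRunning: true}
}

// Finish records the outcome (call when the step's activity returns).
func (s *OrderState) Finish(op string, err error) {
	st := OpStatus{}
	if err != nil {
		st.Error = err.Error()
	}
	s.Operations[op] = st
}

// Snapshot returns a copy so the caller cannot mutate the tracked state.
func (s *OrderState) Snapshot() OrderState {
	cp := OrderState{
		OrderID:    s.OrderID,
		Operations: make(map[string]OpStatus, len(s.Operations)),
	}
	for k, v := range s.Operations {
		cp.Operations[k] = v
	}
	return cp
}

// AnyRunning tells the frontend whether it should keep showing a loader.
func (s OrderState) AnyRunning() bool {
	for _, op := range s.Operations {
		if op.IsRunning {
			return true
		}
	}
	return false
}

func main() {
	s := NewOrderState()
	s.OrderID = "order-42"

	s.Begin("reservation")
	fmt.Println(s.Snapshot().AnyRunning()) // prints "true"

	s.Finish("reservation", ErrExternal)
	snap := s.Snapshot()
	fmt.Println(snap.OrderID, snap.AnyRunning(), snap.Operations["reservation"].Error)
	// prints "order-42 false external system unavailable"
}
```

With this shape the frontend would poll one query and read both the orderID and the per-step flags from the same response. Whether that query is fast enough on a large history is still the open question.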
How much longer does a Query handler take to process a request than a simple check of whether the workflow has finished, made over gRPC through the Go SDK? Is there a different approach to such tasks? We might have many signals, some of which may take a minute or two to process, and we don’t want to block, but we don’t want to hang on Query handlers either.
And what about an Update handler for changing state? Unlike a signal, it doesn’t return immediately with confirmation that the request was sent; it waits for a response. Can we use it in our case as well, or should we stick to signals only?
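To illustrate the difference I mean, here is a plain-Go sketch with channels standing in for the workflow's inbox. It is not Temporal code, and `Workflow`, `Signal`, `Update`, `Stop`, `defaultTimeout`, and `ErrTimeout` are hypothetical names. `Signal` returns as soon as the message is accepted, while `Update` waits for the outcome but gives up after a timeout, which is one way a gateway could bound how long it blocks.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// defaultTimeout bounds how long the gateway waits for an update result.
const defaultTimeout = time.Second

// ErrTimeout is returned when no result arrives in time.
var ErrTimeout = errors.New("update timed out")

// updateReq carries a request plus a channel for the reply.
type updateReq struct {
	item  string
	reply chan error
}

// Workflow is a hypothetical stand-in for a long-running workflow's event loop.
type Workflow struct {
	signals chan string
	updates chan updateReq
	done    chan struct{}
	Items   []string // read only after Stop returns
}

func NewWorkflow() *Workflow {
	w := &Workflow{
		signals: make(chan string, 16),
		updates: make(chan updateReq),
		done:    make(chan struct{}),
	}
	go w.run()
	return w
}

// run processes messages one at a time until the signal channel is closed.
func (w *Workflow) run() {
	defer close(w.done)
	for {
		select {
		case s, ok := <-w.signals:
			if !ok {
				return
			}
			w.Items = append(w.Items, s)
		case u := <-w.updates:
			if u.item == "" {
				u.reply <- errors.New("empty item rejected")
			} else {
				w.Items = append(w.Items, u.item)
				u.reply <- nil
			}
		}
	}
}

// Signal is fire-and-forget: the caller learns only that the message was queued.
func (w *Workflow) Signal(item string) { w.signals <- item }

// Update waits for the handler's result, or returns ErrTimeout.
func (w *Workflow) Update(item string, timeout time.Duration) error {
	req := updateReq{item: item, reply: make(chan error, 1)}
	deadline := time.After(timeout)
	select {
	case w.updates <- req:
	case <-deadline:
		return ErrTimeout
	}
	select {
	case err := <-req.reply:
		return err
	case <-deadline:
		return ErrTimeout
	}
}

// Stop closes the inbox and waits for the loop to drain and exit.
func (w *Workflow) Stop() {
	close(w.signals)
	<-w.done
}

func main() {
	w := NewWorkflow()
	w.Signal("product-1")                                // returns at once; outcome unknown
	fmt.Println(w.Update("product-2", defaultTimeout))   // prints "<nil>"
	fmt.Println(w.Update("", defaultTimeout))            // prints "empty item rejected"
	w.Stop()
	fmt.Println(len(w.Items)) // prints "2"
}
```

In this sketch the update caller gets a success or an error directly, with no follow-up polling. The cost is that the request stays open until the step finishes or the timeout fires, which is exactly the blocking we want to avoid for the steps that take a minute or two.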