Long-Living Workflow Architecture in Handling Frontend Requests

Hello, everyone. I am seeking advice on how to best build an application architecture when dealing with a long-living workflow and a gateway that takes requests from the frontend and interacts with the workflow.

In our current implementation, we operate as follows: we have short-living workflows for each action. For example, when we call the order creation handler, it sends a request to start the workflow that creates the order and ends, delivering the result to the frontend. This scheme is similar for adding products and other minor changes. However, we have some actions that can take a bit longer (around a minute), for instance, the order processing because we are waiting for completion from external systems.

In such cases, we call the reservation handler, which starts the reservation workflow. It doesn’t wait for the result; instead, it immediately returns the Workflow ID, so the frontend isn’t blocked and just shows a loader. We have specific logic on the frontend so that after a certain period, we call a particular handler that retrieves the state and result of this workflow by its ID. Once we get an error or success, the result goes back to the frontend and we remove the loader.

However, I’m uncertain how to implement this with a long-living workflow. For instance, we have a handler that starts a long-living workflow, and first we need to get an orderID. We can’t afford to block, but we still need the order ID somehow, so right after the order-creation activity writes to the database, we expose a Query handler to get the orderID. This part is fine, but what about later steps, like reservation? We call the handler, it sends a signal to our long-living workflow, and we merely get confirmation that the signal was sent, while the frontend shows a loader. But what next? Do we need a Query handler that returns state such as isRunning: true and other data after this signal? I am worried that the query might not be fast enough, especially since our history can grow, and we would always need to poll several queries so that the frontend knows whether it needs to keep showing the loader.

How much longer would a Query handler take to execute and process a request than simply checking whether the workflow has finished via gRPC in Go through the SDK? Is there a different approach to such tasks? We might have many signals, some taking up to a minute or two; we don’t want to block, but we don’t want to hang on Query handlers either.

And what about an Update handler for changing state? Unlike a signal, it doesn’t let us instantly confirm that the request was delivered; it waits for a response. Can we use it in our case too, or should we stick to signals only?

For instance, now we have a handler, and we started a long-living workflow, and first, we need to get an orderID. We can’t afford to block, but we need to somehow get the order ID. So right after the order creation activity in the database, we make a Query handler to get the orderID.

You assign workflow id on creation. So you can construct whatever ID that doesn’t require lookup.
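The idea of an ID that requires no lookup can be shown in a few lines. This is a sketch; the `order-` prefix is just an illustrative convention, not anything Temporal mandates.

```go
package main

import "fmt"

// workflowIDForOrder derives a deterministic Workflow ID from the
// order ID, so any handler can reconstruct it without a lookup or
// a Query to the workflow.
func workflowIDForOrder(orderID string) string {
	return fmt.Sprintf("order-%s", orderID)
}

func main() {
	fmt.Println(workflowIDForOrder("12345")) // prints "order-12345"
}
```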

what about later steps, like reservation? We refer to the handler, it sends a signal to our long-living workflow, and we merely get confirmation that the signal was sent, and the frontend makes a loader. But what next?

I would use update handlers for this.

Thank you for your reply

What I mean is, I need to get the OrderID, not the workflow ID, and I can only get the OrderID after a record has been made in another service’s database.

But my question was precisely about this: how do we avoid getting blocked? The update handler blocks and waits for a result, doesn’t it? I can’t afford to block the response to the frontend.

Perhaps I didn’t write it in enough detail. Would you like me to provide more information?

What I mean is, I need to get the OrderID, not the workflow ID, and I can only get the OrderID after a record has been made in another service’s database.

I see. There are a few options in this case.

  1. The ideal solution is to generate the OrderID before starting the workflow and use it as the WorkflowID. It assumes that “another service’s database” can accept it instead of generating its own.
  2. Make a record in the DB and then start the workflow.
  3. Start a child workflow with WorkflowID=OrderID.
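Options 1 and 2 can be combined into one gateway flow: generate the OrderID up front, write the record, then start the workflow with WorkflowID equal to OrderID. A hedged stdlib-only sketch, where `saveOrder` and `startWorkflow` are hypothetical stand-ins for the real DB call and Temporal’s `client.ExecuteWorkflow`:

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
)

// newOrderID generates the OrderID up front (option 1), so it can be
// used both as the database key and as the WorkflowID. A random hex
// string is illustrative; any unique ID scheme works.
func newOrderID() string {
	b := make([]byte, 8)
	rand.Read(b)
	return "ORD-" + hex.EncodeToString(b)
}

// createOrderFlow sketches options 1-2: create the record first, then
// start the workflow with WorkflowID == OrderID. The two callbacks are
// placeholders for the other service's DB write and the Temporal call.
func createOrderFlow(saveOrder, startWorkflow func(id string) error) (string, error) {
	orderID := newOrderID()
	if err := saveOrder(orderID); err != nil { // record in the other service's DB
		return "", err
	}
	if err := startWorkflow(orderID); err != nil { // WorkflowID == OrderID
		return "", err
	}
	return orderID, nil // returned to the frontend immediately, no lookup needed
}

func main() {
	id, err := createOrderFlow(
		func(string) error { return nil }, // pretend DB write succeeds
		func(string) error { return nil }, // pretend workflow start succeeds
	)
	fmt.Println(err == nil, id[:4]) // prints "true ORD-"
}
```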

Thank you, Maxim. Do you mean a flow like this?

  1. I go to the database, create an order there, and immediately start the main long-lived Workflow with ID equal to OrderID.
  2. Then the long-lived workflow waits for a signal to process the order. As soon as it is received from the gateway, the long-lived workflow itself starts a child Workflow with a constant ID for this action, for example orderID_processing, so the gateway does not have to wait for anything from the long-lived workflow and can pass control back to the frontend.
  3. The frontend can then call the gateway handler, passing the orderID, and the gateway will simply get the state not of the long-lived workflow but of the child one, because its ID is known and easily constructed as orderID_processing.

Did I understand you correctly? But I still don’t understand why we need an update here and why you suggested it. Isn’t it an experimental feature?

But if we use it, what would that look like?

  1. Instead of a signal, I send an Update to the update handler, which starts a child workflow (without waiting for completion) named orderID_processing and returns something like true in the update response if everything started successfully? And then somewhere later in the long-lived workflow, I wait for the completion of the child workflow.
  2. The frontend can then call the gateway handler, passing the orderID, and the gateway will simply get the state not of the long-lived workflow but of the child one, because its ID is known and easily constructed as orderID_processing.

Additional questions:

  1. But I still don’t understand why we need an update here and why you suggested it. Isn’t it an experimental feature?
  2. If the flows I described are correct, how best to ensure that orderID_processing is generated in only one place? Otherwise both the long-lived workflow and the gateway must know the generation rule, and changing it means changing it in two places. So far, I only see the idea of using the signal or update handler to pass the ID itself from the gateway, so that the long-lived workflow simply takes it and starts the child workflow with it.
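One common way to keep the rule in a single place is a tiny shared helper that both the gateway and the workflow code import. A sketch (in a real project these functions would live in a small shared package; the names are illustrative):

```go
package main

import (
	"fmt"
	"strings"
)

// ProcessingWorkflowID builds the child workflow ID for the
// order-processing step from the order ID. Because both the gateway
// and the workflow call this one function, the "_processing" rule
// exists in exactly one place.
func ProcessingWorkflowID(orderID string) string {
	return orderID + "_processing"
}

// OrderIDFromProcessing recovers the order ID from a child workflow
// ID, which the gateway can use when handling polling requests.
func OrderIDFromProcessing(workflowID string) string {
	return strings.TrimSuffix(workflowID, "_processing")
}

func main() {
	id := ProcessingWorkflowID("ORD-42")
	fmt.Println(id, OrderIDFromProcessing(id)) // prints "ORD-42_processing ORD-42"
}
```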

If all you want is to notify workflow about the next step without getting any information back, then an update is not needed and a signal is fine.
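The difference between the two semantics can be illustrated with plain channels. This is a stdlib analogy of signal-vs-update, not the Temporal SDK API: a signal is fire-and-forget (the sender only learns it was delivered), while an update carries a reply channel, so the caller blocks until the workflow answers.

```go
package main

import "fmt"

// updateRequest models an update: the request travels together with a
// reply channel the "workflow" must answer on.
type updateRequest struct {
	step  string
	reply chan string
}

// runWorkflow consumes one signal (no way to answer the sender) and
// one update (replying to the caller), then exits.
func runWorkflow(signals <-chan string, updates <-chan updateRequest, log chan<- string) {
	s := <-signals
	log <- "signal received: " + s // the sender has already moved on
	u := <-updates
	u.reply <- "started " + u.step // the sender is blocked on this reply
}

func main() {
	signals := make(chan string, 1)
	updates := make(chan updateRequest)
	log := make(chan string, 1)
	go runWorkflow(signals, updates, log)

	signals <- "reserve" // returns immediately: only delivery is confirmed
	fmt.Println(<-log)

	req := updateRequest{step: "reserve", reply: make(chan string)}
	updates <- req
	fmt.Println(<-req.reply) // blocks until the workflow answers
}
```

This mirrors the advice above: if the frontend only needs “the request was delivered,” the fire-and-forget shape is enough; if it needs a validated result back, the request/reply shape is what an update provides.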

Thank you so much! Could you provide more information about my questions and the steps I’ve outlined, please? Are they correct?

I’m not sure why you need a child workflow. I think you can just have a single workflow with WorkflowID equal to OrderID.

I want the booking order and other actions to be executed within the main long-running workflow, and I want the child workflows to be linked with the parent one. If I simply launch these as independent workflows, they will have no connection to the long-running one and will probably have to communicate with it via signals if the result needs to be passed along.
