At my company we typically use Temporal workflows to make sure that a set of steps involving multiple services eventually finishes. I'll give an example to make it clear.
Say we activate an internet subscription for a customer. This typically involves steps like these:
Send a "we received your order" email => email service
Do the network work to actually activate internet for the customer => provisioning service
Send an "your internet is activated" email => email service again
Now back to the problem. Typically this involves creating a subscription in our database and then using the workflow to move it from created to activated by carrying out the steps above. The problem for me is that it's not clear what a good design is here.
These are the ways we do it:
Persist a subscription in the database; once it's persisted, start the workflow to handle the activation flow. (The issue here is that if, for example, Temporal is down, we end up with a created subscription but no workflow running.)
Start a workflow, wait for the subscription to be created, then spawn a child workflow that we orphan (i.e. don't wait for) and return the persisted subscription to the caller. (This works, but it becomes hard to write tests for.)
Is there an obvious way to do something like this?
I read some suggestions that you can use the Temporal workflow as the actual data source, but I don't like this since it limits search and similar features to what Temporal can do. I want our database to be the source of truth and only use Temporal to ensure that complex logic eventually finishes.
This seems like a typical use case to me, but I suspect you have some unspecified requirement for async completion, perhaps? Ignoring that for a second, why not just embrace Temporal fully in a simple way and have one workflow that runs start to finish: creating the record in the db, sending the email, invoking the provisioning API, and sending the final email? Temporal is really good at this type of situation.
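To make the single-workflow idea concrete, here is a minimal plain-Python model of it. It deliberately does not use the Temporal SDK: the step functions (`create_subscription_record`, `send_email`, `provision_internet`) and the in-memory `db` are hypothetical stand-ins. In Temporal each step would be an activity, and the retrying done by `run_step` here would come from the activity retry policy instead.

```python
from typing import Callable, TypeVar

T = TypeVar("T")

# Hypothetical in-memory stand-ins for the database and the email service.
db: dict[str, str] = {}
sent_emails: list[str] = []
_provision_calls = 0


def run_step(step: Callable[[], T], max_attempts: int = 5) -> T:
    """Retry a step until it succeeds, standing in for an activity retry policy."""
    for attempt in range(1, max_attempts + 1):
        try:
            return step()
        except ConnectionError:
            if attempt == max_attempts:
                raise
    raise AssertionError("max_attempts must be at least 1")


def create_subscription_record(sub_id: str) -> None:
    db.setdefault(sub_id, "created")  # idempotent, so it is safe to retry


def send_email(kind: str, sub_id: str) -> None:
    sent_emails.append(f"{kind}:{sub_id}")


def provision_internet(sub_id: str) -> None:
    global _provision_calls
    _provision_calls += 1
    if _provision_calls < 3:  # simulate two transient failures
        raise ConnectionError("provisioning service unavailable")
    db[sub_id] = "activated"


def activate_subscription(sub_id: str) -> str:
    """One flow from start to finish, with the db write as its first step."""
    run_step(lambda: create_subscription_record(sub_id))
    run_step(lambda: send_email("order_received", sub_id))
    run_step(lambda: provision_internet(sub_id))
    run_step(lambda: send_email("activated", sub_id))
    return db[sub_id]
```

The point of the sketch is the ordering: because the database write is the first step inside the flow, there is no window in which a subscription row exists without a flow responsible for activating it.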
The db could be having intermittent issues as well, in which case you may not have a record of a subscription at all. Temporal SDKs retry API calls by default, which you can configure, and they also allow your clients to detect failures reaching the Temporal service. In case of continuous failure to start an execution, you can handle it in your client code and log the subscription info so you can backfill it later when your services are back up.
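A rough sketch of that client-side fallback, in plain Python. `start_workflow` is a hypothetical callable wrapping whatever client call you use to start the execution, and the `BACKFILL_NEEDED` log marker and the use of `ConnectionError` are assumptions for illustration, not anything the SDK defines.

```python
import json
import logging
import time
from typing import Callable, Optional

logger = logging.getLogger("subscriptions")


def start_with_backfill(
    start_workflow: Callable[[dict], str],
    subscription: dict,
    attempts: int = 3,
    delay_seconds: float = 0.5,
) -> Optional[str]:
    """Try to start the workflow; on repeated failure, log the payload for backfill.

    Returns the workflow id, or None if every attempt failed.
    """
    for attempt in range(1, attempts + 1):
        try:
            return start_workflow(subscription)
        except ConnectionError as exc:
            logger.warning("start attempt %d/%d failed: %s", attempt, attempts, exc)
            if attempt < attempts:
                time.sleep(delay_seconds * attempt)  # simple linear backoff
    # Every attempt failed: record enough to replay the request later.
    logger.error("BACKFILL_NEEDED %s", json.dumps(subscription, sort_keys=True))
    return None
```

A backfill job could then scan the logs for that marker and start the missing executions once the services are reachable again.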
From your client, start the workflow asynchronously (don’t wait for workflow completion in the client).
Then, in the client, either
a) poll the workflow with a query to return whether the subscription has been created yet; or,
b) synchronously call an Update on the workflow, which in this case wouldn’t execute anything in the workflow, but unlike a Query an Update can block; so you would have the Update handler block until the subscription had been created. Then the client’s call to the Update would return when the subscription had been created.
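The difference between (a) and (b) can be modeled in plain asyncio, without the Temporal SDK. `ActivationFlow` and its method names are hypothetical stand-ins for the workflow, its Query handler, and its Update handler. In the Python SDK, for instance, the blocking part would typically be a `workflow.wait_condition` call inside an update handler rather than an `asyncio.Event`.

```python
import asyncio
from typing import Optional


class ActivationFlow:
    """Plain-asyncio stand-in for the workflow; not the Temporal SDK."""

    def __init__(self) -> None:
        self.subscription_id: Optional[str] = None
        self._created = asyncio.Event()

    async def run(self, customer_id: str) -> None:
        await asyncio.sleep(0.01)  # pretend to write the row to the database
        self.subscription_id = f"sub-{customer_id}"
        self._created.set()
        await asyncio.sleep(0.01)  # remaining steps: provisioning, emails, ...

    def query_subscription_id(self) -> Optional[str]:
        # Option (a): a query returns immediately, so the client has to poll.
        return self.subscription_id

    async def wait_for_subscription(self) -> str:
        # Option (b): an update-style handler may block until the state exists.
        await self._created.wait()
        assert self.subscription_id is not None
        return self.subscription_id


async def create_subscription(customer_id: str) -> str:
    flow = ActivationFlow()
    task = asyncio.create_task(flow.run(customer_id))  # start, do not wait for completion
    sub_id = await flow.wait_for_subscription()  # returns once the row exists
    # A real API handler would return here and leave the flow running;
    # this demo awaits the task so the event loop can shut down cleanly.
    await task
    return sub_id
```

Calling `asyncio.run(create_subscription("42"))` returns as soon as the subscription id is available, while the rest of the flow carries on independently of the caller.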
I actually have a branch that basically does this now. It creates the record in the db, then starts a child workflow which is orphaned, and then returns the created record to the caller, so the API call doesn't take very long. But I'm not sure it's good design.