Hello,
Temporal seems awesome for processing async background jobs/workflows.
But we’re also evaluating Temporal for a synchronous REST endpoint use case.
To simplify the example, we have a REST endpoint that:
1. persists data in a database
1.a. if step 1 is successful: we put events on a queue to be processed by external services
1.b. if step 1 is unsuccessful: we halt processing, reject the request and throw the relevant errors. The user can decide whether to retry transient issues.
We’re thinking of using Temporal to ensure resiliency (via retries) and eventual consistency between the persist op and the queuing. However, we require the db persist op to be strongly consistent, i.e. the persist op must complete before we return the response to the user, whereas queuing the event can be done “later/eventually”.
Without Temporal: the approach is to persist the data to the db (in one transaction) and have a background job that polls for new changes, queues the corresponding events, and marks them as processed.
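To make the non-Temporal baseline concrete, here is a minimal sketch of it using Python's built-in `sqlite3`. The `orders` and `outbox` tables, the function names, and the `publish` callback are all hypothetical stand-ins, and it assumes the "new changes" are recorded as rows written in the same transaction as the data:

```python
import sqlite3

def init_db(conn):
    # Hypothetical schema: one table for the data, one for pending events.
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, payload TEXT NOT NULL)")
    conn.execute(
        "CREATE TABLE outbox ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "order_id INTEGER NOT NULL, "
        "processed INTEGER NOT NULL DEFAULT 0)"
    )

def handle_request(conn, order_id, payload):
    """REST handler body: write the data and its pending event in one transaction."""
    with conn:  # commits on success, rolls back if either insert raises
        conn.execute("INSERT INTO orders (id, payload) VALUES (?, ?)", (order_id, payload))
        conn.execute("INSERT INTO outbox (order_id) VALUES (?)", (order_id,))

def poll_outbox(conn, publish):
    """Background job: publish unprocessed events, then mark them processed."""
    rows = conn.execute(
        "SELECT id, order_id FROM outbox WHERE processed = 0 ORDER BY id"
    ).fetchall()
    for event_id, order_id in rows:
        publish(order_id)  # if this raises, the row stays unprocessed for the next poll
        with conn:
            conn.execute("UPDATE outbox SET processed = 1 WHERE id = ?", (event_id,))
    return len(rows)
```

Because the publish and the "mark processed" update are separate steps, an event can be published more than once if the update fails, so consumers would need to tolerate duplicates.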
With Temporal: we thought of two approaches:
Approach 1:
We thought of keeping the database persist logic outside the Temporal workflow. We first ensure that the data is persisted to the db. If the persist op is successful, we then kick off a workflow to queue up the event.
Drawback:
This seems to suffer from a similar problem as the initial case. Starting a workflow requires a network hop to the Temporal server. This operation can fail after we have persisted the data to the db, and as a result the event won’t be queued.
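The failure window in Approach 1 can be sketched in a few lines of plain Python. Nothing here is Temporal SDK code: `start_workflow`, `flaky_start`, and `WorkflowStartError` are hypothetical stand-ins for the network call to the Temporal server, and a list stands in for the database:

```python
class WorkflowStartError(Exception):
    """Stands in for a failed network call to the Temporal server."""

def handle_request(db, start_workflow, record):
    db.append(record)        # step 1: the persist has already committed
    start_workflow(record)   # step 2: separate network call that may fail after step 1
    return "201 Created"

def flaky_start(record):
    # Simulates the server being unreachable at the worst moment.
    raise WorkflowStartError("could not reach server")

db = []
orphaned = []
try:
    handle_request(db, flaky_start, {"id": 1})
except WorkflowStartError:
    # The record is stored, but no workflow exists to queue its event.
    orphaned = list(db)
```

After the exception, `orphaned` holds the stored record for which no event will ever be queued unless something else reconciles it.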
Approach 2:
Introducing a workflow with two activities:
a. activity1 performs persisting data to db
b. activity2 performs the queuing.
We can use the synchronous/blocking start mechanism to process the workflow within the REST endpoint processing.
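The behaviour we're after in Approach 2 (respond as soon as the persist step finishes, let the queuing step continue afterwards) can be illustrated with an in-process analogy. This is plain Python threading, not Temporal SDK code; `run_workflow`, `persist`, `enqueue`, and the status codes are hypothetical:

```python
import threading
from concurrent.futures import Future, TimeoutError as FutureTimeout

def run_workflow(persist, enqueue, record, persisted):
    """Stand-in for the workflow: step 1 resolves `persisted`, step 2 runs afterwards."""
    try:
        persist(record)
    except Exception as exc:
        persisted.set_exception(exc)  # persist failed: surface it and skip queuing
        return
    persisted.set_result(True)
    enqueue(record)  # keeps running even if the caller has already responded

def handle_request(persist, enqueue, record, timeout=5.0):
    persisted = Future()
    worker = threading.Thread(
        target=run_workflow, args=(persist, enqueue, record, persisted)
    )
    worker.start()
    try:
        persisted.result(timeout=timeout)  # block only on the persist step
    except FutureTimeout:
        # The case we're worried about: we cannot tell whether the persist happened.
        return 504, "persist outcome unknown", worker
    except Exception as exc:
        return 500, f"persist failed: {exc}", worker
    return 201, "created", worker
```

The timeout branch is exactly the ambiguous case: the handler has to answer without knowing whether the write went through.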
Questions
- What’s the typical latency for a workflow execution?
- Is there a way to wait only for a specific activity (i.e. the db persist activity) to succeed within the REST processing? We’re thinking we could read the persist activity’s status, but wouldn’t that also require a network hop, which can itself fail and leave us unable to tell whether the db op succeeded by the time we send the REST response? In essence, we wouldn’t know whether to send a success or a failure response.
I feel like some of our challenges stem from requiring a strongly consistent db persist op within a synchronous REST endpoint, along with the ability to handle potential poison messages (i.e. operations that will never succeed), all in real time.
I am not sure if there’s a recommended pattern for our seemingly simple use case.