Temporal for synchronous API

Hi,

We are evaluating Temporal to create a synchronous API. The API needs to orchestrate 30 downstream services each with a latency of approximately 200 ms. We need to achieve an SLA of 10 seconds. Can we achieve this using Temporal given that the downstream service executions alone will take 6 seconds?

When some node goes down, how much time does Temporal take to detect it and resume the execution in a different node?

Can we use Azure CosmosDB as the persistence layer since it has a Cassandra API?

Do we have any page where we can see the upcoming features of Temporal?

Regards,
Amlan

Welcome to the Temporal community!

Can we achieve this using Temporal given that the downstream service executions alone will take 6 seconds?

I believe it is possible to meet your latency requirements. If you can parallelize some calls to the downstream systems then it speeds things up even more. What is the expected system behavior if one of those downstream services is down?
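As a rough illustration of the arithmetic above (a back-of-the-envelope sketch, not a measurement), the helper below estimates wall-clock time when the 30 calls run in waves of a given size. The function name and the `parallelism` parameter are hypothetical, and the estimate ignores any per-call overhead added by Temporal itself, which this thread does not quantify.

```python
import math


def total_latency_ms(num_calls: int, per_call_ms: float, parallelism: int = 1) -> float:
    """Estimate wall-clock time when calls run in waves of `parallelism`.

    Assumes every call takes exactly `per_call_ms` and that calls within
    a wave have no dependencies on each other (an assumption, not a given).
    """
    waves = math.ceil(num_calls / parallelism)
    return waves * per_call_ms


# Fully sequential: 30 calls x 200 ms = 6000 ms, as in the original question.
sequential_ms = total_latency_ms(30, 200)

# Five calls at a time: 6 waves x 200 ms = 1200 ms.
batched_ms = total_latency_ms(30, 200, parallelism=5)

print(sequential_ms, batched_ms)
```

How much of this gain is available depends on which downstream calls actually depend on each other's results.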

When some node goes down, how much time does Temporal take to detect it and resume the execution in a different node?

There are a couple of timeouts that default to 5 and 10 seconds and are triggered in different failure scenarios. In your case, I would tune them down to meet your perf requirements.

Can we use Azure CosmosDB as the persistence layer since it has a Cassandra API?

In theory, yes. But someone would have to write the persistence binding component for CosmosDB.

Do we have any page where we can see the upcoming features of Temporal?

I don’t think we have such a page. We post major updates to Announcements - Temporal. You can also join our Slack where we announce individual releases.

Is there any specific feature you are interested in?

Thanks, Maxim. I had a few follow up questions.

We are trying to create an HTTP API to do content ingestion. The SLA we are aiming at is 10 seconds. This may not qualify as a long-running workflow. Still, we are thinking of using a stateful orchestration because approximately 20 microservices participate in the ingestion process and we want them to be consistent even when some process fails during the ingestion.

Are there production deployments that use Temporal/Cadence for this kind of requirement, or is Temporal recommended for long-running workflows that take minutes or hours to complete?

Are there any metrics or performance numbers that give an idea of how much overhead Temporal adds to maintain state?

If I understood it correctly, it takes 5 or 10 seconds to detect a node failure depending on the failure type, but these numbers are configurable. How low can we go with these numbers given that our SLA is 10 seconds? What kind of problems can we face if they are low (like detecting false positives)?

There are production deployments that target such latencies. But it is a pretty tight SLA, mostly due to the sheer number of API calls you are making. For example, are you going to retry any failed calls? If you do, you can easily blow your 10 second budget on just a few retries.
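To make the retry concern concrete, here is a minimal sketch of the budget arithmetic. It assumes the calls run sequentially, that each failed attempt costs one full call latency plus a fixed backoff, and that the backoff is 1 second; the backoff value and the function name are illustrative assumptions, not settings taken from this thread.

```python
def remaining_budget_ms(
    sla_ms: float,
    num_calls: int,
    per_call_ms: float,
    failed_attempts: int,
    backoff_ms: float,
) -> float:
    """Return the SLA budget left after the base calls and the retries.

    Each failed attempt is assumed to cost `per_call_ms` (the wasted call)
    plus `backoff_ms` (the wait before the retry). A negative result means
    the SLA has been exceeded.
    """
    base_ms = num_calls * per_call_ms
    retry_cost_ms = failed_attempts * (per_call_ms + backoff_ms)
    return sla_ms - base_ms - retry_cost_ms


# 30 sequential calls at 200 ms leave 4000 ms of a 10 s SLA.
# With an assumed 1000 ms backoff, each failure costs 1200 ms.
for failures in range(5):
    print(failures, remaining_budget_ms(10_000, 30, 200, failures, 1_000))
```

Under these assumptions three failed attempts leave 400 ms and a fourth exceeds the SLA. Failures that surface as timeouts rather than fast errors would cost more than one call latency each, so the real margin may be smaller.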

I propose a Zoom meeting to go over your use case in more detail. It is hard to give a generic recommendation on whether Temporal is a good fit without more data. PM me to set it up.