Lowest achievable latency

I’m evaluating whether Temporal is suitable for our application, and I’m interested in its performance, especially the added latency. From other posts, such as Low latency - Stateless Workflow, I understood that I should expect tens of milliseconds between steps. Is this a correct assumption?
In a basic workflow (a single activity that only logs), running on a local dev setup with the Go SDK, I’m seeing an end-start=200ms. Is that the expected order of magnitude, or should I investigate my setup?

Is there any documentation on this topic beyond Developer's guide - Worker performance | Temporal Documentation? Any recommendations for the production setup (is Elasticsearch recommended)?


I’m seeing an end-start=200ms

How are you measuring these latencies? SDK metrics (workflow_endtoend_latency metric maybe)?


Do you by chance run load tests with maru?

I think latencies will depend heavily on your cluster setup, persistence and network latencies, your worker setup, and your workflow/activity code (your use case and how you implement it), so there are many moving parts.

Typically you would run a load test, and then we can look at the SDK and server metrics alongside your pod CPU and memory utilization (on both your workers and service hosts) and try to make improvements.

Any recommendations for the production setup (is Elasticsearch recommended)?

With server version 1.20.0 you can also set up advanced visibility with SQL persistence. For large-scale deployments, using ES might still be preferable to SQL persistence, depending on your DB size.

Also take a look at some optimizations you can make on the application side to reduce the E2E latency: Temporal Community Meetup February, 2023 - YouTube