Worker ID Uniqueness

In our infrastructure, PIDs and hostnames are effectively static across workers: by default every worker process ends up with the same PID and reports the same hostname. To make the workers show up properly in the Temporal UI and to attribute which worker handled a task, we had to override how the various SDKs derive the worker-id. With that override in place, the correct number of workers shows up and task attribution works as expected.
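
For reference, here is a minimal sketch of that kind of override using the Go SDK (other SDKs expose an equivalent identity option under a different name; the `POD_NAME` env var and the task queue name are placeholders, not our actual setup):

```go
package main

import (
	"fmt"
	"log"
	"os"

	"go.temporal.io/sdk/client"
	"go.temporal.io/sdk/worker"
)

func main() {
	// Derive an identity that is unique per worker instead of the SDK default
	// (roughly pid@hostname), which collides when every process has the same
	// PID and hostname. POD_NAME is an assumed env var (e.g. Kubernetes
	// downward API); any stable, unique token would do.
	identity := fmt.Sprintf("%s-%d", os.Getenv("POD_NAME"), os.Getpid())

	c, err := client.Dial(client.Options{
		HostPort: client.DefaultHostPort,
		Identity: identity, // what the UI and poller lists will report
	})
	if err != nil {
		log.Fatalln("unable to create Temporal client:", err)
	}
	defer c.Close()

	// "example-task-queue" is a placeholder; workflows/activities would be
	// registered on w before running it.
	w := worker.New(c, "example-task-queue", worker.Options{})
	if err := w.Run(worker.InterruptCh()); err != nil {
		log.Fatalln("worker exited:", err)
	}
}
```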

I haven’t dug into the Temporal cluster’s task matching code, but I’m curious what type of problems arise when worker-id is not unique across workers. We recently saw a lot of workflows time out right after the initial WorkflowTaskScheduled event. By a back-of-the-envelope calculation there should have been enough worker threads for the number of running executions, so they shouldn’t have timed out; they timed out before WorkflowTaskStarted, which tells me the task poller never received the workflow task. My intuition is that the cluster gets confused when multiple workers use the same worker-id: instead of n workers, it might think only 1/n of that number are available. But given that workers poll, I would also have expected the cluster’s task queue to keep dispatching workflow tasks as long as pollers were ready to take a task off the queue.
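
One way I could imagine sanity-checking that hypothesis (just a sketch, not something I’ve verified against the matching internals) is to ask the cluster which pollers it currently sees on the task queue and what identity each one reported. If identical identities collapsed into a single poller entry, that would line up with the “fewer workers than there really are” theory. The task queue name below is a placeholder:

```go
package main

import (
	"context"
	"fmt"
	"log"

	enumspb "go.temporal.io/api/enums/v1"
	"go.temporal.io/sdk/client"
)

func main() {
	c, err := client.Dial(client.Options{})
	if err != nil {
		log.Fatalln("unable to create Temporal client:", err)
	}
	defer c.Close()

	// List the pollers the cluster has recently seen on the workflow task
	// queue, including the identity each poll request carried.
	resp, err := c.DescribeTaskQueue(context.Background(),
		"example-task-queue", enumspb.TASK_QUEUE_TYPE_WORKFLOW)
	if err != nil {
		log.Fatalln("DescribeTaskQueue failed:", err)
	}
	for _, p := range resp.Pollers {
		fmt.Printf("identity=%s lastAccess=%v\n", p.Identity, p.LastAccessTime)
	}
}
```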

I’m interested in the internals to better understand what other negative impacts there might be beyond throughput. In particular, could there be wasted effort or a negative impact on the persistence layer? We did see a large load on the DB, but it’s difficult to determine whether that was caused by the worker-id non-uniqueness.

I believe worker-id is used only for visibility purposes and doesn’t affect how tasks are delivered, so overriding it doesn’t affect performance.