I want to use Temporal workers as runners for my multi-tenant platform

aravindarc · May 23, 2022, 12:00pm

I have a multi-tenant system in which each tenant has multiple runners, and I need to run tasks on those runners. I am using Temporal for this, and I am creating one task queue per tenant.

But how do I authenticate the workers with the Temporal server? Can I expose the server publicly and let workers connect directly using some kind of authentication? The authentication has to be tenant-specific: one key pair per runner, where a key can be revoked.

tihomir · May 23, 2022, 4:01pm

How many tenants do you expect to have?
Temporal provides pluggable Authorizer and ClaimMapper components with defaults based on JWT.
See more info in this post. With this you could set up specific rules based on user roles and namespaces you provide to your tenants.
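
To make the claim-mapping idea concrete, here is a minimal Python sketch. It is not Temporal's implementation (the server-side Authorizer and ClaimMapper are Go plugins); it only illustrates turning token permissions into per-namespace roles. It assumes the token carries a `permissions` list of `<namespace>:<role>` strings with roles `read`, `write`, `worker`, and `admin`. The function names and the `tenant-a` / `tenant-b` namespaces are hypothetical.

```python
from typing import Dict, Iterable, Set

# Assumed role names; check them against your server's ClaimMapper.
ROLES = {"read", "write", "worker", "admin"}


def map_permissions(permissions: Iterable[str]) -> Dict[str, Set[str]]:
    """Turn '<namespace>:<role>' strings into {namespace: {roles}}."""
    claims: Dict[str, Set[str]] = {}
    for entry in permissions:
        # Split on the last ':' so the role is always the final segment.
        namespace, sep, role = entry.rpartition(":")
        role = role.lower()
        if not sep or not namespace or role not in ROLES:
            continue  # ignore malformed or unknown entries
        claims.setdefault(namespace, set()).add(role)
    return claims


def is_allowed(claims: Dict[str, Set[str]], namespace: str, required_role: str) -> bool:
    """Allow the call if the caller holds the role (or admin) on that namespace."""
    roles = claims.get(namespace, set())
    return required_role in roles or "admin" in roles


# Hypothetical token payload for a runner that belongs to tenant-a.
claims = map_permissions(["tenant-a:worker", "tenant-a:read"])
print(is_allowed(claims, "tenant-a", "worker"))  # this runner may poll tenant-a
print(is_allowed(claims, "tenant-b", "worker"))  # but not tenant-b
```

With one namespace per tenant, a runner's token would only need permissions on its own tenant's namespace.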

aravindarc · May 24, 2022, 9:16am

Thanks for the reply. I am expecting around 50 tenants at most. I went through the post; if I am going to use the Authorizer and ClaimMapper:

(1) Will I have to build my own Docker image with the custom components?
(2) How will internal services, like my own workers, access the Temporal server? Can they be exempted from the authentication and authorization logic?
(3) Can I rewrite the Authorizer and ClaimMapper to use mechanisms other than JWT?
(4) Is there an upper limit to the number of workers? Is there any overhead in using multiple namespaces?

tihomir · May 24, 2022, 11:25pm

(1) You mean if you define a custom Authorizer and ClaimMapper? If so, then yes, you would need to build your own image.

(2) The ClaimMapper can translate caller identity (from TLS and/or an auth token) into claims.

(3) Yes, TLS; see here.

(4) For namespaces, the Temporal server does not enforce a maximum, but the total number you can run will depend on your cluster capacity.
For workers, you typically want to start with a single worker process and saturate it. See the worker tuning guide.

There are a couple of key metrics you can use to tell whether you need to increase your worker capacity:

a) Sync match rate (server metrics)
Useful Prometheus query:
sum(rate(poll_success_sync{}[1m])) / sum(rate(poll_success{}[1m]))

poll_success_sync measures only sync matched tasks.
poll_success measures how many total tasks are delivered.
Ideally you want sync match rate to be above 99%. If it’s too low it can mean that your workers are unable to keep up and you should consider increasing your worker capacity.

b) task_schedule_to_start_latency (server metric), temporal_workflow_task_schedule_to_start_latency (SDK metric)

Measures the latency between when a task is scheduled and when it is delivered. If this latency is high, it's a strong indication that you should increase your worker capacity (add more workers).

c) asyncmatch_latency

Measures the latency of async matched tasks, from when the task is created to when it's delivered, including the time the task sits in the task queue. Large latencies can indicate that your workers are unable to pick up tasks fast enough and that you should increase worker capacity.
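
As a small illustration of the sync match rate from (a), the sketch below computes the same ratio as the Prometheus query from two already-sampled rates and applies the 99% guideline. It is only a sketch: the function names and the sample numbers are hypothetical, and in practice you would alert on the query itself.

```python
from typing import Optional


def sync_match_rate(poll_success_sync_rate: float,
                    poll_success_rate: float) -> Optional[float]:
    """Fraction of delivered tasks that were sync matched.

    Both arguments are per-second rates, e.g. the values of
    rate(poll_success_sync[1m]) and rate(poll_success[1m]).
    Returns None when no tasks were delivered in the window.
    """
    if poll_success_rate <= 0:
        return None
    return poll_success_sync_rate / poll_success_rate


def needs_more_workers(rate: Optional[float], threshold: float = 0.99) -> bool:
    """True when the sync match rate is known and below the threshold."""
    return rate is not None and rate < threshold


# Hypothetical sample: 90 of 100 tasks per second were sync matched.
rate = sync_match_rate(90.0, 100.0)
print(rate, needs_more_workers(rate))
```

A low ratio on its own is only a hint; check it together with the schedule-to-start latency from (b) before adding workers.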

aravindarc · May 25, 2022, 9:12am

Thank you, @tihomir, for the quick reply. It really helped.

Igor_Polynets1 · August 1, 2024, 3:25am

Is this feature available on Temporal Cloud?
