RESOURCE_EXHAUSTED: namespace count limit exceeded


We are getting the error “RESOURCE_EXHAUSTED: namespace count limit exceeded” in our staging environment but not in production. maxConcurrentWorkflowTaskPollers and maxConcurrentActivityTaskPollers are both set to 80, frontend.namespaceCount is set to 20000, and the frontend has 5 instances.

However, when checking the number of polls (from the service_pending_requests metric) based on this post, we find that service_pending_requests is around 300, which is far lower than the 20000 we set.

Can someone please explain what causes “too many concurrent polls for workflow/activity tasks” in this situation? Thanks

frontend.namespaceCount is per frontend host. Do you have it set on each of the 5? (I assume you have an ingress / LB in front of them; it may be worth checking whether it is routing all calls to a single host.)

Also look at the resource exhausted server metric:
sum(rate(service_errors_resource_exhausted{}[1m])) by (resource_exhausted_cause)
and check the causes.

In WorkerOptions, do you set MaxConcurrentWorkflowTaskPollers and MaxConcurrentActivityTaskPollers? How many workers do you have?
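As a rough sanity check for the questions above, here is a minimal sketch of the arithmetic. It assumes each poller holds at most one outstanding long-poll request, that all workers poll the same namespace, and that the load balancer either spreads requests evenly or sends everything to one frontend host. The worker count (10) and the fallback limit (1200) are hypothetical placeholders, not values from this thread; substitute your own numbers and verify the default limit for your server version.

```python
def polls_on_busiest_frontend(workers, wf_pollers, act_pollers,
                              frontends, all_on_one_host=False):
    """Estimate concurrent poll requests hitting the busiest frontend host."""
    # Each worker contributes one outstanding poll per configured poller (assumption).
    total_polls = workers * (wf_pollers + act_pollers)
    if all_on_one_host:
        # Worst case: the ingress / LB routes every call to a single frontend.
        return total_polls
    # Even spread across frontends, rounded up.
    return -(-total_polls // frontends)


def exceeds_limit(polls_on_host, namespace_count_limit):
    """frontend.namespaceCount is enforced per frontend host, not cluster-wide."""
    return polls_on_host > namespace_count_limit


# Values from the thread: 80 + 80 pollers per worker, 5 frontends, limit 20000.
# Hypothetical: 10 workers, and a 1200 limit on a host where the override is missing.
even = polls_on_busiest_frontend(10, 80, 80, frontends=5)
skewed = polls_on_busiest_frontend(10, 80, 80, frontends=5, all_on_one_host=True)

print("even spread, per host:", even)
print("all on one host:", skewed)
print("exceeds 20000:", exceeds_limit(skewed, 20000))
print("exceeds assumed fallback of 1200:", exceeds_limit(skewed, 1200))
```

If the estimate stays well below 20000 even in the skewed case, that points toward the setting not being applied on every frontend host rather than toward too many pollers.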