Rate limit exceeded when replica count > 1

rvr · January 22, 2024, 2:49pm

Hello,

I am deploying Temporal to Kubernetes (with the Istio service mesh) using the Helm charts, and I ran into a strange problem. At first, all services deploy properly and seem to communicate with each other over both the membership ports and the regular gRPC ports. But when I launch a test workflow to check that everything is running properly, the services become very “chatty”, and I often get a “rate limit exceeded” error from the history service when I try to view workflow details in the UI. I’m not running any load tests; a single workflow instance seems to cause this. Also, I haven’t changed the default RPS settings.

After reading this, I decided to try reducing the replica count for all services from 2 to 1. After this change, everything went back to normal: the workflows started to execute properly, and I can view the details without any problem.

So the question is, what am I doing wrong? I haven’t touched settings like broadcastAddress, so by default it uses the pod IP. I also use the default port numbers. Could it be a routing or load-balancing problem in my service definitions or Istio configuration that causes redundant loops in the network traffic?

I’m using server version 1.22.3.

Thank you!
Wojtek
