Issue with service_pending_requests metric from the Frontend service in a K8S deployment

Hi,

We have a K8S deployment with Frontend - 3 pods, 4 cores / 4 GB each; History - 5 pods, 8 cores / 8 GB each; Matching - 3 pods, 4 cores / 4 GB each. When we scaled the frontend down and then back up to 3 pods, we observed that the service_pending_requests metric was being emitted from only a single pod, even though all the pods were in the Running state. We believe this was degrading our response times during load testing, since the same load performed well before the frontend scale-down.
Later, we scaled the frontend back down to 0 and scaled it up gradually, one pod at a time, and this time the metric was emitted from all pods.
We need help; we are unable to pinpoint the reason behind this behavior.

Also, is there any recommendation on how to verify the readiness of all the Temporal services during deployment?

Do you have ingress/load balancing for your frontend pods? Maybe this old post is still relevant?
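Not from that post, just an illustration of what load balancing across frontend pods can look like: because gRPC connections are long-lived, a common pattern is to expose the frontend through a headless Service so clients can resolve every frontend pod and spread connections across them instead of pinning to one. The names, labels, and port below are assumptions, not taken from your deployment:

```yaml
# Illustrative only: a headless Service so gRPC clients can resolve
# every frontend pod IP and balance connections across them.
# Name, labels, and port are assumptions; adjust to your setup.
apiVersion: v1
kind: Service
metadata:
  name: temporal-frontend-headless   # hypothetical name
spec:
  clusterIP: None                    # headless: DNS returns all pod IPs
  selector:
    app.kubernetes.io/name: temporal
    app.kubernetes.io/component: frontend
  ports:
    - name: grpc-rpc
      port: 7233
      targetPort: 7233
      protocol: TCP
```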

Is there any recommendation on how to verify the readiness of all the Temporal services during deployment?

What version of k8s are you using? I believe since 1.24 you no longer need grpc-health-probe to health check each service; Kubernetes supports gRPC directly for liveness/readiness probes (see here for more info).
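If it helps, here is a minimal sketch of what native gRPC readiness/liveness probes on the frontend container might look like. The port 7233 and the health-check service name are assumptions for a default frontend setup (the name grpc-health-probe is typically pointed at), so adjust them to your deployment:

```yaml
# Illustrative sketch only: goes under the frontend container spec
# (spec.template.spec.containers[]). Port and service name are
# assumptions for a default frontend; adjust to your values.
readinessProbe:
  grpc:
    port: 7233
    service: temporal.api.workflowservice.v1.WorkflowService
  initialDelaySeconds: 10
  periodSeconds: 10
livenessProbe:
  grpc:
    port: 7233
    service: temporal.api.workflowservice.v1.WorkflowService
  initialDelaySeconds: 30
  periodSeconds: 30
```

With probes like these in place you can gate the rollout on readiness (for example with kubectl rollout status) rather than only checking that pods are Running.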

Thanks @tihomir. I will refer to the links above and check.