So, I think I have finally figured out the EKS ingress annotations needed to make the ALB ingress’s default rule forward all requests to the target group, instead of returning a 404 for every request that doesn’t match a host/path rule. This change finally allows IP-based gRPC requests to actually make it to the ingress/kube layer.
Now that I’m sure all requests are making it to the kube layer, I am pretty sure that Temporal is using IPs (without an Authority header) when connecting to the rpcAddress configured in the clusterInformation, even when a DNS name is specified. Using IPs instead of hostnames works fine for cluster-local resolution: the rpcAddress is usually the service name, which ultimately resolves to the cluster-local service IP, and traffic routes correctly because it does not rely on the hostname at all.
However, when a Kubernetes ingress is involved (i.e. a cluster in east1 connecting to a cluster in east2), connecting to the other cluster by IP without an Authority header means the ingress rules in the kube layer do not route the request to the service. The request cannot be matched to any of the hosts specified in the ingress rule, and an ingress rule host cannot be an IP address.
I believe I have replicated what the pods are seeing (i/o timeout) in a separate pod I’ve deployed with the grpcurl CLI tool, which I am using to test gRPC connectivity to the other cluster’s ingress. What I’ve found is that when I try to connect to the ingress using just the IP address:
grpcurl --insecure x.x.23.144:443 list
I get a connection timeout:
Failed to dial target host "x.x.23.144:443": context deadline exceeded
However, if I provide an Authority header when connecting using the IP, I get the same successful response as I would get when I use the hostname:
## From an east1 cluster pod
bash-5.1$ grpcurl --insecure -authority myfrontend.temporal-eks-east2.example.com x.x.23.144:443 list
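To make the comparison repeatable, the two calls above could be wrapped in a small shell helper. This is only a sketch: the function name `probe` and the 5-second connect timeout are my own choices for illustration, not anything Temporal or the ingress requires. It uses grpcurl's `-connect-timeout` and `-authority` flags; as far as I understand from grpcurl's help text, when TLS is in use the `-authority` value is also used as the TLS server name, which may matter for what the ALB sees.

```shell
# Hypothetical helper (not part of any tool): probe a gRPC endpoint by IP,
# with or without an explicit authority.
#
# probe TARGET [AUTHORITY]
#   TARGET     host:port or ip:port to dial
#   AUTHORITY  optional; sent as the HTTP/2 :authority value so that
#              host-based ingress rules have something to match against
probe() {
  target="$1"
  authority="${2:-}"
  if [ -n "$authority" ]; then
    # Dial the IP but present the hostname the ingress rule expects.
    grpcurl -insecure -connect-timeout 5 -authority "$authority" "$target" list
  else
    # Dial the IP with no authority override (the failing case above).
    grpcurl -insecure -connect-timeout 5 "$target" list
  fi
}

# Usage, from an east1 cluster pod:
#   probe x.x.23.144:443
#   probe x.x.23.144:443 myfrontend.temporal-eks-east2.example.com
```

Running both forms back to back from the same pod should show whether the only difference between the timeout and the successful listing is the authority value.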
Still not sure what the resolution is though.