Hello,
We’ve been trying to set up a Temporal cluster in AWS using ECS/Fargate. It has been a major struggle, to say the least, though we’ve come a long way.
Currently, we’ve got most of the containers (admin, UI, frontend, history, matching) running and passing health checks. Most of them sit behind an ALB that terminates TLS and uses gRPC health checks. The worker is the exception, since it doesn’t have a TCP listener at all. We baked nginx into the worker container to act as a dummy server so it could pass health checks, but those health checks are still failing.
Even though the containers are running and logging that they’re connected and okay, I can’t access the cluster via the UI or tctl. Both return the same error:
Object { statusCode: 503, statusText: "Service Unavailable", response: Response, message: 'last connection error: connection error: desc = "error reading server preface: http2: frame too large"', report: false }
The UI container’s own logs don’t show anything more detailed than that.
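For context, my current understanding of the error (an assumption on my part, not something I have confirmed in our setup) is that "frame too large" while reading the server preface usually means the client received bytes that were not HTTP/2 at all, for example a plaintext HTTP/1.1 response or a TLS record, and interpreted the first three bytes as a frame length. Here is a minimal Python sketch of that parsing. The sample byte strings and function names are made up for illustration and are not captured from our cluster:

```python
# Sketch: why non-HTTP/2 bytes can look like an oversized HTTP/2 frame.
# An HTTP/2 frame header is 9 bytes: 3-byte length, 1-byte type,
# 1-byte flags, 4-byte stream id (top bit reserved).

HTTP2_DEFAULT_MAX_FRAME_SIZE = 16384  # 2**14, default SETTINGS_MAX_FRAME_SIZE


def parse_frame_header(first_bytes: bytes):
    """Interpret the first 9 bytes of a response as an HTTP/2 frame header."""
    if len(first_bytes) < 9:
        raise ValueError("need at least 9 bytes for a frame header")
    length = int.from_bytes(first_bytes[0:3], "big")
    frame_type = first_bytes[3]
    flags = first_bytes[4]
    stream_id = int.from_bytes(first_bytes[5:9], "big") & 0x7FFFFFFF
    return length, frame_type, flags, stream_id


def looks_too_large(first_bytes: bytes) -> bool:
    """True if the claimed frame length exceeds the default maximum."""
    length, _, _, _ = parse_frame_header(first_bytes)
    return length > HTTP2_DEFAULT_MAX_FRAME_SIZE


# Hypothetical sample responses (illustrative only).
SAMPLES = {
    "plaintext HTTP/1.1 reply": b"HTTP/1.1 400 Bad Request\r\n",
    "TLS alert record": b"\x15\x03\x03\x00\x02\x02\x46\x00\x00",
    "valid empty SETTINGS frame": b"\x00\x00\x00\x04\x00\x00\x00\x00\x00",
}

if __name__ == "__main__":
    for name, data in SAMPLES.items():
        length, frame_type, _, _ = parse_frame_header(data)
        print(f"{name}: length={length} type={frame_type} "
              f"too_large={looks_too_large(data)}")
```

If that reading is right, the error would point to a protocol mismatch somewhere on the path (TLS versus plaintext, or HTTP/1.1 versus HTTP/2) rather than to a genuinely large payload, but I haven’t been able to confirm where.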
From what I’ve been able to find, many people had to put the frontend container behind an NLB instead of an ALB. I’ve made that switch, but I’m still getting this error and a 503. I’ve been at this for months. Can someone share their setup or some pointers? Does everything need to be behind an NLB?