Temporal worker not able to connect to internal frontend

Temporal server version v1.25.0. We are trying to set up internal-frontend and are running into an issue where the worker service isn't able to connect to internal-frontend after the PublicClient is removed. Has anyone experienced the same issue? Communication works fine if we don't start the internal-frontend service and use PublicClient to connect to Frontend. We are running on Kubernetes with each service in its own pod (frontend, internal-frontend, history, matching, worker).

We keep getting this error -
{"level":"warn","ts":"2025-02-26T23:10:34.802Z","msg":"error creating sdk client","service":"worker","error":"failed reaching server: received context error while waiting for new LB policy update: context deadline exceeded","logging-call-at":"/jenkins/pkg/mod/go.temporal.io/server@v1.25.0/common/sdk/factory.go:123"}
{"level":"error","ts":"2025-02-26T23:10:38.664Z","msg":"start failed","component":"fx","error":"context deadline exceeded","logging-call-at":"/jenkins/pkg/mod/go.temporal.io/server@v1.25.0/temporal/fx.go:1130","stacktrace":"go.temporal.io/server/common/log.(*zapLogger).Error\n\t/jenkins/pkg/mod/go.temporal.io/server@v1.25.0/common/log/zap_logger.go:155\ngo.temporal.io/server/temporal.(*fxLogAdapter).LogEvent\n\t/jenkins/pkg/mod/go.temporal.io/server@v1.25.0/temporal/fx.go:1130\ngo.uber.org/fx.(*App).Start.func1\n\t/jenkins/pkg/mod/go.uber.org/fx@v1.22.0/app.go:664\ngo.uber.org/fx.(*App).Start\n\t/jenkins/pkg/mod/go.uber.org/fx@v1.22.0/app.go:672\ngo.temporal.io/server/temporal.(*ServerImpl).startServices\n\t/jenkins/pkg/mod/go.temporal.io/server@v1.25.0/temporal/server_impl.go:155\ngo.temporal.io/server/temporal.(*ServerImpl).Start\n\t/jenkins/pkg/mod/go.temporal.io/server@v1.25.0/temporal/server_impl.go:121\ngo.uber.org/fx/internal/lifecycle.(*Lifecycle).runStartHook\n\t/jenkins/pkg/mod/go.uber.org/fx@v1.22.0/internal/lifecycle/lifecycle.go:256\ngo.uber.org/fx/internal/lifecycle.(*Lifecycle).Start\n\t/jenkins/pkg/mod/go.uber.org/fx@v1.22.0/internal/lifecycle/lifecycle.go:216\ngo.uber.org/fx.(*App).start-fm.(*App).start.func1\n\t/jenkins/pkg/mod/go.uber.org/fx@v1.22.0/app.go:704\ngo.uber.org/fx.(*App).withRollback\n\t/jenkins/pkg/mod/go.uber.org/fx@v1.22.0/app.go:686\ngo.uber.org/fx.(*App).start\n\t/jenkins/pkg/mod/go.uber.org/fx@v1.22.0/app.go:703\ngo.uber.org/fx.withTimeout.func1\n\t/jenkins/pkg/mod/go.uber.org/fx@v1.22.0/app.go:802"}

Can you check the static config on the worker pod (/etc/temporal/config/docker.yaml) and see whether it has an internal-frontend section defined? See here.

That is correct, we have it in the worker config:

internal-frontend:
  rpc:
    grpcPort: 7236
    membershipPort: 6936
    BindOnIP: 0.0.0.0

Adding more worker logs, which do suggest that internal-frontend is one of the "Current reachable members":
{"level":"info","ts":"2025-02-27T15:23:06.442Z","msg":"Current reachable members","component":"service-resolver","service":"worker","addresses":["100.127.51.62:7239"],"logging-call-at":"/jenkins/pkg/mod/go.temporal.io/server@v1.25.0/common/membership/ringpop/service_resolver.go:331"}
{"level":"info","ts":"2025-02-27T15:23:06.442Z","msg":"Current reachable members","component":"service-resolver","service":"frontend","addresses":["100.127.63.53:7233"],"logging-call-at":"/jenkins/pkg/mod/go.temporal.io/server@v1.25.0/common/membership/ringpop/service_resolver.go:331"}
{"level":"info","ts":"2025-02-27T15:23:06.442Z","msg":"Current reachable members","component":"service-resolver","service":"internal-frontend","addresses":["100.127.11.157:7236"],"logging-call-at":"/jenkins/pkg/mod/go.temporal.io/server@v1.25.0/common/membership/ringpop/service_resolver.go:331"}
{"level":"info","ts":"2025-02-27T15:23:06.442Z","msg":"Current reachable members","component":"service-resolver","service":"matching","addresses":["100.127.67.228:7235"],"logging-call-at":"/jenkins/pkg/mod/go.temporal.io/server@v1.25.0/common/membership/ringpop/service_resolver.go:331"}
{"level":"warn","ts":"2025-02-27T15:23:11.450Z","msg":"error creating sdk client","service":"worker","error":"failed reaching server: received context error while waiting for new LB policy update: context deadline exceeded","logging-call-at":"/jenkins/pkg/mod/go.temporal.io/server@v1.25.0/common/sdk/factory.go:123"}
{"level":"warn","ts":"2025-02-27T15:23:16.615Z","msg":"error creating sdk client","service":"worker","error":"failed reaching server: received context error while waiting for new LB policy update: context deadline exceeded","logging-call-at":"/jenkins/pkg/mod/go.temporal.io/server@v1.25.0/common/sdk/factory.go:123"}

Turns out we were missing two things:

  1. Expose the internal-frontend ports.
  2. Define these ports in the egress configuration for the namespace.

It's working now. Thanks!
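
For anyone who lands here with the same error, here is a rough sketch of what those two fixes might look like as Kubernetes manifests. This is not the poster's actual configuration. The namespace (temporal), the resource names, and the app.kubernetes.io/component label are all assumptions made for illustration. Only the port numbers (7236 for gRPC, 6936 for membership) come from the config posted above. It also assumes egress is controlled with standard NetworkPolicy objects, which may not match your cluster's CNI or policy tooling.

```yaml
# Hypothetical Service exposing the internal-frontend ports.
# Name, namespace, and selector label are assumptions; adjust to your deployment.
apiVersion: v1
kind: Service
metadata:
  name: temporal-internal-frontend
  namespace: temporal
spec:
  clusterIP: None            # headless: Temporal services discover each other by pod IP
  selector:
    app.kubernetes.io/component: internal-frontend
  ports:
    - name: grpc
      port: 7236             # grpcPort from the worker config above
      targetPort: 7236
    - name: membership
      port: 6936             # membershipPort from the worker config above
      targetPort: 6936
---
# Hypothetical NetworkPolicy allowing egress from pods in the namespace
# to the internal-frontend pods on both ports.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-egress-to-internal-frontend
  namespace: temporal
spec:
  podSelector: {}            # applies to every pod in the namespace
  policyTypes:
    - Egress
  egress:
    - to:
        - podSelector:
            matchLabels:
              app.kubernetes.io/component: internal-frontend
      ports:
        - protocol: TCP
          port: 7236
        - protocol: TCP
          port: 6936
```

NetworkPolicy rules are additive, so a policy like this would only add an allowance on top of whatever egress rules the namespace already has. This would also be consistent with the logs above: the worker could see internal-frontend in the membership list, yet the SDK client still timed out, which is what you might expect if the gRPC port is blocked at the network layer.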