Temporal services deadline exceeded

Hello, I am trying to start the Temporal services and am getting this error:


2023/05/29 19:02:53 Loading config; env=development,zone=,configDir=config
2023/05/29 19:02:53 Loading config files=[config/development.yaml]
{"level":"info","ts":"2023-05-29T19:02:53.869Z","msg":"Build info.","git-time":"2023-05-15T23:50:55.000Z","git-revision":"45d22540323e59e4cd3fd62139b73409f1264fb3","git-modified":true,"go-arch":"amd64","go-os":"linux","go-version":"go1.20.4","cgo-enabled":false,"server-version":"1.20.3","debug-mode":false,"logging-call-at":"main.go:143"}
{"level":"info","ts":"2023-05-29T19:02:53.869Z","msg":"Dynamic config client is not configured. Using noop client.","logging-call-at":"main.go:163"}
{"level":"warn","ts":"2023-05-29T19:02:53.869Z","msg":"Not using any authorizer and flag `--allow-no-auth` not detected. Future versions will require using the flag `--allow-no-auth` if you do not want to set an authorizer.","logging-call-at":"main.go:173"}
{"level":"info","ts":"2023-05-29T19:02:53.908Z","msg":"Use rpc address 127.0.0.1:7233 for cluster active.","component":"metadata-initializer","logging-call-at":"fx.go:852"}
{"level":"info","ts":"2023-05-29T19:02:53.932Z","msg":"Created gRPC listener","service":"history","address":"0.0.0.0:7234","logging-call-at":"rpc.go:152"}
{"level":"info","ts":"2023-05-29T19:02:53.932Z","msg":"Service is not requested, skipping initialization.","service":"matching","logging-call-at":"fx.go:402"}
{"level":"info","ts":"2023-05-29T19:02:53.932Z","msg":"Service is not requested, skipping initialization.","service":"frontend","logging-call-at":"fx.go:473"}
{"level":"info","ts":"2023-05-29T19:02:53.932Z","msg":"Service is not requested, skipping initialization.","service":"internal-frontend","logging-call-at":"fx.go:473"}
{"level":"info","ts":"2023-05-29T19:02:53.932Z","msg":"Service is not requested, skipping initialization.","service":"worker","logging-call-at":"fx.go:552"}
{"level":"info","ts":"2023-05-29T19:02:53.932Z","msg":"PProf not started due to port not set","logging-call-at":"pprof.go:67"}
{"level":"info","ts":"2023-05-29T19:02:53.932Z","msg":"Starting server for services","value":{"history":{}},"logging-call-at":"server_impl.go:97"}
{"level":"info","ts":"2023-05-29T19:02:53.950Z","msg":"Membership heartbeat upserted successfully","service":"history","address":"10.10.80.4","port":6934,"hostId":"6a67edd8-fe53-11ed-a15c-4a138b77ecd3","logging-call-at":"rpMonitor.go:255"}
{"level":"info","ts":"2023-05-29T19:02:53.951Z","msg":"bootstrap hosts fetched","service":"history","bootstrap-hostports":"127.0.0.1:6939,10.10.80.4:6934,127.0.0.1:6933","logging-call-at":"rpMonitor.go:297"}

{"level":"error","ts":"2023-05-29T19:03:28.977Z","msg":"unable to bootstrap ringpop. retrying","service":"history","error":"join duration of 35.025555441s exceeded max 30s","logging-call-at":"ringpop.go:109","stacktrace":"go.temporal.io/server/common/log.(*zapLogger).Error\n\t/home/builder/temporal/common/log/zap_logger.go:150\ngo.temporal.io/server/common/membership.(*RingPop).bootstrap.func1\n\t/home/builder/temporal/common/membership/ringpop.go:109\ngo.temporal.io/server/common/backoff.ThrottleRetry.func1\n\t/home/builder/temporal/common/backoff/retry.go:175\ngo.temporal.io/server/common/backoff.ThrottleRetryContext\n\t/home/builder/temporal/common/backoff/retry.go:199\ngo.temporal.io/server/common/backoff.ThrottleRetry\n\t/home/builder/temporal/common/backoff/retry.go:176\ngo.temporal.io/server/common/membership.(*RingPop).bootstrap\n\t/home/builder/temporal/common/membership/ringpop.go:114\ngo.temporal.io/server/common/membership.(*RingPop).Start\n\t/home/builder/temporal/common/membership/ringpop.go:84\ngo.temporal.io/server/common/membership.(*ringpopMonitor).Start\n\t/home/builder/temporal/common/membership/rpMonitor.go:138\ngo.temporal.io/server/common/resource.MembershipMonitorProvider.func1\n\t/home/builder/temporal/common/resource/fx.go:287\ngo.uber.org/fx/internal/lifecycle.(*Lifecycle).runStartHook\n\t/go/pkg/mod/go.uber.org/fx@v1.18.2/internal/lifecycle/lifecycle.go:130\ngo.uber.org/fx/internal/lifecycle.(*Lifecycle).Start\n\t/go/pkg/mod/go.uber.org/fx@v1.18.2/internal/lifecycle/lifecycle.go:95\ngo.uber.org/fx.(*App).start\n\t/go/pkg/mod/go.uber.org/fx@v1.18.2/app.go:679\ngo.uber.org/fx.withTimeout.func1\n\t/go/pkg/mod/go.uber.org/fx@v1.18.2/app.go:784"}


{"level":"info","ts":"2023-05-29T19:03:37.654Z","msg":"bootstrap hosts fetched","service":"history","bootstrap-hostports":"127.0.0.1:6939,10.10.80.4:6934,127.0.0.1:6933","logging-call-at":"rpMonitor.go:297"}

{"level":"error","ts":"2023-05-29T19:03:53.948Z","msg":"start failed","component":"fx","error":"OnStart hook added by go.temporal.io/server/common/resource.MembershipMonitorProvider failed: context deadline exceeded","logging-call-at":"fx.go:1141","stacktrace":"go.temporal.io/server/common/log.(*zapLogger).Error\n\t/home/builder/temporal/common/log/zap_logger.go:150\ngo.temporal.io/server/temporal.(*fxLogAdapter).LogEvent\n\t/home/builder/temporal/temporal/fx.go:1141\ngo.uber.org/fx.(*App).Start.func1\n\t/go/pkg/mod/go.uber.org/fx@v1.18.2/app.go:662\ngo.uber.org/fx.(*App).Start\n\t/go/pkg/mod/go.uber.org/fx@v1.18.2/app.go:670\ngo.temporal.io/server/temporal.(*ServerImpl).startServices.func1\n\t/home/builder/temporal/temporal/server_impl.go:158"}
{"level":"error","ts":"2023-05-29T19:03:53.948Z","msg":"OnStart hook failed","component":"fx","callee":"go.temporal.io/server/temporal.ServerLifetimeHooks.func1()","caller":"go.temporal.io/server/temporal.ServerLifetimeHooks","error":"failed to start service history: OnStart hook added by go.temporal.io/server/common/resource.MembershipMonitorProvider failed: context deadline exceeded","logging-call-at":"fx.go:1043","stacktrace":"go.temporal.io/server/common/log.(*zapLogger).Error\n\t/home/builder/temporal/common/log/zap_logger.go:150\ngo.temporal.io/server/temporal.(*fxLogAdapter).LogEvent\n\t/home/builder/temporal/temporal/fx.go:1043\ngo.uber.org/fx.appLogger.LogEvent\n\t/go/pkg/mod/go.uber.org/fx@v1.18.2/app.go:837\ngo.uber.org/fx/internal/lifecycle.(*Lifecycle).runStartHook.func1\n\t/go/pkg/mod/go.uber.org/fx@v1.18.2/internal/lifecycle/lifecycle.go:121\ngo.uber.org/fx/internal/lifecycle.(*Lifecycle).runStartHook\n\t/go/pkg/mod/go.uber.org/fx@v1.18.2/internal/lifecycle/lifecycle.go:131\ngo.uber.org/fx/internal/lifecycle.(*Lifecycle).Start\n\t/go/pkg/mod/go.uber.org/fx@v1.18.2/internal/lifecycle/lifecycle.go:95\ngo.uber.org/fx.(*App).start\n\t/go/pkg/mod/go.uber.org/fx@v1.18.2/app.go:679\ngo.uber.org/fx.withTimeout.func1\n\t/go/pkg/mod/go.uber.org/fx@v1.18.2/app.go:784"}
{"level":"error","ts":"2023-05-29T19:03:53.948Z","msg":"start failed, rolling back","component":"fx","error":"failed to start service history: OnStart hook added by go.temporal.io/server/common/resource.MembershipMonitorProvider failed: context deadline exceeded","logging-call-at":"fx.go:1134","stacktrace":"go.temporal.io/server/common/log.(*zapLogger).Error\n\t/home/builder/temporal/common/log/zap_logger.go:150\ngo.temporal.io/server/temporal.(*fxLogAdapter).LogEvent\n\t/home/builder/temporal/temporal/fx.go:1134\ngo.uber.org/fx.(*App).start\n\t/go/pkg/mod/go.uber.org/fx@v1.18.2/app.go:681\ngo.uber.org/fx.withTimeout.func1\n\t/go/pkg/mod/go.uber.org/fx@v1.18.2/app.go:784"}
{"level":"error","ts":"2023-05-29T19:03:53.948Z","msg":"start failed","component":"fx","error":"failed to start service history: OnStart hook added by go.temporal.io/server/common/resource.MembershipMonitorProvider failed: context deadline exceeded","logging-call-at":"fx.go:1141","stacktrace":"go.temporal.io/server/common/log.(*zapLogger).Error\n\t/home/builder/temporal/common/log/zap_logger.go:150\ngo.temporal.io/server/temporal.(*fxLogAdapter).LogEvent\n\t/home/builder/temporal/temporal/fx.go:1141\ngo.uber.org/fx.(*App).Start.func1\n\t/go/pkg/mod/go.uber.org/fx@v1.18.2/app.go:662\ngo.uber.org/fx.(*App).Start\n\t/go/pkg/mod/go.uber.org/fx@v1.18.2/app.go:670\ngo.temporal.io/server/temporal.ServerFx.Start\n\t/home/builder/temporal/temporal/fx.go:280\nmain.buildCLI.func2\n\t/home/builder/temporal/cmd/server/main.go:199\ngithub.com/urfave/cli/v2.(*Command).Run\n\t/go/pkg/mod/github.com/urfave/cli/v2@v2.4.0/command.go:163\ngithub.com/urfave/cli/v2.(*App).RunContext\n\t/go/pkg/mod/github.com/urfave/cli/v2@v2.4.0/app.go:313\ngithub.com/urfave/cli/v2.(*App).Run\n\t/go/pkg/mod/github.com/urfave/cli/v2@v2.4.0/app.go:224\nmain.main\n\t/home/builder/temporal/cmd/server/main.go:54\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:250"}
Unable to start server. Error: failed to start service history: OnStart hook added by go.temporal.io/server/common/resource.MembershipMonitorProvider failed: context deadline exceeded

Here is the configuration that I am using:

global:
  membership:
    maxJoinDuration: 30s
    broadcastAddress: __POD_IP__


clusterMetadata:
  enableGlobalNamespace: false
  failoverVersionIncrement: 11
  masterClusterName: "active"
  currentClusterName: "active"
  clusterInformation:
    active:
      enabled: true
      initialFailoverVersion: 2
      rpcAddress: "127.0.0.1:7233"

services:
  history:
    rpc:
      grpcPort: 7234
      membershipPort: 6934
      bindOnIP: "0.0.0.0"

  matching:
    rpc:
      grpcPort: 7235
      membershipPort: 6935
      bindOnIP: "0.0.0.0"

  frontend:
    rpc:
      grpcPort: 7233
      membershipPort: 6933
      bindOnIP: "0.0.0.0"

  worker:
    rpc:
      grpcPort: 7239
      membershipPort: 6939
      bindOnIP: "0.0.0.0"

I tried changing the Istio permission mode from restrictive to permissive to see whether that was the cause, but it did not fix the problem.

broadcastAddress: __POD_IP__

Is this IP reachable by the other hosts in the same cluster?

I was able to telnet from one pod to another.

"unable to bootstrap ringpop. retrying","service":"frontend","error":"join duration of 33.577760748s exceeded max 30s"

Typically I have seen this when TEMPORAL_BROADCAST_ADDRESS is misconfigured, but I would also check the membership ports. It seems that ringpop cannot reach all members.
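
As a rough way to check the membership ports, a small script like the one below could be run from inside each pod. This is only a sketch in Python, not a Temporal tool: the function names `can_connect` and `check_hostports` are made up for illustration, and the input is assumed to be a comma-separated host:port list like the `bootstrap-hostports` value in the logs above. A successful TCP connect only shows that the port is open from that pod; it does not prove that the ringpop join itself will succeed.

```python
import socket
from typing import Dict


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable networks.
        return False


def check_hostports(hostports: str, timeout: float = 3.0) -> Dict[str, bool]:
    """Try each 'host:port' entry of a comma-separated list.

    Returns a mapping from the entry to whether the TCP connect succeeded.
    """
    results = {}
    for entry in hostports.split(","):
        entry = entry.strip()
        # Split on the last colon so the port is always the final field.
        host, _, port = entry.rpartition(":")
        results[entry] = can_connect(host, int(port), timeout)
    return results
```

For example, calling `check_hostports("127.0.0.1:6939,10.10.80.4:6934,127.0.0.1:6933")` from the history pod tries the three entries from the log. Keep in mind that 127.0.0.1 always refers to the pod the script runs in, so those entries can only succeed if that service is listening in the same pod.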

I would also look at this forum post, specifically the last comment, in case that helps.

What should the value of TEMPORAL_BROADCAST_ADDRESS be?

It is the IP address that all other hosts in the cluster must be able to reach (it is used for communication between them). You should specify it if it is different from your BIND_ON_IP address (for example, if you set BIND_ON_IP to 0.0.0.0).
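
To illustrate that rule, here is a hypothetical Python helper (the names are mine, not part of Temporal) that flags addresses other hosts cannot use to reach a pod: the wildcard 0.0.0.0 and loopback addresses such as 127.0.0.1. Applied to a `bootstrap-hostports` value, loopback entries may indicate hosts that registered without a reachable broadcast address, though that is an inference and worth confirming against each service's config. The sketch assumes literal IP addresses, not hostnames.

```python
import ipaddress
from typing import List


def is_usable_broadcast_address(host: str) -> bool:
    """Return False for wildcard (0.0.0.0, ::) and loopback addresses.

    Assumes `host` is a literal IP address; a hostname raises ValueError.
    """
    ip = ipaddress.ip_address(host)
    return not (ip.is_unspecified or ip.is_loopback)


def suspicious_hostports(hostports: str) -> List[str]:
    """Return the 'host:port' entries whose host other pods cannot reach."""
    flagged = []
    for entry in hostports.split(","):
        entry = entry.strip()
        # Split on the last colon; strip brackets from IPv6 literals.
        host, _, _port = entry.rpartition(":")
        if not is_usable_broadcast_address(host.strip("[]")):
            flagged.append(entry)
    return flagged
```

With the value from the log in this thread, `suspicious_hostports("127.0.0.1:6939,10.10.80.4:6934,127.0.0.1:6933")` returns the two 127.0.0.1 entries (ports 6939 and 6933, which are the worker and frontend membership ports in the config above).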