Error when enabling multi-cluster replication with TLS

Hi experts,

TL;DR
Is multi-cluster replication supported with TLS enabled on the frontend? (Temporal v1.19.1)

More Context
I am trying to test multi-cluster replication with TLS enabled, but I get the following error when I run the command tctl -address localhost:7233 admin cluster upsert-remote-cluster --frontend_address "localhost:8233":

{"level":"error","ts":"2023-04-21T10:54:01.669-0700","msg":"service failures","operation":"AddOrUpdateRemoteCluster","error":"last connection error: connection closed before server preface received","logging-call-at":"telemetry.go:280","stacktrace":"go.temporal.io/server/common/log.(*zapLogger).Error\n\t/home/runner/work/temporal/temporal/common/log/zap_logger.go:144\ngo.temporal.io/server/common/rpc/interceptor.(*TelemetryInterceptor).handleError\n\t/home/runner/work/temporal/temporal/common/rpc/interceptor/telemetry.go:280\ngo.temporal.io/server/common/rpc/interceptor.(*TelemetryInterceptor).Intercept\n\t/home/runner/work/temporal/temporal/common/rpc/interceptor/telemetry.go:151\ngoogle.golang.org/grpc.chainUnaryInterceptors.func1.1\n\t/home/runner/go/pkg/mod/google.golang.org/grpc@v1.50.1/server.go:1165\ngo.temporal.io/server/common/metrics.NewServerMetricsContextInjectorInterceptor.func1\n\t/home/runner/work/temporal/temporal/common/metrics/grpc.go:66\ngoogle.golang.org/grpc.chainUnaryInterceptors.func1.1\n\t/home/runner/go/pkg/mod/google.golang.org/grpc@v1.50.1/server.go:1165\ngo.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc.UnaryServerInterceptor.func1\n\t/home/runner/go/pkg/mod/go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc@v0.36.1/interceptor.go:352\ngoogle.golang.org/grpc.chainUnaryInterceptors.func1.1\n\t/home/runner/go/pkg/mod/google.golang.org/grpc@v1.50.1/server.go:1165\ngo.temporal.io/server/common/rpc/interceptor.(*NamespaceLogInterceptor).Intercept\n\t/home/runner/work/temporal/temporal/common/rpc/interceptor/namespace_logger.go:84\ngoogle.golang.org/grpc.chainUnaryInterceptors.func1.1\n\t/home/runner/go/pkg/mod/google.golang.org/grpc@v1.50.1/server.go:1165\ngo.temporal.io/server/common/rpc/interceptor.(*NamespaceValidatorInterceptor).LengthValidationIntercept\n\t/home/runner/work/temporal/temporal/common/rpc/interceptor/namespace_validator.go:103\ngoogle.golang.org/grpc.chainUnaryInterceptors.func1.1\n\t/home/runner/go/pkg/mod/google.golang.org/grpc@v1.50.1/server.go:1165\ngo.temporal.io/server/common/rpc.ServiceErrorInterceptor\n\t/home/runner/work/temporal/temporal/common/rpc/grpc.go:137\ngoogle.golang.org/grpc.chainUnaryInterceptors.func1.1\n\t/home/runner/go/pkg/mod/google.golang.org/grpc@v1.50.1/server.go:1165\ngoogle.golang.org/grpc.chainUnaryInterceptors.func1\n\t/home/runner/go/pkg/mod/google.golang.org/grpc@v1.50.1/server.go:1167\ngo.temporal.io/server/api/adminservice/v1._AdminService_AddOrUpdateRemoteCluster_Handler\n\t/home/runner/work/temporal/temporal/api/adminservice/v1/service.pb.go:930\ngoogle.golang.org/grpc.(*Server).processUnaryRPC\n\t/home/runner/go/pkg/mod/google.golang.org/grpc@v1.50.1/server.go:1340\ngoogle.golang.org/grpc.(*Server).handleStream\n\t/home/runner/go/pkg/mod/google.golang.org/grpc@v1.50.1/server.go:1713\ngoogle.golang.org/grpc.(*Server).serveStreams.func1.2\n\t/home/runner/go/pkg/mod/google.golang.org/grpc@v1.50.1/server.go:965"}

However, the error goes away when I disable the TLS config for the frontend service, as follows:

global:
  tls:
    internode:
      server:
        certFile: {{path to my cert}}
        keyFile: {{path to my key}}
        requireClientAuth: true
        clientCaFiles:
          - {{path to my ca files}}
      client:
        serverName: myhost
        rootCaFiles:
          - {{path to my ca files}}

    # frontend:
    #   server:
    #     certFile: {{path to my cert}}
    #     keyFile: {{path to my key}}
    #     requireClientAuth: true
    #     clientCaFiles:
    #       - {{path to my ca files}}
    #   client:
    #     serverName: myhost
    #     rootCaFiles:
    #       - {{path to my ca files}}

Thanks in advance!

connection closed before server preface received

Do any other logs jump out, by chance? A number of things could cause this. One of them is expired certs, so I would check that for sure.
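For anyone who wants to rule out expired certs quickly, here is a minimal sketch using the openssl CLI. It assumes openssl is installed and that the cert is PEM-encoded; the function name check_cert_expiry is only illustrative and is not part of tctl or Temporal.

```shell
#!/bin/sh
# Sketch: print a PEM certificate's expiry date and report whether it has expired.
# check_cert_expiry is a hypothetical helper name, not a Temporal or tctl command.
check_cert_expiry() {
  cert="$1"
  # Print the notAfter field of the certificate; fail early if it cannot be read.
  openssl x509 -in "$cert" -noout -enddate || return 2
  # -checkend 0 exits non-zero if the certificate is already past its notAfter date.
  if openssl x509 -in "$cert" -noout -checkend 0 >/dev/null; then
    echo "OK: certificate is still valid"
  else
    echo "EXPIRED: certificate is past its notAfter date"
    return 1
  fi
}
```

Calling check_cert_expiry with the path used for certFile (and again for each CA file) should be enough to confirm or exclude expiry as the cause.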

@tihomir Thanks for the pointers! There are some info lines just before this error line. I am not sure whether they give more insight:

{"level":"info","ts":"2023-04-21T10:53:58.731-0700","msg":"admin client encountered error","service":"frontend","error":"last connection error: connection closed before server preface received","service-error-type":"serviceerror.Unavailable","logging-call-at":"metric_client.go:87"}
{"level":"info","ts":"2023-04-21T10:53:58.921-0700","msg":"admin client encountered error","service":"frontend","error":"last connection error: connection closed before server preface received","service-error-type":"serviceerror.Unavailable","logging-call-at":"metric_client.go:87"}
{"level":"info","ts":"2023-04-21T10:53:59.270-0700","msg":"admin client encountered error","service":"frontend","error":"last connection error: connection closed before server preface received","service-error-type":"serviceerror.Unavailable","logging-call-at":"metric_client.go:87"}
{"level":"info","ts":"2023-04-21T10:53:59.921-0700","msg":"admin client encountered error","service":"frontend","error":"last connection error: connection closed before server preface received","service-error-type":"serviceerror.Unavailable","logging-call-at":"metric_client.go:87"}
{"level":"info","ts":"2023-04-21T10:54:00.762-0700","msg":"admin client encountered error","service":"frontend","error":"last connection error: connection closed before server preface received","service-error-type":"serviceerror.Unavailable","logging-call-at":"metric_client.go:87"}
{"level":"info","ts":"2023-04-21T10:54:01.669-0700","msg":"admin client encountered error","service":"frontend","error":"last connection error: connection closed before server preface received","service-error-type":"serviceerror.Unavailable","logging-call-at":"metric_client.go:87"}

Also, the certs should be fine. I tested them with other commands such as the following, and they all work well:

  • tctl cl h
  • tctl admin cluster describe

Hello! Were you ever able to figure this out? I have also been trying to get replication working with mTLS, but with no luck.