Error upgrading Temporal Server image from 1.17.4 to 1.18.0/1.18.1

Hi,

We are upgrading the Temporal Server image from 1.17.4 to 1.18.0/1.18.1. We use ES v7 and have executed the v3 version upgrade script. We are using the admin-tools 1.18.1 image. The pods are crashing with the error below. Could you please advise whether we are missing a step?

ERROR		Failed to start: could not build arguments for function "go.uber.org/fx".(*App).constructCustomLogger.func2 (/go/pkg/mod/go.uber.org/fx@v1.17.1/app.go:415): failed to build fxevent.Logger: could not build arguments for function "go.temporal.io/server/temporal".glob..func8 (/go/pkg/mod/go.temporal.io/server@v1.18.0/temporal/fx.go:916): failed to build log.Logger: received non-nil error from function "go.temporal.io/server/temporal".ServerOptionsProvider (/go/pkg/mod/go.temporal.io/server@v1.18.0/temporal/fx.go:163): **cassandra schema version compatibility check failed**: no connections were made when creating the session

This is the root cause:

no connections were made when creating the session

It seems like Temporal Server can’t connect to Cassandra. Is it up and running? Is the database configured properly in the server config?
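For anyone landing here with the same error, a quick way to rule out basic network problems is to check that the Cassandra port is reachable from inside the cluster. Below is a minimal Go sketch that only opens a TCP connection; it does not speak the CQL protocol or reproduce what gocql does during session creation. The function name `checkPort` and the default address `127.0.0.1:9042` are assumptions for illustration, so pass your own `host:port` as the first argument.

```go
package main

import (
	"fmt"
	"net"
	"os"
	"time"
)

// checkPort tries to open (and immediately close) a TCP connection to addr,
// which must be in "host:port" form. It returns nil when the port is reachable.
func checkPort(addr string, timeout time.Duration) error {
	conn, err := net.DialTimeout("tcp", addr, timeout)
	if err != nil {
		return fmt.Errorf("cannot reach %s: %w", addr, err)
	}
	return conn.Close()
}

func main() {
	// Assumed default; replace with the Cassandra service name and port from your config.
	addr := "127.0.0.1:9042"
	if len(os.Args) > 1 {
		addr = os.Args[1]
	}

	// 2s mirrors the connectTimeout commonly used in the server config.
	if err := checkPort(addr, 2*time.Second); err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println("TCP connection to", addr, "succeeded")
}
```

Since the Temporal pods are crashing, something like this would have to run from another pod in the same namespace. A successful TCP connection only rules out DNS and network issues; the driver can still fail to establish a session for other reasons.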

@Alex, we use a ConfigMap to pass all the connection settings, and it is the same ConfigMap we use for 1.17.4, where it works as expected.

The full error stack looks like this:

[Fx] PROVIDE	fx.DotGraph <= go.uber.org/fx.(*App).dotGraph-fm()
[Fx] ERROR		Failed to initialize custom logger: could not build arguments for function "go.uber.org/fx".(*App).constructCustomLogger.func2
	/go/pkg/mod/go.uber.org/fx@v1.18.2/app.go:414:
failed to build fxevent.Logger:
could not build arguments for function "go.temporal.io/server/temporal".glob..func8
	/go/pkg/mod/go.temporal.io/server@v1.18.1/temporal/fx.go:916:
failed to build log.Logger:
received non-nil error from function "go.temporal.io/server/temporal".ServerOptionsProvider
	/go/pkg/mod/go.temporal.io/server@v1.18.1/temporal/fx.go:163:
cassandra schema version compatibility check failed: no connections were made when creating the session
[Fx] ERROR		Failed to start: could not build arguments for function "go.uber.org/fx".(*App).constructCustomLogger.func2
	/go/pkg/mod/go.uber.org/fx@v1.18.2/app.go:414:
failed to build fxevent.Logger:
could not build arguments for function "go.temporal.io/server/temporal".glob..func8
	/go/pkg/mod/go.temporal.io/server@v1.18.1/temporal/fx.go:916:
failed to build log.Logger:
received non-nil error from function "go.temporal.io/server/temporal".ServerOptionsProvider
	/go/pkg/mod/go.temporal.io/server@v1.18.1/temporal/fx.go:163:
cassandra schema version compatibility check failed: no connections were made when creating the session
Unable to start server. Error: could not build arguments for function "go.uber.org/fx".(*App).constructCustomLogger.func2 (/go/pkg/mod/go.uber.org/fx@v1.18.2/app.go:414): failed to build fxevent.Logger: could not build arguments for function "go.temporal.io/server/temporal".glob..func8 (/go/pkg/mod/go.temporal.io/server@v1.18.1/temporal/fx.go:916): failed to build log.Logger: received non-nil error from function "go.temporal.io/server/temporal".ServerOptionsProvider (/go/pkg/mod/go.temporal.io/server@v1.18.1/temporal/fx.go:163): cassandra schema version compatibility check failed: no connections were made when creating the session

The only difference is the image version; once we change the version back to 1.17.4, it works.

Could you ssh into your docker container and see if the config is as expected?

The docker container isn’t starting. It is in error/crashing with the image 1.18.0, however the same deployment set-up works with image 1.17.4. We can see the config is as expected in the container with image 1.17.4.

The generated config looks legitimate (at least at first glance) and is similar to the one produced by 1.17.4.

local-local-01-primary-primary-service.temporal-state.svc.cluster.local is the real service name, for a number of reasons. Also, Cassandra runs within the k8s cluster.

persistence:
  defaultStore: default
  visibilityStore: visibility
  advancedVisibilityStore: es-visibility
  numHistoryShards: 512
  datastores:
    default:
      cassandra:
        hosts: "local-local-01-primary-primary-service.temporal-state.svc.cluster.local"
        port: 9042
        user: "cassandra-superuser"
        password: "some_random_password"
        connectTimeout: 2s
        consistency:
          default:
            consistency: local_quorum
            serialConsistency: local_serial
        datacenter: primary
        disableInitialHostLookup: true
        keyspace: temporal
        replicationFactor: 1
    visibility:
      cassandra:
        hosts: "local-local-01-primary-primary-service.temporal-state.svc.cluster.local"
        port: 9042
        user: "cassandra-superuser"
        password: "some_random_password"
        connectTimeout: 2s
        consistency:
          default:
            consistency: local_quorum
            serialConsistency: local_serial
        datacenter: primary
        disableInitialHostLookup: true
        keyspace: temporal_visibility
        replicationFactor: 1

OK, this is the point of failure: disableInitialHostLookup: true

When disableInitialHostLookup is set to true, 1.18.1 fails to connect, which was not the case with 1.17.4. Since we run Cassandra within the same cluster, setting it to false actually works better for us, because all nodes can be discovered; if we consumed a public endpoint instead, we might run into problems.

Not sure if this is how it is supposed to work, but let us know if you need us to provide more data or run tests, in case it helps fix the issue for someone else.

At the moment we can proceed with the upgrade, thanks for coming back.

The only related change in 1.18 is the upgrade to gocql v1.2.
Maybe this is caused by gocql/gocql pull request #1646, "Fallback to `system.peers` when `system.peers_v2` is not available" (by mpenick).

We will look into this and see if we need to upgrade to gocql 1.2.1.

We have already upgraded to gocql v1.2.1 on our master branch.

Hi @Yimin_Chen, we are facing the same issue, but we use MySQL. What would be the recommended fix? Thanks!

The Temporal Server version we are on is v1.18.0.

@Maggie, what is your error message (and call stack)?

We are seeing the same with Temporal v1.18.1 + MySQL. My steps are:

  1. Clone the samples-server repo (temporalio/samples-server on GitHub)
  2. Change the dependency to Temporal Server v1.18.1 and the config to use MySQL
  3. Run `make start-dependencies` in the temporal repo to start the dependencies
  4. Run `make install-schema-mysql` to bootstrap the MySQL schema
  5. Run `go run authorizer/server/main.go` to start the server

The error stack is as follows:

[Fx] ERROR		Failed to initialize custom logger: could not build arguments for function "go.uber.org/fx".(*App).constructCustomLogger.func2
	/Users/tmu/go/pkg/mod/go.uber.org/fx@v1.17.1/app.go:415:
failed to build fxevent.Logger:
could not build arguments for function "go.temporal.io/server/temporal".glob..func8
	/Users/tmu/go/pkg/mod/go.temporal.io/server@v1.18.1/temporal/fx.go:916:
failed to build log.Logger:
received non-nil error from function "go.temporal.io/server/temporal".ServerOptionsProvider
	/Users/tmu/go/pkg/mod/go.temporal.io/server@v1.18.1/temporal/fx.go:163:
sql schema version compatibility check failed: not supported plugin mysql, only supported: map[]
[Fx] ERROR		Failed to start: could not build arguments for function "go.uber.org/fx".(*App).constructCustomLogger.func2 (/Users/tmu/go/pkg/mod/go.uber.org/fx@v1.17.1/app.go:415): failed to build fxevent.Logger: could not build arguments for function "go.temporal.io/server/temporal".glob..func8 (/Users/tmu/go/pkg/mod/go.temporal.io/server@v1.18.1/temporal/fx.go:916): failed to build log.Logger: received non-nil error from function "go.temporal.io/server/temporal".ServerOptionsProvider (/Users/tmu/go/pkg/mod/go.temporal.io/server@v1.18.1/temporal/fx.go:163): sql schema version compatibility check failed: not supported plugin mysql, only supported: map[]
2022/10/19 11:34:03 could not build arguments for function "go.uber.org/fx".(*App).constructCustomLogger.func2 (/Users/tmu/go/pkg/mod/go.uber.org/fx@v1.17.1/app.go:415): failed to build fxevent.Logger: could not build arguments for function "go.temporal.io/server/temporal".glob..func8 (/Users/tmu/go/pkg/mod/go.temporal.io/server@v1.18.1/temporal/fx.go:916): failed to build log.Logger: received non-nil error from function "go.temporal.io/server/temporal".ServerOptionsProvider (/Users/tmu/go/pkg/mod/go.temporal.io/server@v1.18.1/temporal/fx.go:163): sql schema version compatibility check failed: not supported plugin mysql, only supported: map[]
exit status 1

Hey @Yimin_Chen, I also opened a thread in Slack (#support channel), and you are helping there as well (the problem is not fixed yet). For the record, the error logs I got are:

{"level":"warn","ts":"2022-10-19T23:44:17.647Z","msg":"Not using any authorizer and flag `--allow-no-auth` not detected. Future versions will require using the flag `--allow-no-auth` if you do not want to set an authorizer.","logging-call-at":"main.go:171"}
[mysql] 2022/10/19 23:44:17 packets.go:36: unexpected EOF
[mysql] 2022/10/19 23:44:17 packets.go:36: unexpected EOF
[mysql] 2022/10/19 23:44:17 packets.go:36: unexpected EOF
[Fx] PROVIDE	*pprof.PProfInitializerImpl <= go.temporal.io/server/common/pprof.NewInitializer()
[Fx] PROVIDE	*temporal.ServerImpl <= go.temporal.io/server/temporal.NewServerFxImpl()
[Fx] PROVIDE	temporal.Server <= go.temporal.io/server/temporal.glob..func9()
[Fx] SUPPLY	[]temporal.ServerOption
%!(EXTRA string=)[Fx] PROVIDE	*temporal.serverOptions <= go.temporal.io/server/temporal.ServerOptionsProvider()
[Fx] PROVIDE	chan interface {} <= go.temporal.io/server/temporal.ServerOptionsProvider()
[Fx] PROVIDE	*config.Config <= go.temporal.io/server/temporal.ServerOptionsProvider()
[Fx] PROVIDE	*config.PProf <= go.temporal.io/server/temporal.ServerOptionsProvider()
[Fx] PROVIDE	log.Config <= go.temporal.io/server/temporal.ServerOptionsProvider()
[Fx] PROVIDE	resource.ServiceNames <= go.temporal.io/server/temporal.ServerOptionsProvider()
[Fx] PROVIDE	resource.NamespaceLogger <= go.temporal.io/server/temporal.ServerOptionsProvider()
[Fx] PROVIDE	resolver.ServiceResolver <= go.temporal.io/server/temporal.ServerOptionsProvider()
[Fx] PROVIDE	client.AbstractDataStoreFactory <= go.temporal.io/server/temporal.ServerOptionsProvider()
[Fx] PROVIDE	searchattribute.Mapper <= go.temporal.io/server/temporal.ServerOptionsProvider()
[Fx] PROVIDE	[]grpc.UnaryServerInterceptor <= go.temporal.io/server/temporal.ServerOptionsProvider()
[Fx] PROVIDE	authorization.Authorizer <= go.temporal.io/server/temporal.ServerOptionsProvider()
[Fx] PROVIDE	authorization.ClaimMapper <= go.temporal.io/server/temporal.ServerOptionsProvider()
[Fx] PROVIDE	authorization.JWTAudienceMapper <= go.temporal.io/server/temporal.ServerOptionsProvider()
[Fx] PROVIDE	log.Logger <= go.temporal.io/server/temporal.ServerOptionsProvider()
[Fx] PROVIDE	client.FactoryProvider <= go.temporal.io/server/temporal.ServerOptionsProvider()
[Fx] PROVIDE	metrics.Client <= go.temporal.io/server/temporal.ServerOptionsProvider()
[Fx] PROVIDE	dynamicconfig.Client <= go.temporal.io/server/temporal.ServerOptionsProvider()
[Fx] PROVIDE	*dynamicconfig.Collection <= go.temporal.io/server/temporal.ServerOptionsProvider()
[Fx] PROVIDE	encryption.TLSConfigProvider <= go.temporal.io/server/temporal.ServerOptionsProvider()
[Fx] PROVIDE	*client.Config <= go.temporal.io/server/temporal.ServerOptionsProvider()
[Fx] PROVIDE	client.Client <= go.temporal.io/server/temporal.ServerOptionsProvider()
[Fx] PROVIDE	metrics.MetricsHandler <= go.temporal.io/server/temporal.ServerOptionsProvider()
[Fx] PROVIDE	[]trace.SpanExporter <= go.temporal.io/server/temporal.glob..func2()
[Fx] PROVIDE	client.FactoryProviderFn <= go.temporal.io/server/temporal.PersistenceFactoryProvider()
[Fx] PROVIDE	*temporal.ServicesMetadata[group = "services"] <= go.temporal.io/server/temporal.HistoryServiceProvider()
[Fx] PROVIDE	*temporal.ServicesMetadata[group = "services"] <= go.temporal.io/server/temporal.MatchingServiceProvider()
[Fx] PROVIDE	*temporal.ServicesMetadata[group = "services"] <= go.temporal.io/server/temporal.FrontendServiceProvider()
[Fx] PROVIDE	*temporal.ServicesMetadata[group = "services"] <= go.temporal.io/server/temporal.WorkerServiceProvider()
[Fx] PROVIDE	*cluster.Config <= go.temporal.io/server/temporal.ApplyClusterMetadataConfigProvider()
[Fx] PROVIDE	config.Persistence <= go.temporal.io/server/temporal.ApplyClusterMetadataConfigProvider()
[Fx] PROVIDE	fx.Lifecycle <= go.uber.org/fx.New.func1()
[Fx] PROVIDE	fx.Shutdowner <= go.uber.org/fx.(*App).shutdowner-fm()
[Fx] PROVIDE	fx.DotGraph <= go.uber.org/fx.(*App).dotGraph-fm()
[Fx] ERROR		Failed to initialize custom logger: could not build arguments for function "go.uber.org/fx".(*App).constructCustomLogger.func2
	/go/pkg/mod/go.uber.org/fx@v1.17.1/app.go:415:
failed to build fxevent.Logger:
could not build arguments for function "go.temporal.io/server/temporal".glob..func8
	/home/builder/temporal/temporal/fx.go:916:
failed to build log.Logger:
received non-nil error from function "go.temporal.io/server/temporal".ServerOptionsProvider
	/home/builder/temporal/temporal/fx.go:163:
sql schema version compatibility check failed: driver: bad connection
[Fx] ERROR		Failed to start: could not build arguments for function "go.uber.org/fx".(*App).constructCustomLogger.func2 (/go/pkg/mod/go.uber.org/fx@v1.17.1/app.go:415): failed to build fxevent.Logger: could not build arguments for function "go.temporal.io/server/temporal".glob..func8 (/home/builder/temporal/temporal/fx.go:916): failed to build log.Logger: received non-nil error from function "go.temporal.io/server/temporal".ServerOptionsProvider (/home/builder/temporal/temporal/fx.go:163): sql schema version compatibility check failed: driver: bad connection
Unable to start server. Error: could not build arguments for function "go.uber.org/fx".(*App).constructCustomLogger.func2 (/go/pkg/mod/go.uber.org/fx@v1.17.1/app.go:415): failed to build fxevent.Logger: could not build arguments for function "go.temporal.io/server/temporal".glob..func8 (/home/builder/temporal/temporal/fx.go:916): failed to build log.Logger: received non-nil error from function "go.temporal.io/server/temporal".ServerOptionsProvider (/home/builder/temporal/temporal/fx.go:163): sql schema version compatibility check failed: driver: bad connection

Is that the same issue as the one where the visibility DB got deleted?

Right. The issue is on our end and is not related to the main topic of this thread. Sorry for the distraction.

@tmu, you need to import the MySQL plugin into your main.go, like this.
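For readers who cannot follow the link, here is roughly why the import matters. The `only supported: map[]` part of the error suggests that the set of registered SQL plugins is empty, which is what you would see if no plugin package had been imported for its side effects. The sketch below is a simplified, self-contained model of that registration pattern and not Temporal's actual code; `supportedPlugins`, `registerPlugin`, and `lookupPlugin` are hypothetical names.

```go
package main

import "fmt"

// supportedPlugins is a hypothetical stand-in for a package-level plugin registry.
// It starts out empty: nothing is supported until some package registers itself.
var supportedPlugins = map[string]struct{}{}

// registerPlugin models what a plugin package's init() function would do
// when that package is imported (for example through a blank import).
func registerPlugin(name string) {
	supportedPlugins[name] = struct{}{}
}

// lookupPlugin returns an error when no package has registered the given name,
// formatted in the same spirit as the error in the log above.
func lookupPlugin(name string) error {
	if _, ok := supportedPlugins[name]; !ok {
		return fmt.Errorf("not supported plugin %v, only supported: %v", name, supportedPlugins)
	}
	return nil
}

func main() {
	// Without the plugin import, the registry is empty and the lookup fails.
	fmt.Println(lookupPlugin("mysql"))

	// This is the effect a blank import of the plugin package would have.
	registerPlugin("mysql")
	fmt.Println(lookupPlugin("mysql"))
}
```

In a real custom server binary, the registration happens through a blank import in main.go. To the best of my knowledge, the MySQL plugin package is `go.temporal.io/server/common/persistence/sql/sqlplugin/mysql`, imported as `_ "go.temporal.io/server/common/persistence/sql/sqlplugin/mysql"`, but please verify the path against the server version you depend on.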

Just tested the latest 1.18.3 with gocql 1.2.1, and it seems to be working both with and without the initial host lookup, so all is good now.

Thanks,
A