Temporal deployment to minikube using the helm-chart minimal install instructions isn't working

I’m trying to install Temporal on my local minikube instance and am following the instructions found here.
However, after running the helm command, the majority of the pods are stuck in the “CrashLoopBackOff” state.
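For reference, the command I ran was roughly the minimal Cassandra-only install from the chart README (I may be slightly off on the exact flags, so treat this as a sketch):

helm install \
    --set server.replicaCount=1 \
    --set cassandra.config.cluster_size=1 \
    --set elasticsearch.enabled=false \
    --set prometheus.enabled=false \
    --set grafana.enabled=false \
    temporaltest . --timeout 15m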

temporal % kubectl get pods
NAME                                       READY   STATUS             RESTARTS        AGE
temporaltest-admintools-7c548d4bff-n2z7c   1/1     Running            0               102m
temporaltest-cassandra-0                   1/1     Running            0               102m
temporaltest-frontend-7dc6d9f57d-cn85j     0/1     CrashLoopBackOff   24 (77s ago)    102m
temporaltest-history-5878f9c7ff-6w9gr      0/1     CrashLoopBackOff   24 (105s ago)   102m
temporaltest-matching-d8fc5c979-mpdpd      0/1     CrashLoopBackOff   24 (118s ago)   102m
temporaltest-web-74769bc548-wsm8v          1/1     Running            0               102m
temporaltest-worker-57b8589456-p9rls       0/1     CrashLoopBackOff   24 (107s ago)   102m

Looking at one of the pods in detail:

kubectl describe pod temporaltest-matching-d8fc5c979-mpdpd
Name:             temporaltest-matching-d8fc5c979-mpdpd
Namespace:        default
Priority:         0
Service Account:  default
Node:             minikube/192.168.49.2
Start Time:       Mon, 13 May 2024 14:16:11 -0700
Labels:           app.kubernetes.io/component=matching
                  app.kubernetes.io/instance=temporaltest
                  app.kubernetes.io/managed-by=Helm
                  app.kubernetes.io/name=temporal
                  app.kubernetes.io/part-of=temporal
                  app.kubernetes.io/version=1.23.1
                  helm.sh/chart=temporal-0.37.0
                  pod-template-hash=d8fc5c979
Annotations:      checksum/config: f992d48f22ca06e2975c4ca6962bdf662f2c1f2c521fbb4da38e0f3bda5d90c9
                  prometheus.io/job: temporal-matching
                  prometheus.io/port: 9090
                  prometheus.io/scrape: true
Status:           Running
IP:               10.244.0.109
IPs:
  IP:           10.244.0.109
Controlled By:  ReplicaSet/temporaltest-matching-d8fc5c979
Init Containers:
  check-cassandra-service:
    Container ID:  docker://0b9ddff3a8038d72c4927c20ff21c32bdc7000683aaefbf12ba97baf159d63af
    Image:         busybox
    Image ID:      docker-pullable://busybox@sha256:5eef5ed34e1e1ff0a4ae850395cbf665c4de6b4b83a32a0bc7bcb998e24e7bbb
    Port:          <none>
    Host Port:     <none>
    Command:
      sh
      -c
      until nslookup temporaltest-cassandra.default.svc.cluster.local; do echo waiting for cassandra service; sleep 1; done;
    State:          Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Mon, 13 May 2024 14:16:17 -0700
      Finished:     Mon, 13 May 2024 14:18:17 -0700
    Ready:          True
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-2xzn5 (ro)
  check-cassandra:
    Container ID:  docker://b76f667f5e1dac1a727e1312e424ed8a8915026fd475d6c5ca710a0c21703f6d
    Image:         cassandra:3.11.3
    Image ID:      docker-pullable://cassandra@sha256:ce85468c5badfa2e0a04ae6825eee9421b42d9b12d1a781c0dd154f70d1ca288
    Port:          <none>
    Host Port:     <none>
    Command:
      sh
      -c
      until cqlsh temporaltest-cassandra.default.svc.cluster.local 9042 -e "SHOW VERSION"; do echo waiting for cassandra to start; sleep 1; done;
    State:          Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Mon, 13 May 2024 14:18:18 -0700
      Finished:     Mon, 13 May 2024 14:18:18 -0700
    Ready:          True
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-2xzn5 (ro)
  check-cassandra-temporal-schema:
    Container ID:  docker://1cf1568c4e19f23632a28fda953ecf1a5148f8b8000337c5990f6e542047e8b1
    Image:         cassandra:3.11.3
    Image ID:      docker-pullable://cassandra@sha256:ce85468c5badfa2e0a04ae6825eee9421b42d9b12d1a781c0dd154f70d1ca288
    Port:          <none>
    Host Port:     <none>
    Command:
      sh
      -c
      until cqlsh temporaltest-cassandra.default.svc.cluster.local 9042 -e "SELECT keyspace_name FROM system_schema.keyspaces" | grep temporal$; do echo waiting for default keyspace to become ready; sleep 1; done;
    State:          Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Mon, 13 May 2024 14:18:21 -0700
      Finished:     Mon, 13 May 2024 14:18:23 -0700
    Ready:          True
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-2xzn5 (ro)
Containers:
  temporal-matching:
    Container ID:   docker://df0ee0acf4e95d52c05a732b6f3c8f38ffe6e44e520408c93e0e9c5c45c85f0d
    Image:          temporalio/server:1.22.4
    Image ID:       docker-pullable://temporalio/server@sha256:c0a44c26397b51a2c83f7ce2b7e2375b2788fb6665ca4866003098d814d3b47a
    Ports:          7235/TCP, 9090/TCP
    Host Ports:     0/TCP, 0/TCP
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Error
      Exit Code:    1
      Started:      Mon, 13 May 2024 16:01:23 -0700
      Finished:     Mon, 13 May 2024 16:01:23 -0700
    Ready:          False
    Restart Count:  25
    Liveness:       tcp-socket :rpc delay=150s timeout=1s period=10s #success=1 #failure=3
    Environment:
      POD_IP:                               (v1:status.podIP)
      ENABLE_ES:
      ES_SEEDS:                            elasticsearch-master-headless
      ES_PORT:                             9200
      ES_VERSION:                          v7
      ES_SCHEME:                           http
      ES_VIS_INDEX:                        temporal_visibility_v1_dev
      ES_USER:
      ES_PWD:
      SERVICES:                            matching
      TEMPORAL_STORE_PASSWORD:             <set to the key 'password' in secret 'temporaltest-default-store'>     Optional: false
      TEMPORAL_VISIBILITY_STORE_PASSWORD:  <set to the key 'password' in secret 'temporaltest-visibility-store'>  Optional: false
    Mounts:
      /etc/temporal/config/config_template.yaml from config (rw,path="config_template.yaml")
      /etc/temporal/dynamic_config from dynamic-config (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-2xzn5 (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  config:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      temporaltest-matching-config
    Optional:  false
  dynamic-config:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      temporaltest-dynamic-config
    Optional:  false
  kube-api-access-2xzn5:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   BestEffort
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason   Age                     From     Message
  ----     ------   ----                    ----     -------
  Warning  BackOff  9m30s (x466 over 107m)  kubelet  Back-off restarting failed container temporal-matching in pod temporaltest-matching-d8fc5c979-mpdpd_default(1721dff2-568f-4d95-a7e0-d91eac060876)
  Normal   Pulled   4m22s (x26 over 107m)   kubelet  Container image "temporalio/server:1.22.4" already present on machine

Looking at the logs for the same pod shows this error message towards the end of the log dump
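(The logs below were pulled with something like the following, using the pod name from the listing above; adding --previous shows the same output from the last crashed attempt.)

kubectl logs temporaltest-matching-d8fc5c979-mpdpd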

[Fx] RUN	provide: go.temporal.io/server/temporal.ServerOptionsProvider()
[Fx] Error returned: received non-nil error from function "go.temporal.io/server/temporal".ServerOptionsProvider
	/home/builder/temporal/temporal/fx.go:173:
config validation error: persistence config: datastore "visibility": must provide config for one and only one datastore: elasticsearch, cassandra, sql or custom store
[Fx] ERROR		Failed to initialize custom logger: could not build arguments for function "go.uber.org/fx".(*module).constructCustomLogger.func2
	/go/pkg/mod/go.uber.org/fx@v1.20.0/module.go:251:
failed to build fxevent.Logger:
could not build arguments for function "go.temporal.io/server/temporal".glob..func8
	/home/builder/temporal/temporal/fx.go:1037:
failed to build log.Logger:
received non-nil error from function "go.temporal.io/server/temporal".ServerOptionsProvider
	/home/builder/temporal/temporal/fx.go:173:
config validation error: persistence config: datastore "visibility": must provide config for one and only one datastore: elasticsearch, cassandra, sql or custom store
Unable to create server. Error: could not build arguments for function "go.uber.org/fx".(*module).constructCustomLogger.func2 (/go/pkg/mod/go.uber.org/fx@v1.20.0/module.go:251): failed to build fxevent.Logger: could not build arguments for function "go.temporal.io/server/temporal".glob..func8 (/home/builder/temporal/temporal/fx.go:1037): failed to build log.Logger: received non-nil error from function "go.temporal.io/server/temporal".ServerOptionsProvider (/home/builder/temporal/temporal/fx.go:173): config validation error: persistence config: datastore "visibility": must provide config for one and only one datastore: elasticsearch, cassandra, sql or custom store.
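Since the validation error is about the visibility persistence section, it probably also helps to dump the rendered server config the pod mounts (the ConfigMap named in the describe output above) and the computed chart values, to check which visibility datastores actually ended up configured:

kubectl get configmap temporaltest-matching-config -o yaml
helm get values temporaltest --all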

Is someone able to help me with this issue?

PS: I googled for helm-chart-related errors and got a couple of hits from a few years back, but I don’t think the resolutions suggested in them (for example, resetting HEAD to a particular commit of the helm chart repo) are applicable anymore.

@AkashShetty, I was able to fix the issue by following the suggestion here: Pods Stuck in CrashLoopBackoff on Fresh Deployment to Fresh Kubernetes Cluster · Issue #470 · temporalio/helm-charts · GitHub

Thank you!! I’ll take a look and see if the suggested fix works in my case.