Issues deploying Temporal on Google Kubernetes Engine Autopilot Mode

Hello,

I’m trying to deploy Temporal on Google Kubernetes Engine in Autopilot mode but am running into some issues. I’m new to Temporal, Kubernetes, and GKE, so apologies if these are obvious mistakes.

tl;dr I ran the command “helm install -f values/values.cloudsqlproxy.yaml temporaltest . --timeout 900s” and got the error:

Error: INSTALLATION FAILED: 1 error occurred:
        * admission webhook "gkepolicy.common-webhooks.networking.gke.io" denied the request: GKE Warden rejected the request because it violates one or more constraints.
Violations details: {"[denied by autogke-disallow-privilege]":["container configure-sysctl is privileged; not allowed in Autopilot"]}

I’ll list all of the steps I took up to getting the error from “helm install” just in case it highlights issues with what I’m doing.

  1. Create the temporal database in Cloud SQL
    1a. Create a Cloud SQL instance
    - From the GCP Cloud SQL dashboard, I clicked “Create an instance.”
    - I clicked “Choose MySQL”.
    - I entered instance ID “temporal-test”.
    - I entered a password.
    - I selected MySQL 8.0 as the database version.
    - Under Connections > Authorized networks, I clicked “Add a network”.
    - I entered the name as “temporal-network”.
    - I entered the network as “0.0.0.0/0” which will allow any IPv4 client to pass the network firewall (but they’ll still need credentials to connect).
    - I clicked “Create instance”.
    - Once the instance was created, I clicked on it and went to the “Connections” tab on the left.
    - I went to the “Security” tab on top.
    - Under “Manage client certificates”, I clicked “Create Client Certificate”.
    - I named it “temporal-test-cert”.
    - I clicked “Create”
    - I downloaded client-key.pem, client-cert.pem, and server-ca.pem to my computer. These will let me connect to the Cloud SQL instance from my machine.
    - I clicked “Close”.

    1b. Clone the temporalio/temporal git repo and use its temporal-sql-tool to create the temporal databases inside my Cloud SQL instance.
    - From my computer terminal (for me this was inside a development container in VSCode that had Python3 and Go installed), I cloned the “temporalio/temporal.git” repository.
    - I ran “cd temporal”.
    - I ran “make temporal-sql-tool”.
    - For the next set of commands using temporal-sql-tool, I passed in environment variables so that it can communicate with Cloud SQL. I passed in:
    - SQL_PLUGIN=“mysql”
    - SQL_HOST=“[ip address of the Cloud SQL instance visible from the Cloud SQL dashboard]”
    - SQL_PORT=“3306”
    - SQL_USER=“root”
    - SQL_PASSWORD=“[my Cloud SQL instance password]”
    - SQL_TLS=TRUE
    - SQL_TLS_CERT_FILE=“</absolute/path/to/client-cert.pem>”
    - SQL_TLS_KEY_FILE=“</absolute/path/to/client-key.pem>”
    - SQL_TLS_CA_FILE=“</absolute/path/to/server-ca.pem>”
    - SQL_TLS_SERVER_NAME=“[Cloud SQL connection name visible from the Cloud SQL dashboard]”
    - SQL_TLS_DISABLE_HOST_VERIFICATION=“true”
    - I ran “./temporal-sql-tool --database temporal create-database”
    - I ran “SQL_DATABASE=temporal ./temporal-sql-tool setup-schema -v 0.0”
    - I ran “SQL_DATABASE=temporal ./temporal-sql-tool update -schema-dir schema/mysql/v57/temporal/versioned”
    - I ran “./temporal-sql-tool --database temporal_visibility create-database”
    - I ran “SQL_DATABASE=temporal_visibility ./temporal-sql-tool setup-schema -v 0.0”
    - I ran “SQL_DATABASE=temporal_visibility ./temporal-sql-tool update -schema-dir schema/mysql/v57/visibility/versioned”
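
The 1b commands above can be collected into one script. This is only a sketch under assumptions: the values in angle brackets are placeholders standing in for the bracketed values in the list, and the `SQL_TOOL` variable and `setup_database` function are names made up here for illustration, not part of temporal-sql-tool.

```shell
# Sketch of step 1b. Every value in angle brackets is a placeholder to replace.
export SQL_PLUGIN="mysql"
export SQL_HOST="<cloud-sql-ip-address>"
export SQL_PORT="3306"
export SQL_USER="root"
export SQL_PASSWORD="<cloud-sql-password>"
export SQL_TLS="TRUE"
export SQL_TLS_CERT_FILE="</absolute/path/to/client-cert.pem>"
export SQL_TLS_KEY_FILE="</absolute/path/to/client-key.pem>"
export SQL_TLS_CA_FILE="</absolute/path/to/server-ca.pem>"
export SQL_TLS_SERVER_NAME="<cloud-sql-connection-name>"
export SQL_TLS_DISABLE_HOST_VERIFICATION="true"

# Hypothetical knob: path to the binary built by "make temporal-sql-tool".
SQL_TOOL="${SQL_TOOL:-./temporal-sql-tool}"

# Hypothetical helper: create one database and apply its versioned schema.
#   $1 = database name, $2 = versioned schema directory
setup_database() {
  local db="$1" schema_dir="$2"
  "$SQL_TOOL" --database "$db" create-database || return 1
  SQL_DATABASE="$db" "$SQL_TOOL" setup-schema -v 0.0 || return 1
  SQL_DATABASE="$db" "$SQL_TOOL" update -schema-dir "$schema_dir" || return 1
}

# After replacing the placeholders, run from inside the temporal checkout:
#   setup_database temporal schema/mysql/v57/temporal/versioned
#   setup_database temporal_visibility schema/mysql/v57/visibility/versioned
```

Wrapping the three commands in one function keeps the database name and schema directory paired, so the `temporal` and `temporal_visibility` runs cannot drift apart.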

  2. Create the temporal cluster in GKE
    2a. Created an Autopilot GKE cluster
    - From the GCP Kubernetes Engine dashboard, I clicked “Create”. By default, the cluster will be in “Autopilot mode,” so I left that as is.
    - I named the cluster “temporal-test”
    - I selected my region as “us-central1”.
    - I clicked on the Next: Networking button.
    - Under IPv4 network access, I selected “Private cluster”
    - I clicked “Create”

    2b. Clone the temporalio/helm-charts git repo and use its helm chart to deploy temporal on my GKE cluster.
    - I installed helm by running “curl https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash”
    - I installed kubectl by running
    - “curl -LO https://dl.k8s.io/release/v1.26.0/bin/linux/arm64/kubectl”
    - “sudo install -o root -g root -m 0755 kubectl /usr/local/bin/kubectl”
    - I made sure I can connect to GKE by running:
    - “gcloud auth login”
    - “gcloud config set project ”
    - I ran “gcloud components install gke-gcloud-auth-plugin” which installs a plugin for kubectl that lets it connect with GKE.
    - I ran “gcloud container clusters get-credentials --zone --project ”. I think this updates a local config file so that helm can talk to GKE.
    - I cloned the “temporalio/helm-charts.git” repository.
    - I ran “cd helm-charts”
    - I ran “helm dependencies update”
    - I opened the “helm-charts/values/values.cloudsqlproxy.yaml” and replaced all of the “PROJECTNAME”, “REGION” and “INSTANCENAME” with what matches my Cloud SQL instance.
    - I ran “helm install -f values/values.cloudsqlproxy.yaml temporaltest . --timeout 900s”

The output from the “helm install” command was:

coalesce.go:220: warning: cannot overwrite table with non table for temporal.server.sidecarContainers (map[])
coalesce.go:220: warning: cannot overwrite table with non table for prometheus.server.sidecarContainers (map[])
coalesce.go:220: warning: cannot overwrite table with non table for temporal.server.sidecarContainers (map[])
W0608 15:08:07.123707    6930 warnings.go:70] Autopilot set default resource requests for Deployment default/temporaltest-web, as resource requests were not specified
W0608 15:08:07.123713    6930 warnings.go:70] Autopilot set default resource requests for Deployment default/temporaltest-kube-state-metrics, as resource requests were not specified
W0608 15:08:07.164563    6930 warnings.go:70] Autopilot set default resource requests for Deployment default/temporaltest-admintools, as resource requests were not specified
W0608 15:08:07.252825    6930 warnings.go:70] Autopilot set default resource requests for Deployment default/temporaltest-matching, as resource requests were not specified
W0608 15:08:07.256371    6930 warnings.go:70] Autopilot set default resource requests for Deployment default/temporaltest-history, as resource requests were not specified
W0608 15:08:07.280053    6930 warnings.go:70] Autopilot set default resource requests for Deployment default/temporaltest-prometheus-pushgateway, as resource requests were not specified
W0608 15:08:07.294182    6930 warnings.go:70] Autopilot set default resource requests for Deployment default/temporaltest-prometheus-alertmanager, as resource requests were not specified
W0608 15:08:07.323612    6930 warnings.go:70] Autopilot set default resource requests for Deployment default/temporaltest-prometheus-server, as resource requests were not specified
W0608 15:08:07.356937    6930 warnings.go:70] Autopilot set default resource requests for Deployment default/temporaltest-worker, as resource requests were not specified
W0608 15:08:07.381749    6930 warnings.go:70] Autopilot set default resource requests for Deployment default/temporaltest-frontend, as resource requests were not specified
W0608 15:08:07.397589    6930 warnings.go:70] Autopilot set default resource requests for Deployment default/temporaltest-grafana, as resource requests were not specified
W0608 15:08:07.605365    6930 warnings.go:70] Autopilot set default resource requests for StatefulSet default/temporaltest-cassandra, as resource requests were not specified
W0608 15:08:07.610665    6930 warnings.go:70] Autopilot set default resource requests on StatefulSet default/elasticsearch-master for container configure-sysctl, as resource requests were not specified, and adjusted resource requests to meet requirements
Error: INSTALLATION FAILED: 1 error occurred:
        * admission webhook "gkepolicy.common-webhooks.networking.gke.io" denied the request: GKE Warden rejected the request because it violates one or more constraints.
Violations details: {"[denied by autogke-disallow-privilege]":["container configure-sysctl is privileged; not allowed in Autopilot"]}

Does this mean that I can’t use GKE in Autopilot mode with Temporal? I also tried following the same steps, but selecting “Standard” mode when creating the GKE cluster. When I did that, I didn’t get errors from helm install, but when I went to the “Workloads” tab on the left inside the Kubernetes Engine dashboard, I saw a bunch of status errors that said “Does not have minimum availability”. Does this mean that I need to increase resources? I left everything as default when creating the GKE cluster, so I’m not sure what the minimum requirements are for Temporal. Any help with these errors and also any feedback on the process above in general would be greatly appreciated. Thank you!
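
The Warden message names the rejected container (`configure-sysctl`, which the preceding warning places in the `elasticsearch-master` StatefulSet). One way to confirm which templates ask for privileged mode is to render the chart locally and search the output. A rough sketch: `find_privileged` is a made-up helper name, and the pattern assumes the rendered manifests spell the setting as `privileged: true`.

```shell
# Hypothetical helper: read a rendered manifest on stdin and print, with line
# numbers, every line that enables privileged mode. Exits non-zero if none.
find_privileged() {
  grep -nE 'privileged:[[:space:]]*true'
}

# Intended use from inside the helm-charts checkout (not run here):
#   helm template temporaltest . -f values/values.cloudsqlproxy.yaml | find_privileged
```

For the “Does not have minimum availability” status on a Standard cluster, `kubectl get pods` followed by `kubectl describe pod` on one of the affected pods usually shows the underlying reason in the Events section, such as failed scheduling or a crashing container.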

Hey, I have the same problem. I’m trying to deploy a Temporal cluster on GKE Autopilot using the official Helm chart.
I’m curious, did you solve this problem?

No, I was not able to solve it. However, I did switch to Temporal Cloud which was much easier to set up. I would recommend that. They can give you free credits up front so that you can try it out and see if it works for you.

Today I just (almost *) solved this issue.

  1. You need to switch to a GKE Standard cluster.
  2. Try to deploy using the Helm chart from the temporalio/helm-charts GitHub repository.
  3. Deploy everything except Elasticsearch.
  4. If you are asked about permissions, run: kubectl create clusterrolebinding cluster-admin-binding --clusterrole cluster-admin --user your_account@example.com

It will work, and then you can deploy Elasticsearch separately.

Almost * solved because the issue still exists on GKE Autopilot. However, I did not try deploying there without Elasticsearch, so skipping it might solve the issue on Autopilot as well.
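
For the “everything except Elasticsearch” step, a sketch of the install command, reusing the release name and values file from the first post. It assumes the chart exposes an `elasticsearch.enabled` value, which was the case for the chart around the time of this thread; check `values.yaml` in your checkout before relying on it. `build_install_cmd` is a hypothetical helper that only prints the command so it can be reviewed before running.

```shell
# Hypothetical helper: print the helm command instead of running it.
#   $1 = release name, $2 = values file
build_install_cmd() {
  local release="$1" values_file="$2"
  printf 'helm install -f %s --set elasticsearch.enabled=false %s . --timeout 900s\n' \
    "$values_file" "$release"
}

build_install_cmd temporaltest values/values.cloudsqlproxy.yaml
```

Whether this is enough on Autopilot is untested here: it only removes the `configure-sysctl` container named in the error, and other workloads in the chart could still be rejected for different reasons.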