Hello,
I’m trying to deploy Temporal on Google Kubernetes Engine in Autopilot mode but am running into some issues. I’m new to Temporal, Kubernetes, and GKE, so apologies if these are obvious mistakes to make.
tl;dr I ran the command “helm install -f values/values.cloudsqlproxy.yaml temporaltest . --timeout 900s” and got the error:
Error: INSTALLATION FAILED: 1 error occurred:
* admission webhook "gkepolicy.common-webhooks.networking.gke.io" denied the request: GKE Warden rejected the request because it violates one or more constraints.
Violations details: {"[denied by autogke-disallow-privilege]":["container configure-sysctl is privileged; not allowed in Autopilot"]}
I’ll list all of the steps I took up to getting the error from “helm install” just in case it highlights issues with what I’m doing.
1. Create the temporal database in Cloud SQL
1a. Create a Cloud SQL instance
- From the GCP Cloud SQL dashboard, I clicked “Create an instance.”
- I clicked “Choose MySQL”.
- I entered instance ID “temporal-test”.
- I entered a password.
- I selected MySQL 8.0 as the database version.
- Under Connections > Authorized networks, I clicked “Add a network”.
- I entered the name as “temporal-network”.
- I entered the network as “0.0.0.0/0” which will allow any IPv4 client to pass the network firewall (but they’ll still need credentials to connect).
- I clicked “Create instance”.
- Once the instance was created, I clicked on it and went to the “Connections” tab on the left.
- I went to the “Security” tab on top.
- Under “Manage client certificates”, I clicked “Create Client Certificate”.
- I named it “temporal-test-cert”.
- I clicked “Create”
- I downloaded the client-key.pem, client-cert.pem, and server-ca.pem to my computer. These will let me connect to the Cloud SQL instance from my machine.
- I clicked “Close”.
1b. Clone the temporalio/temporal git repo and use its temporal-sql-tool to create the temporal databases inside my Cloud SQL instance.
- From my computer terminal (for me this was inside a development container in VSCode that had Python3 and Go installed), I cloned the “temporalio/temporal.git” repository.
- I ran “cd temporal”.
- I ran “make temporal-sql-tool”.
- For the next set of commands using temporal-sql-tool, I passed in environment variables so that it can communicate with Cloud SQL. I passed in:
- SQL_PLUGIN=“mysql”
- SQL_HOST=“[ip address of the Cloud SQL instance visible from the Cloud SQL dashboard]”
- SQL_PORT=“3306”
- SQL_USER=“root”
- SQL_PASSWORD=“[my Cloud SQL instance password]”
- SQL_TLS=TRUE
- SQL_TLS_CERT_FILE=“</absolute/path/to/client-cert.pem>”
- SQL_TLS_KEY_FILE=“</absolute/path/to/client-key.pem>”
- SQL_TLS_CA_FILE=“</absolute/path/to/server-ca.pem>”
- SQL_TLS_SERVER_NAME=“[Cloud SQL connection name visible from the Cloud SQL dashboard]”
- SQL_TLS_DISABLE_HOST_VERIFICATION=“true”
- I ran “./temporal-sql-tool --database temporal create-database”
- I ran “SQL_DATABASE=temporal ./temporal-sql-tool setup-schema -v 0.0”
- I ran “SQL_DATABASE=temporal ./temporal-sql-tool update -schema-dir schema/mysql/v57/temporal/versioned”
- I ran “./temporal-sql-tool --database temporal_visibility create-database”
- I ran “SQL_DATABASE=temporal_visibility ./temporal-sql-tool setup-schema -v 0.0”
- I ran “SQL_DATABASE=temporal_visibility ./temporal-sql-tool update -schema-dir schema/mysql/v57/visibility/versioned”
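For reference, the step 1b commands above can be collected into one script. This is only a sketch of what I ran: the values in angle or square brackets are placeholders, and `DRY_RUN`, `run`, and `setup_db` are hypothetical names I made up for this sketch (they are not part of temporal-sql-tool). With `DRY_RUN=1` (the default here) the commands are only printed, not executed.

```shell
#!/bin/sh
# Sketch of step 1b. Placeholder values must be replaced before a real run.
set -eu

export SQL_PLUGIN="mysql"
export SQL_HOST="[ip address of the Cloud SQL instance]"
export SQL_PORT="3306"
export SQL_USER="root"
export SQL_PASSWORD="[my Cloud SQL instance password]"
export SQL_TLS="TRUE"
export SQL_TLS_CERT_FILE="</absolute/path/to/client-cert.pem>"
export SQL_TLS_KEY_FILE="</absolute/path/to/client-key.pem>"
export SQL_TLS_CA_FILE="</absolute/path/to/server-ca.pem>"
export SQL_TLS_SERVER_NAME="[Cloud SQL connection name]"
export SQL_TLS_DISABLE_HOST_VERIFICATION="true"

# Hypothetical switch for this sketch: 1 prints commands, 0 executes them.
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Create one database, initialise its schema, then apply the versioned updates.
# $1 = database name, $2 = versioned schema directory in the temporal repo.
setup_db() {
  run ./temporal-sql-tool --database "$1" create-database
  run env SQL_DATABASE="$1" ./temporal-sql-tool setup-schema -v 0.0
  run env SQL_DATABASE="$1" ./temporal-sql-tool update -schema-dir "$2"
}

setup_db temporal schema/mysql/v57/temporal/versioned
setup_db temporal_visibility schema/mysql/v57/visibility/versioned
```

Running it with `DRY_RUN=0` from inside the cloned temporal directory (after `make temporal-sql-tool`) would execute the same six commands listed above.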
2. Create the temporal cluster in GKE
2a. Create an Autopilot GKE cluster
- From the GCP Kubernetes Engine dashboard, I clicked “Create”. By default, the cluster will be in “Autopilot mode,” so I left that as is.
- I named the cluster “temporal-test”
- I selected my region as “us-central1”.
- I clicked on the Next: Networking button.
- Under IPv4 network access, I selected “Private cluster”
- I clicked “Create”.
2b. Clone the temporalio/helm-charts git repo and use its helm chart to deploy temporal on my GKE cluster.
- I installed helm by running “curl https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash”
- I installed kubectl by running
- “curl -LO https://dl.k8s.io/release/v1.26.0/bin/linux/arm64/kubectl”
- “sudo install -o root -g root -m 0755 kubectl /usr/local/bin/kubectl”
- I made sure I can connect to GKE by running:
- “gcloud auth login”
- “gcloud config set project ”
- I ran “gcloud components install gke-gcloud-auth-plugin” which installs a plugin for kubectl that lets it connect with GKE.
- I ran “gcloud container clusters get-credentials --zone --project ”. I think this updates a local config file so that helm can talk to GKE.
- I cloned the “temporalio/helm-charts.git” repository.
- I ran “cd helm-charts”
- I ran “helm dependencies update”
- I opened the “helm-charts/values/values.cloudsqlproxy.yaml” and replaced all of the “PROJECTNAME”, “REGION” and “INSTANCENAME” with what matches my Cloud SQL instance.
- I ran “helm install -f values/values.cloudsqlproxy.yaml temporaltest . --timeout 900s”
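The connection and install part of step 2b could be sketched roughly like this. Assumptions: `PROJECT_ID` is a placeholder, the cluster name and region are the ones from step 2a, and I use `--region` here because Autopilot clusters are regional. `DRY_RUN`, `run`, `connect_cluster`, and `install_chart` are hypothetical names for this sketch only; by default the commands are printed rather than executed.

```shell
#!/bin/sh
# Sketch of the gcloud and helm part of step 2b.
set -eu

PROJECT_ID="<your-gcp-project-id>"   # placeholder, not a real project
CLUSTER_NAME="temporal-test"         # cluster name from step 2a
REGION="us-central1"                 # region from step 2a
RELEASE="temporaltest"
VALUES_FILE="values/values.cloudsqlproxy.yaml"

# Hypothetical switch for this sketch: 1 prints commands, 0 executes them.
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Point gcloud at the project and write cluster credentials to the local kubeconfig.
connect_cluster() {
  run gcloud config set project "$PROJECT_ID"
  run gcloud container clusters get-credentials "$CLUSTER_NAME" \
    --region "$REGION" --project "$PROJECT_ID"
}

# Run from inside the helm-charts checkout.
install_chart() {
  run helm dependencies update
  run helm install -f "$VALUES_FILE" "$RELEASE" . --timeout 900s
}

connect_cluster
install_chart
```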
The output from the “helm install” command was:
coalesce.go:220: warning: cannot overwrite table with non table for temporal.server.sidecarContainers (map[])
coalesce.go:220: warning: cannot overwrite table with non table for prometheus.server.sidecarContainers (map[])
coalesce.go:220: warning: cannot overwrite table with non table for temporal.server.sidecarContainers (map[])
W0608 15:08:07.123707 6930 warnings.go:70] Autopilot set default resource requests for Deployment default/temporaltest-web, as resource requests were not specified
W0608 15:08:07.123713 6930 warnings.go:70] Autopilot set default resource requests for Deployment default/temporaltest-kube-state-metrics, as resource requests were not specified
W0608 15:08:07.164563 6930 warnings.go:70] Autopilot set default resource requests for Deployment default/temporaltest-admintools, as resource requests were not specified
W0608 15:08:07.252825 6930 warnings.go:70] Autopilot set default resource requests for Deployment default/temporaltest-matching, as resource requests were not specified
W0608 15:08:07.256371 6930 warnings.go:70] Autopilot set default resource requests for Deployment default/temporaltest-history, as resource requests were not specified
W0608 15:08:07.280053 6930 warnings.go:70] Autopilot set default resource requests for Deployment default/temporaltest-prometheus-pushgateway, as resource requests were not specified
W0608 15:08:07.294182 6930 warnings.go:70] Autopilot set default resource requests for Deployment default/temporaltest-prometheus-alertmanager, as resource requests were not specified
W0608 15:08:07.323612 6930 warnings.go:70] Autopilot set default resource requests for Deployment default/temporaltest-prometheus-server, as resource requests were not specified
W0608 15:08:07.356937 6930 warnings.go:70] Autopilot set default resource requests for Deployment default/temporaltest-worker, as resource requests were not specified
W0608 15:08:07.381749 6930 warnings.go:70] Autopilot set default resource requests for Deployment default/temporaltest-frontend, as resource requests were not specified
W0608 15:08:07.397589 6930 warnings.go:70] Autopilot set default resource requests for Deployment default/temporaltest-grafana, as resource requests were not specified
W0608 15:08:07.605365 6930 warnings.go:70] Autopilot set default resource requests for StatefulSet default/temporaltest-cassandra, as resource requests were not specified
W0608 15:08:07.610665 6930 warnings.go:70] Autopilot set default resource requests on StatefulSet default/elasticsearch-master for container configure-sysctl, as resource requests were not specified, and adjusted resource requests to meet requirements
Error: INSTALLATION FAILED: 1 error occurred:
* admission webhook "gkepolicy.common-webhooks.networking.gke.io" denied the request: GKE Warden rejected the request because it violates one or more constraints.
Violations details: {"[denied by autogke-disallow-privilege]":["container configure-sysctl is privileged; not allowed in Autopilot"]}
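The last warning before the error points at the `configure-sysctl` container in the `elasticsearch-master` StatefulSet. One way to confirm where a privileged container comes from, without installing anything, might be to render the chart locally and search the output. This is a minimal sketch, not something from the Temporal docs; `find_privileged` is a hypothetical helper name.

```shell
# Hypothetical helper: reads rendered Kubernetes manifests on stdin and prints
# every line (prefixed with its line number) that sets "privileged: true".
# "|| true" keeps the exit status at 0 when nothing matches.
find_privileged() {
  grep -n 'privileged: true' || true
}

# Usage, run from inside the helm-charts checkout (needs helm installed):
#   helm template temporaltest . -f values/values.cloudsqlproxy.yaml | find_privileged
```

The line numbers in the output can then be looked up in the rendered manifest to see which workload and container the setting belongs to.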
Does this mean that I can’t use GKE in Autopilot mode with Temporal?
I also tried following the same steps, but selecting “Standard” mode when creating the GKE cluster. When I did that, I didn’t get errors from helm install, but when I went to the “Workloads” tab on the left inside the Kubernetes Engine dashboard, I saw a bunch of status errors that said “Does not have minimum availability”. Does this mean that I need to increase resources? I left everything as default when creating the GKE cluster, so I’m not sure what the minimum requirements are for Temporal.
Any help with these errors and also any feedback on the process above in general would be greatly appreciated. Thank you!