Unable to run Temporal in Knative/Google Cloud Run

Hi, I am trying to run temporal/auto-setup:1.0.0 on Google Cloud Run with port 7233.
The container starts successfully, but the health check never passes, even though Google Cloud Run supports unary gRPC endpoints. Health check status:

  conditions:
  - type: Ready
    status: 'False'
    reason: HealthCheckContainerError
    message: |-
      Cloud Run error: Container failed to start. Failed to start and then listen on the port defined by the PORT environment variable. Logs for this revision might contain more information.

    lastTransitionTime: '2020-10-25T10:40:49.621581Z'
  - type: ConfigurationsReady
    status: 'False'
    reason: HealthCheckContainerError
    message: |-
      Cloud Run error: Container failed to start. Failed to start and then listen on the port defined by the PORT environment variable. Logs for this revision might contain more information.

    lastTransitionTime: '2020-10-25T10:40:49.621581Z'
  - type: RoutesReady
    status: Unknown
    message: |-
      Cloud Run error: Container failed to start. Failed to start and then listen on the port defined by the PORT environment variable. Logs for this revision might contain more information.

    lastTransitionTime: '2020-10-25T10:40:49.690787Z'

Logs: https://pastebin.com/raw/3sCY0trQ

I would recommend using the supported Helm chart to run the service in Kubernetes. It opens all the ports and wires up all the dependencies out of the box.

Hmm, I understand, but running bare Kubernetes is too much operational overhead for me right now.

We did set up Temporal on AWS ECS. @manu did the work and might be able to help you.

Hey @hazcod,

Unfortunately, I do not have experience with any Google Cloud primitives. The pastebin logs you’ve supplied indicate the container is being shut down from the outside (probably because the health check sees that 7233 is not available and tells the container to terminate). It does appear that the actual Temporal services started up without any issue…

Some possible debug suggestions:

  • Get a shell into the container and make sure you can connect to 7233 locally.
  • Get a shell into another container on the same network and make sure you can connect to the other container’s 7233.
  • Temporarily disable the health check and see if you can reach 7233 externally.

Hopefully the above can provide additional information as to what is / isn’t functioning.
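
For the connectivity checks above, a minimal sketch in Python could look like the following. It only verifies that a TCP connection to the port can be opened; it does not speak gRPC or confirm that Temporal itself is healthy. The `can_connect` helper and the `localhost` host are assumptions for illustration; 7233 is the port from the original post.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    # "localhost" is an assumption: replace it with the other container's
    # address when testing from a second container on the same network.
    print(can_connect("localhost", 7233))
```

If this prints `True` from inside the container but `False` from outside, the problem is more likely in how the port is exposed than in Temporal itself.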

Thanks,
Manu

That is a very interesting proposal, actually. @maxim, this is something of a feature request: a deployment similar to auto-setup, where all 4 services are running and the PORT variable is exposed. Have you thought about making Temporal serverless/Knative-friendly? Yes, I do understand that some of the components need to run continuously to process workflows, but that is a tradeoff; some components, like the frontend, are completely stateless and do not need to run all the time.
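
To make the PORT idea concrete: the Cloud Run error quoted earlier says the container must listen on the port given by the PORT environment variable. A rough sketch of how an entrypoint might pick its listen port is below. This is not how auto-setup currently works; `resolve_listen_port` is a hypothetical helper, and falling back to 7233 is an assumption based on the port used in this thread.

```python
import os

DEFAULT_PORT = 7233  # port used in this thread; the fallback itself is an assumption


def resolve_listen_port(env=None, default: int = DEFAULT_PORT) -> int:
    """Pick the listen port from PORT if set, otherwise fall back to `default`."""
    env = os.environ if env is None else env
    raw = env.get("PORT", "").strip()
    if not raw:
        return default
    port = int(raw)  # raises ValueError if PORT is not a number
    if not 1 <= port <= 65535:
        raise ValueError(f"PORT out of range: {port}")
    return port


if __name__ == "__main__":
    print(resolve_listen_port())
```

How the resolved value would then be passed into the Temporal frontend configuration is left open here, since that depends on the image.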

The only reason I can think of why this might not be great is that you want a persistent connection to the Temporal server for latency reasons, and a function will typically end after a couple of minutes.