Deploying in AWS - healthchecks

Does Temporal Server have a health check endpoint? How can I set up the target groups to hit Temporal’s gRPC endpoints?

From the AWS docs (Health checks for your target groups - Elastic Load Balancing):

The destination for health checks on the targets.

If the protocol version is HTTP/1.1 or HTTP/2, specify a valid URI (/path?query). The default is /.

If the protocol version is gRPC, specify the path of a custom health check method with the format `/Package.Class/method`. The default is `/AWS.ALB/healthcheck`.

Probably the easiest way to go here is to pass a health check by opening a TCP connection to the gRPC service endpoint: if you can connect, it’s “healthy”.

I’m not very familiar with gRPC, coming mostly from HTTP RESTful endpoints. What’s the “gRPC service endpoint” you are referring to?

I was expecting something equivalent to /healthcheck for http endpoints.

We don’t have a real gRPC health check right now, so the best recommendation is just to try to open a connection. The gRPC service endpoint I’m talking about is just the Temporal endpoint you’d point your client SDK to.

Doh, so I was wrong here. Temporal itself does have gRPC health checks, but we’re not using them in our Helm charts and instead do a TCP check.

For health checks using target groups, you can set the path to: /temporal.api.workflowservice.v1.WorkflowService/Check

My apologies for steering you wrong initially. I’ve added an issue to the helm chart repo to track this also - [Feature Request] Add gRPC heath check via grpc_health_probe · Issue #203 · temporalio/helm-charts · GitHub

Hey @derek, thanks for the follow up. I tried the path you mentioned, but the target group health check is failing. I’m not sure that path exists.

I tried to look up api/service.proto at master · temporalio/api · GitHub to see if such a healthcheck endpoint exists but found nothing.

Here are the Terraform options I passed to https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/lb_target_group :

path = "/temporal.api.workflowservice.v1.WorkflowService/Check"
matcher = "0"

After some investigation I think the right path would be:

/grpc.health.v1.Health/Check

If you set the service field in the request message to temporal.api.workflowservice.v1.WorkflowService, it should reply with:

{
  "status": "SERVING"
}

Even if you don’t, it looks like the health check still responds with 200 when the response message is

{
  "status": "SERVICE_UNKNOWN"
}
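
For reference, here is a rough sketch of how the target group from the Terraform snippet earlier in the thread might look with the corrected path. Only the path and matcher values come from this thread. The resource name, target group name, and var.vpc_id are hypothetical, and 7233 is Temporal's default frontend port, so adjust it if your deployment differs.

```hcl
# Sketch only: names and var.vpc_id are placeholders, not from the thread.
resource "aws_lb_target_group" "temporal_frontend" {
  name             = "temporal-frontend"
  port             = 7233   # default Temporal frontend gRPC port (assumed unchanged)
  protocol         = "HTTP"
  protocol_version = "GRPC" # tells the ALB to use gRPC for traffic and health checks
  vpc_id           = var.vpc_id

  health_check {
    enabled  = true
    protocol = "HTTP"
    path     = "/grpc.health.v1.Health/Check" # standard gRPC health service method
    matcher  = "0"                            # gRPC status code 0 (OK) counts as healthy
  }
}
```

With protocol_version set to GRPC, the matcher is interpreted as a gRPC status code rather than an HTTP status code, which is why "0" is used here.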

Thanks, this works now!