Hi,
The worker service in our application fails to come up properly: it fails while polling the activity and workflow task queues with the gRPC errors shown below.
We are using the Java temporal-sdk version 1.0.7 with Temporal server version 1.23.1. We had to upgrade the server to fix a few security vulnerabilities and couldn't upgrade the SDK yet.
20-05-2024 20:57:46.271 UTC+0000 ERROR [temporalworker,] [Activity Poller taskQueue=“ActivityQueue”, namespace=“default”: 4] ERROR io.temporal.internal.worker.Poller.lambda$new$0 - Failure in thread Activity Poller taskQueue=“ActivityQueue”, namespace=“default”: 4
io.grpc.StatusRuntimeException: UNAVAILABLE: Not enough hosts to serve the request
at io.grpc.stub.ClientCalls.toStatusRuntimeException(ClientCalls.java:268)
at io.grpc.stub.ClientCalls.getUnchecked(ClientCalls.java:249)
at io.grpc.stub.ClientCalls.blockingUnaryCall(ClientCalls.java:167)
at io.temporal.api.workflowservice.v1.WorkflowServiceGrpc$WorkflowServiceBlockingStub.pollActivityTaskQueue(WorkflowServiceGrpc.java:2683)
at io.temporal.internal.worker.ActivityPollTask.poll(ActivityPollTask.java:105)
at io.temporal.internal.worker.ActivityPollTask.poll(ActivityPollTask.java:39)
at io.temporal.internal.worker.Poller$PollExecutionTask.run(Poller.java:265)
at io.temporal.internal.worker.Poller$PollLoopTask.run(Poller.java:241)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
at java.base/java.lang.Thread.run(Thread.java:840)
20-05-2024 20:57:46.320 UTC+0000 ERROR [temporalworker,] [Workflow Poller taskQueue=“WorkflowQueue”, namespace=“default”: 1] ERROR io.temporal.internal.worker.Poller.lambda$new$0 - Failure in thread Workflow Poller taskQueue=“WorkflowQueue”, namespace=“default”: 1
io.grpc.StatusRuntimeException: UNAVAILABLE: Not enough hosts to serve the request
at io.grpc.stub.ClientCalls.toStatusRuntimeException(ClientCalls.java:268)
at io.grpc.stub.ClientCalls.getUnchecked(ClientCalls.java:249)
at io.grpc.stub.ClientCalls.blockingUnaryCall(ClientCalls.java:167)
at io.temporal.api.workflowservice.v1.WorkflowServiceGrpc$WorkflowServiceBlockingStub.pollWorkflowTaskQueue(WorkflowServiceGrpc.java:2639)
at io.temporal.internal.worker.WorkflowPollTask.poll(WorkflowPollTask.java:81)
at io.temporal.internal.worker.WorkflowPollTask.poll(WorkflowPollTask.java:37)
at io.temporal.internal.worker.Poller$PollExecutionTask.run(Poller.java:265)
at io.temporal.internal.worker.Poller$PollLoopTask.run(Poller.java:241)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
at java.base/java.lang.Thread.run(Thread.java:840)
Is there a way to detect this condition in a readiness probe? We currently check health with the following command:
tctl --address temporal-front-end-service-name-here:7233 cluster health
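For illustration, this is roughly the kind of worker-side readiness gate we have in mind. It is only a sketch under assumptions: `ReadinessGate`, `waitUntilReady`, and `tctlClusterHealthy` are hypothetical names, not part of the Temporal SDK, and it assumes `tctl` is on the PATH and exits with a non-zero code when the cluster is unhealthy. We have not confirmed that such a check would catch the polling errors above.

```java
import java.io.IOException;
import java.util.function.BooleanSupplier;

/** Hypothetical helper for illustration; not part of the Temporal SDK. */
final class ReadinessGate {

    /**
     * Calls the check up to maxAttempts times, sleeping delayMillis between
     * failed attempts. Returns true as soon as one check passes, false if
     * every attempt fails or the thread is interrupted while waiting.
     */
    static boolean waitUntilReady(BooleanSupplier check, int maxAttempts, long delayMillis) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (check.getAsBoolean()) {
                return true;
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(delayMillis);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // preserve interrupt status
                    return false;
                }
            }
        }
        return false;
    }

    /**
     * Runs the tctl health command from this post.
     * Assumption: exit code 0 means healthy, anything else means not ready.
     */
    static boolean tctlClusterHealthy(String address) {
        try {
            Process process = new ProcessBuilder(
                    "tctl", "--address", address, "cluster", "health")
                    .redirectErrorStream(true)
                    .start();
            // Drain the output so the child process cannot block on a full pipe.
            process.getInputStream().readAllBytes();
            return process.waitFor() == 0;
        } catch (IOException e) {
            return false; // tctl missing or could not be started
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

The worker would then start polling only after something like `ReadinessGate.waitUntilReady(() -> ReadinessGate.tctlClusterHealthy("temporal-front-end-service-name-here:7233"), 10, 3000)` returns true; the attempt count and delay here are arbitrary placeholder values.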
Note that this issue does not occur every time; we see it when we install multiple applications on a Kubernetes cluster used as a test environment. One of 8 applications fails to come up properly with this error.
We started noticing this error only recently, during a software stack upgrade that included a gRPC library upgrade and the Temporal server upgrade.