Background to my question:
Once in a while when I try to run a workflow with Temporal, I get this error: “no hosts are available to serve the request”.
I’m really confused by this, because my cluster seems quite healthy. If I run:
tctl --address temporal.b8s.biz:7233 cluster health
> temporal.api.workflowservice.v1.WorkflowService: SERVING
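That check only hits the frontend, though. The other thing I did to convince myself the internal services are up was just to look at the pods (the label selector below is my guess at the Helm chart defaults, so it may differ from your install):
kubectl get pods -l app.kubernetes.io/name=temporal
kubectl get endpoints | grep temporal
As far as I can tell everything is Running/Ready, including the matching and history pods, which is part of why I'm confused.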
If I look at my logs, I see this:
temporal-worker
2022/06/06 00:01:42 INFO Fetching price quote Namespace default TaskQueue internal-executor WorkerID 1@internal-executor-d4695c6d9-vd2x7@ ActivityID 11 ActivityType Quote Attempt 1 WorkflowType ExecuteTrade WorkflowID 930f6df-52ad-42f1-bd6e-6a1d29671d3 RunID 5437c43-2686-456-9493-34110565ee6
2022/06/06 00:01:52 INFO Task processing failed with error Namespace default TaskQueue internal-executor WorkerID 1@internal-executor-d4695c6d9-vd2x7@ WorkerType ActivityWorker Error context deadline exceeded
2022/06/06 00:02:52 WARN Failed to poll for task. Namespace default TaskQueue internal-executor WorkerID 1@internal-executor-d4695c6d9-vd2x7@ WorkerType ActivityWorker Error context deadline exceeded
2022/06/06 00:04:02 WARN Failed to poll for task. Namespace default TaskQueue internal-executor WorkerID 1@internal-executor-d4695c6d9-vd2x7@ WorkerType ActivityWorker Error context deadline exceeded
2022/06/06 00:07:12 WARN Failed to poll for task. Namespace default TaskQueue internal-executor WorkerID 1@internal-executor-d4695c6d9-vd2x7@ WorkerType ActivityWorker Error context deadline exceeded
2022/06/06 00:09:21 WARN Failed to poll for task. Namespace default TaskQueue internal-executor WorkerID 1@internal-executor-d4695c6d9-vd2x7@ WorkerType ActivityWorker Error context deadline exceeded
2022/06/06 00:12:40 WARN Failed to poll for task. Namespace default TaskQueue internal-executor WorkerID 1@internal-executor-d4695c6d9-vd2x7@ WorkerType ActivityWorker Error context deadline exceeded
2022/06/06 00:13:51 WARN Failed to poll for task. Namespace default TaskQueue internal-executor WorkerID 1@internal-executor-d4695c6d9-vd2x7@ WorkerType ActivityWorker Error context deadline exceeded
2022/06/06 00:15:01 WARN Failed to poll for task. Namespace default TaskQueue internal-executor WorkerID 1@internal-executor-d4695c6d9-vd2x7@ WorkerType ActivityWorker Error context deadline exceeded
2022/06/06 00:16:12 WARN Failed to poll for task. Namespace default TaskQueue internal-executor WorkerID 1@internal-executor-d4695c6d9-vd2x7@ WorkerType ActivityWorker Error context deadline exceeded
temporal frontend
{"level": "error", "ts": "2022-06-06T00:20:50.4237", "msg": "Unable to call matching. PollWorkflowTaskQueue.", "service": "frontend", "wf-task-queue-name":"client-reporti ng", "timeout": "Im9.999685407s","error":"context deadline exceeded", "logging-call-at": "workflowHandler.go:808","stacktrace":"go.temporal.io/server/common/log.(*za pLogger).Error\n\t/temporal/common/log/zap_logger.go:142\ngo.temporal.io/server/service/frontend.(*WorkflowHandler).PollWorkflowTaskQueue\n\t/temporal/service/fr ontend/workflowHandler.go:808\ngo.temporal.io/server/service/frontend.(*DCRedirectionHandlerImpl).PollWorkflowTaskQueue.func2\n\t/temporal/service/frontend/dcRed irectionHandler.go:540\ngo.temporal.io/server/service/frontend.(*NoopRedirectionPolicy).WithNamespaceRedirect\n\t/temporal/service/frontend/dcRedirectionPolicy.g o:118\ngo. temporal.io/server/service/frontend.(*DCRedirectionHandlerImpl).PollWorkflowTaskQueue\n\t/temporal/service/frontend/dcRedirectionHandler.go:536\ngo.tem poral.io/api/workflowservice/vl._WorkflowService_PollWorkflowTaskQueue_Handler.func1\n\t/go/pkg/mod/go.temporal.io/api@v1.7.0/workflowservice/vl/service.pb.go:11 40\ngo.temporal.io/server/common/authorization.(*interceptor).Interceptor\n\t/temporal/common/authorization/interceptor.go:152\ngoogle.golang.org/grpc.chainUnary Interceptors.func1.1\n\t/go/pkg/mod/google.golang.org/grpc@v1.42.0/server.go:1116\ngo.temporal.io/server/common/rpc/interceptor.(*NamespaceCountLimitInterceptor)
.Intercept\n\t/temporal/common/rpc/interceptor/namespace_count_limit.go:98\ngoogle.golang.org/grpc.chainUnaryInterceptors.func1.1\n\t/go/pkg/mod/google.golang.or g/grpc@v1.42.0/server.go:1119\ngo.temporal.io/server/common/rpc/interceptor.(*NamespaceRateLimitInterceptor).Intercept\n\t/temporal/common/rpc/interceptor/namesp ace_rate_limit.go:88\ngoogle.golang.org/grpc.chainUnaryInterceptors.func1.1\n\t/go/pkg/mod/google.golang.org/grpc@v1.42.0/server.go:1119\ngo.temporal.io/server/c ommon/rpc/interceptor.(*RateLimitInterceptor).Intercept\n\t/temporal/common/rpc/interceptor/rate_limit.go:83\ngoogle.golang.org/grpc.chainUnaryInterceptors.funci
.1\n\t/qo/pkg/mod/google.golang.org/grpc@v1.42.0/server.qo:1119\ngo.temporal.io/server/common/rpc/interceptor.(*NamespaceValidatorInterceptor).Intercept\n\t/temp oral/common/rpc/interceptor/namespace_validator.go:113\ngoogle.golang.org/grpc.chainUnaryInterceptors.func1.1\n\t/go/pkg/mod/google.golang.org/grpc@v1.42.0/serve r.go:1119\ngo.temporal.io/server/common/rpc/interceptor.(*TelemetryInterceptor).Intercept\n\t/temporal/common/rpc/interceptor/telemetry.go:108\ngoogle.golang.org /grpc.chainUnaryInterceptors.func1.1\n\t/go/pkg/mod/google.golang.org/grpc@v1.42.0/server.go:1119\ngo.temporal.io/server/common/metrics.NewServerMetricsContextIn jectorInterceptor.func1\n\t/temporal/common/metrics/grpc.go:66\ngoogle.golang.org/grpc.chainUnaryInterceptors.func1.1\n\t/go/pkg/mod/google.golang.org/grpc@v1.42
O/server.go:1119\ngo.temporal.io/server/common/rpc.ServiceErrorInterceptor\n\t/temporal/common/rpc/grpc.go:131\ngoogle.golang.org/grpc.chainUnaryInterceptors.fu nc1.1\n\t/go/pkg/mod/gooqle.golang.org/grpc@v1.42.0/server.go:1119\ngo.temporal.io/server/common/rpc/interceptor.(*NamespaceLoqInterceptor).Intercept\n\t/tempora 1/common/rpc/interceptor/namespace_logger.go:84\ngoogle.golang.org/grpc.chainUnaryInterceptors.func1.1\nlt/go/pkg/mod/google.golang.org/grpc@v1.42.0/server.go:11
19\ngoogle.golang.org/grpc.chainUnaryInterceptors.func1\n\t/go/pkg/mod/google.golang.org/grpc@v1.42.0/server.go:1121\ngo.temporal.io/api/workflowservice/vl._Work flowService_PollWorkflowTaskQueue_Handler\n\t/go/pkg/mod/go.temporal.io/api@v1.7.0/workflowservice/vl/service.pb.go:1142\ngoogle.golang.org/grpc.(*Server).proces sUnaryRPC\n\t/go/pkg/mod/google.golang.org/grpc@v1.42.0/server.go:1282\ngoogle.golang.org/grpc.(*Server).handleStream\n\t/go/pkg/mod/google.golang.org/qrpc@v1.42
O/server.go:1616\ngoogle.golang.org/grpc.(*Server).serveStreams.func1.2\n\t/go/pkg/mod/google.golang.org/grpc@v1.42.0/server.go:921"}
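For context, the worker process is set up more or less like this. It's a simplified sketch (the real code registers a few more workflows/activities, and ExecuteTrade/Quote are defined elsewhere), but the client options and the task queue match what shows up in the logs above:

package main

import (
	"log"

	"go.temporal.io/sdk/client"
	"go.temporal.io/sdk/worker"
)

func main() {
	// Same frontend address that the tctl health check above talks to.
	c, err := client.NewClient(client.Options{
		HostPort:  "temporal.b8s.biz:7233",
		Namespace: "default",
	})
	if err != nil {
		log.Fatalln("unable to create Temporal client:", err)
	}
	defer c.Close()

	// One worker polling the internal-executor task queue; a second replica
	// of this same process runs as the other worker pod.
	w := worker.New(c, "internal-executor", worker.Options{})
	w.RegisterWorkflow(ExecuteTrade) // workflow type from the logs, defined elsewhere
	w.RegisterActivity(Quote)        // activity type from the logs, defined elsewhere

	if err := w.Run(worker.InterruptCh()); err != nil {
		log.Fatalln("worker stopped:", err)
	}
}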
I can clearly see something is wrong: the Temporal worker can't poll for tasks, and the frontend has these context deadline exceeded errors. I have 2 workers and 2 frontends (Temporal installed via Helm), and I don't really have that many tasks, so I can't quite figure out where the 'jam' is coming from.
Is there a way to figure out why it's doing this? Additionally, is there a way to find out how resource-constrained the workers/frontend are? And is there any reason it could behave this way even though I don't have many tasks?
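In case it's relevant: the only thing I've tried so far for the resource question is looking at pod usage with metrics-server, roughly like this (pod/namespace names are placeholders, and I'm not sure this is even the right way to tell whether the frontend or matching pods are starved):
kubectl top pods -n <temporal-namespace>
kubectl describe pod <temporal-frontend-pod>   # check requests/limits, restarts, OOMKilled events
kubectl get pods -n <temporal-namespace> -o wide
But I'm not sure those numbers would show a slow dependency (e.g. the persistence store) as opposed to plain CPU/memory pressure.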