Spring Boot connected to Temporal server getting io.grpc.StatusRuntimeException: UNKNOWN: HTTP status code 204

I am consistently getting the errors below in the Spring Boot logs. The workflows are running fine, but I am not sure why these errors appear.

test-server    | 2023-05-02 22:28:24.913  WARN 1 --- [ce="default": 3] io.temporal.internal.worker.Poller       : Failure in poller thread Activity Poller taskQueue="operation_queue", namespace="default": 3
test-server    |
test-server    | io.grpc.StatusRuntimeException: UNKNOWN: HTTP status code 204
test-server    | invalid content-type: null
test-server    | trailers: Metadata(:status=204,server=nginx/1.22.1,date=Tue, 02 May 2023 22:28:24 GMT,grpc-status=14,grpc-message=unavailable)
test-server    | 	at io.grpc.stub.ClientCalls.toStatusRuntimeException(ClientCalls.java:271) ~[grpc-stub-1.53.0.jar:1.53.0]
test-server    | 	at io.grpc.stub.ClientCalls.getUnchecked(ClientCalls.java:252) ~[grpc-stub-1.53.0.jar:1.53.0]
test-server    | 	at io.grpc.stub.ClientCalls.blockingUnaryCall(ClientCalls.java:165) ~[grpc-stub-1.53.0.jar:1.53.0]
test-server    | 	at io.temporal.api.workflowservice.v1.WorkflowServiceGrpc$WorkflowServiceBlockingStub.pollActivityTaskQueue(WorkflowServiceGrpc.java:3801) ~[temporal-serviceclient-1.19.1.jar:na]
test-server    | 	at io.temporal.internal.worker.ActivityPollTask.poll(ActivityPollTask.java:100) ~[temporal-sdk-1.19.1.jar:na]
test-server    | 	at io.temporal.internal.worker.ActivityPollTask.poll(ActivityPollTask.java:40) ~[temporal-sdk-1.19.1.jar:na]
test-server    | 	at io.temporal.internal.worker.Poller$PollExecutionTask.run(Poller.java:298) ~[temporal-sdk-1.19.1.jar:na]
test-server    | 	at io.temporal.internal.worker.Poller$PollLoopTask.run(Poller.java:258) ~[temporal-sdk-1.19.1.jar:na]
test-server    | 	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) ~[na:na]
test-server    | 	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) ~[na:na]
test-server    | 	at java.base/java.lang.Thread.run(Unknown Source) ~[na:na]
test-server    |
test-server    | 2023-05-02 22:28:25.573  WARN 1 --- [ce="default": 4] io.temporal.internal.worker.Poller       : Failure in poller thread Activity Poller taskQueue="operation_queue", namespace="default": 4
test-server    |
test-server    | io.grpc.StatusRuntimeException: UNKNOWN: HTTP status code 204
test-server    | invalid content-type: null
test-server    | trailers: Metadata(:status=204,server=nginx/1.22.1,date=Tue, 02 May 2023 22:28:25 GMT,grpc-status=14,grpc-message=unavailable)
test-server    | 	at io.grpc.stub.ClientCalls.toStatusRuntimeException(ClientCalls.java:271) ~[grpc-stub-1.53.0.jar:1.53.0]
test-server    | 	at io.grpc.stub.ClientCalls.getUnchecked(ClientCalls.java:252) ~[grpc-stub-1.53.0.jar:1.53.0]
test-server    | 	at io.grpc.stub.ClientCalls.blockingUnaryCall(ClientCalls.java:165) ~[grpc-stub-1.53.0.jar:1.53.0]
test-server    | 	at io.temporal.api.workflowservice.v1.WorkflowServiceGrpc$WorkflowServiceBlockingStub.pollActivityTaskQueue(WorkflowServiceGrpc.java:3801) ~[temporal-serviceclient-1.19.1.jar:na]
test-server    | 	at io.temporal.internal.worker.ActivityPollTask.poll(ActivityPollTask.java:100) ~[temporal-sdk-1.19.1.jar:na]
test-server    | 	at io.temporal.internal.worker.ActivityPollTask.poll(ActivityPollTask.java:40) ~[temporal-sdk-1.19.1.jar:na]
test-server    | 	at io.temporal.internal.worker.Poller$PollExecutionTask.run(Poller.java:298) ~[temporal-sdk-1.19.1.jar:na]
test-server    | 	at io.temporal.internal.worker.Poller$PollLoopTask.run(Poller.java:258) ~[temporal-sdk-1.19.1.jar:na]
test-server    | 	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) ~[na:na]
test-server    | 	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) ~[na:na]
test-server    | 	at java.base/java.lang.Thread.run(Unknown Source) ~[na:na]
test-server    |
test-server    | 2023-05-02 22:28:27.204  WARN 1 --- [ce="default": 2] io.temporal.internal.worker.Poller       : Failure in poller thread Workflow Poller taskQueue="operation_queue", namespace="default": 2
test-server    |
test-server    | io.grpc.StatusRuntimeException: UNKNOWN: HTTP status code 204
test-server    | invalid content-type: null
test-server    | trailers: Metadata(:status=204,server=nginx/1.22.1,date=Tue, 02 May 2023 22:28:27 GMT,grpc-status=14,grpc-message=unavailable)
test-server    | 	at io.grpc.stub.ClientCalls.toStatusRuntimeException(ClientCalls.java:271) ~[grpc-stub-1.53.0.jar:1.53.0]
test-server    | 	at io.grpc.stub.ClientCalls.getUnchecked(ClientCalls.java:252) ~[grpc-stub-1.53.0.jar:1.53.0]
test-server    | 	at io.grpc.stub.ClientCalls.blockingUnaryCall(ClientCalls.java:165) ~[grpc-stub-1.53.0.jar:1.53.0]
test-server    | 	at io.temporal.api.workflowservice.v1.WorkflowServiceGrpc$WorkflowServiceBlockingStub.pollWorkflowTaskQueue(WorkflowServiceGrpc.java:3752) ~[temporal-serviceclient-1.19.1.jar:na]
test-server    | 	at io.temporal.internal.worker.WorkflowPollTask.doPoll(WorkflowPollTask.java:140) ~[temporal-sdk-1.19.1.jar:na]
test-server    | 	at io.temporal.internal.worker.WorkflowPollTask.poll(WorkflowPollTask.java:122) ~[temporal-sdk-1.19.1.jar:na]
test-server    | 	at io.temporal.internal.worker.WorkflowPollTask.poll(WorkflowPollTask.java:43) ~[temporal-sdk-1.19.1.jar:na]
test-server    | 	at io.temporal.internal.worker.Poller$PollExecutionTask.run(Poller.java:298) ~[temporal-sdk-1.19.1.jar:na]
test-server    | 	at io.temporal.internal.worker.Poller$PollLoopTask.run(Poller.java:258) ~[temporal-sdk-1.19.1.jar:na]
test-server    | 	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) ~[na:na]
test-server    | 	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) ~[na:na]
test-server    | 	at java.base/java.lang.Thread.run(Unknown Source) ~[na:na]

I was getting a different error when using HAProxy in front of Temporal with Spring Boot:

temporal io io.grpc.StatusRuntimeException: UNAVAILABLE: HTTP status code 504

I was able to resolve that one with HAProxy by increasing the timeouts:

global
   log stdout format raw local0
   maxconn 50000

defaults
   timeout connect 10000ms
   timeout client 75s
   timeout server 75s
   timeout http-request 75s
   mode http
   maxconn 3000

frontend stats
   bind *:8404
   http-request use-service prometheus-exporter if { path /metrics }
   stats enable
   stats uri /stats
   stats refresh 10s

frontend www
   mode http
   bind :7233 proto h2
   default_backend servers

backend servers
   mode http
   balance roundrobin
   server f1 temporal-frontend:7237 proto h2
   server f2 temporal-frontend2:7236 proto h2

The change was to increase the client and server timeouts from

   timeout client 60000ms
   timeout server 60000ms

to

   timeout client 75s
   timeout server 75s

I think there might be a similar property that needs to be set in nginx to solve this there.
See this issue in the repo: Nginx upstream timed out (110: Connection timed out) while reading response header from upstream · Issue #1 · tsurdilo/my-temporal-dockercompose · GitHub

Did you try setting keepalive_timeout for nginx?
I will update the HAProxy config in that repo to include the client and server timeout settings.

For nginx I increased keepalive_timeout to 600s, but the issue is still the same. I am still getting this error in Spring Boot:

 io.grpc.StatusRuntimeException: UNKNOWN: HTTP status code 204
commerce-be    | invalid content-type: null
commerce-be    | trailers: Metadata(:status=204,server=nginx/1.22.1,date=Thu, 15 Jun 2023 21:30:07 GMT,grpc-status=14,grpc-message=unavailable)
commerce-be    | 	at io.grpc.stub.ClientCalls.toStatusRuntimeException(ClientCalls.java:271) ~[grpc-stub-1.53.0.jar:1.53.0]
commerce-be    | 	at io.grpc.stub.ClientCalls.getUnchecked(ClientCalls.java:252) ~[grpc-stub-1.53.0.jar:1.53.0]
commerce-be    | 	at io.grpc.stub.ClientCalls.blockingUnaryCall(ClientCalls.java:165) ~[grpc-stub-1.53.0.jar:1.53.0]
commerce-be    | 	at io.temporal.api.workflowservice.v1.WorkflowServiceGrpc$WorkflowServiceBlockingStub.pollWorkflowTaskQueue(WorkflowServiceGrpc.java:3752) ~[temporal-serviceclient-1.19.1.jar:na]
commerce-be    | 	at io.temporal.internal.worker.WorkflowPollTask.doPoll(WorkflowPollTask.java:140) ~[temporal-sdk-1.19.1.jar:na]
commerce-be    | 	at io.temporal.internal.worker.WorkflowPollTask.poll(WorkflowPollTask.java:122) ~[temporal-sdk-1.19.1.jar:na]
commerce-be    | 	at io.temporal.internal.worker.WorkflowPollTask.poll(WorkflowPollTask.java:43) ~[temporal-sdk-1.19.1.jar:na]
commerce-be    | 	at io.temporal.internal.worker.Poller$PollExecutionTask.run(Poller.java:298) ~[temporal-sdk-1.19.1.jar:na]
commerce-be    | 	at io.temporal.internal.worker.Poller$PollLoopTask.run(Poller.java:258) ~[temporal-sdk-1.19.1.jar:na]
commerce-be    | 	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) ~[na:na]
commerce-be    | 	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker

The nginx container logs show:

temporal-nginx            | 172.23.0.2 - - [15/Jun/2023:21:31:07 +0000] "POST /temporal.api.workflowservice.v1.WorkflowService/PollWorkflowTaskQueue HTTP/2.0" 204 0 "-" "grpc-java-netty/1.53.0"
temporal-nginx            | 2023/06/15 21:31:07 [error] 22#22: *15 upstream timed out (110: Connection timed out) while reading response header from upstream, client: 172.23.0.2, server: , request: "POST /temporal.api.workflowservice.v1.WorkflowService/PollWorkflowTaskQueue HTTP/2.0", upstream: "grpc://172.23.0.16:7236", host: "temporal-nginx:7233"

I tried increasing other timeouts too:

http {
    grpc_connect_timeout 300;
    proxy_read_timeout 1d;
    proxy_connect_timeout 1d;
    proxy_send_timeout 1d;

    upstream frontend_hosts {
        server temporal-frontend:7237 max_fails=0;
        server temporal-frontend2:7236 max_fails=0;

        keepalive 200;
        keepalive_time 1d;
        keepalive_timeout 300;
        keepalive_requests 100000;
    }

    server {
        listen 7233 http2;
        location / {
            grpc_pass grpc://frontend_hosts;
            proxy_set_header Connection "";
            proxy_http_version 1.1;
            proxy_read_timeout 3600;
        }
    }
}

Hello,
I am seeing the same problem.
Did you resolve it?

Yes, with HAProxy I was able to make it work by increasing the timeouts, as shown in the solution above. But I was never able to get it working with nginx.

Thank you.

For me adding this configuration helped (nginx.conf → http section):

    grpc_connect_timeout 75s;
    grpc_read_timeout 300s;
    grpc_send_timeout 300s;
    client_body_timeout 300s;
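
A likely reason these settings help, as far as I can tell: a location that uses grpc_pass is handled by nginx's gRPC module, so the grpc_* timeouts apply to it, while the proxy_* timeouts tried earlier in this thread only affect proxy_pass locations. grpc_read_timeout defaults to 60s, and Temporal workers hold each poll request open for roughly that long when the task queue is empty, so nginx can give up before the frontend answers. That matches the "upstream timed out ... while reading response header from upstream" line in the nginx log above. Below is a minimal sketch of a complete nginx.conf with those directives in place. The hostnames and ports are copied from the config posted earlier; the empty events block and the trimmed upstream settings are my own assumptions, not a verified setup.

```nginx
# Sketch only: upstream hosts and ports come from the config posted above,
# timeout values from the reply that reported success.
events {}

http {
    # Timeouts for requests forwarded with grpc_pass.
    grpc_connect_timeout 75s;
    grpc_read_timeout 300s;    # should be longer than the worker long poll
    grpc_send_timeout 300s;
    client_body_timeout 300s;

    upstream frontend_hosts {
        server temporal-frontend:7237 max_fails=0;
        server temporal-frontend2:7236 max_fails=0;

        # Reuse upstream connections between polls.
        keepalive 200;
    }

    server {
        listen 7233 http2;

        location / {
            grpc_pass grpc://frontend_hosts;
        }
    }
}
```

The exact values matter less than the ordering: the read timeout on the proxy should be comfortably longer than the poll, which is the same idea as raising the HAProxy client and server timeouts from 60s to 75s.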