Spring Boot connected to Temporal server getting io.grpc.StatusRuntimeException: UNKNOWN: HTTP status code 204

I am consistently getting the errors below in my Spring Boot logs. The workflows themselves are running fine, so I am not sure why these errors keep appearing.

test-server    | 2023-05-02 22:28:24.913  WARN 1 --- [ce="default": 3] io.temporal.internal.worker.Poller       : Failure in poller thread Activity Poller taskQueue="operation_queue", namespace="default": 3
test-server    |
test-server    | io.grpc.StatusRuntimeException: UNKNOWN: HTTP status code 204
test-server    | invalid content-type: null
test-server    | trailers: Metadata(:status=204,server=nginx/1.22.1,date=Tue, 02 May 2023 22:28:24 GMT,grpc-status=14,grpc-message=unavailable)
test-server    | 	at io.grpc.stub.ClientCalls.toStatusRuntimeException(ClientCalls.java:271) ~[grpc-stub-1.53.0.jar:1.53.0]
test-server    | 	at io.grpc.stub.ClientCalls.getUnchecked(ClientCalls.java:252) ~[grpc-stub-1.53.0.jar:1.53.0]
test-server    | 	at io.grpc.stub.ClientCalls.blockingUnaryCall(ClientCalls.java:165) ~[grpc-stub-1.53.0.jar:1.53.0]
test-server    | 	at io.temporal.api.workflowservice.v1.WorkflowServiceGrpc$WorkflowServiceBlockingStub.pollActivityTaskQueue(WorkflowServiceGrpc.java:3801) ~[temporal-serviceclient-1.19.1.jar:na]
test-server    | 	at io.temporal.internal.worker.ActivityPollTask.poll(ActivityPollTask.java:100) ~[temporal-sdk-1.19.1.jar:na]
test-server    | 	at io.temporal.internal.worker.ActivityPollTask.poll(ActivityPollTask.java:40) ~[temporal-sdk-1.19.1.jar:na]
test-server    | 	at io.temporal.internal.worker.Poller$PollExecutionTask.run(Poller.java:298) ~[temporal-sdk-1.19.1.jar:na]
test-server    | 	at io.temporal.internal.worker.Poller$PollLoopTask.run(Poller.java:258) ~[temporal-sdk-1.19.1.jar:na]
test-server    | 	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) ~[na:na]
test-server    | 	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) ~[na:na]
test-server    | 	at java.base/java.lang.Thread.run(Unknown Source) ~[na:na]
test-server    |
test-server    | 2023-05-02 22:28:25.573  WARN 1 --- [ce="default": 4] io.temporal.internal.worker.Poller       : Failure in poller thread Activity Poller taskQueue="operation_queue", namespace="default": 4
test-server    |
test-server    | io.grpc.StatusRuntimeException: UNKNOWN: HTTP status code 204
test-server    | invalid content-type: null
test-server    | trailers: Metadata(:status=204,server=nginx/1.22.1,date=Tue, 02 May 2023 22:28:25 GMT,grpc-status=14,grpc-message=unavailable)
test-server    | 	at io.grpc.stub.ClientCalls.toStatusRuntimeException(ClientCalls.java:271) ~[grpc-stub-1.53.0.jar:1.53.0]
test-server    | 	at io.grpc.stub.ClientCalls.getUnchecked(ClientCalls.java:252) ~[grpc-stub-1.53.0.jar:1.53.0]
test-server    | 	at io.grpc.stub.ClientCalls.blockingUnaryCall(ClientCalls.java:165) ~[grpc-stub-1.53.0.jar:1.53.0]
test-server    | 	at io.temporal.api.workflowservice.v1.WorkflowServiceGrpc$WorkflowServiceBlockingStub.pollActivityTaskQueue(WorkflowServiceGrpc.java:3801) ~[temporal-serviceclient-1.19.1.jar:na]
test-server    | 	at io.temporal.internal.worker.ActivityPollTask.poll(ActivityPollTask.java:100) ~[temporal-sdk-1.19.1.jar:na]
test-server    | 	at io.temporal.internal.worker.ActivityPollTask.poll(ActivityPollTask.java:40) ~[temporal-sdk-1.19.1.jar:na]
test-server    | 	at io.temporal.internal.worker.Poller$PollExecutionTask.run(Poller.java:298) ~[temporal-sdk-1.19.1.jar:na]
test-server    | 	at io.temporal.internal.worker.Poller$PollLoopTask.run(Poller.java:258) ~[temporal-sdk-1.19.1.jar:na]
test-server    | 	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) ~[na:na]
test-server    | 	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) ~[na:na]
test-server    | 	at java.base/java.lang.Thread.run(Unknown Source) ~[na:na]
test-server    |
test-server    | 2023-05-02 22:28:27.204  WARN 1 --- [ce="default": 2] io.temporal.internal.worker.Poller       : Failure in poller thread Workflow Poller taskQueue="operation_queue", namespace="default": 2
test-server    |
test-server    | io.grpc.StatusRuntimeException: UNKNOWN: HTTP status code 204
test-server    | invalid content-type: null
test-server    | trailers: Metadata(:status=204,server=nginx/1.22.1,date=Tue, 02 May 2023 22:28:27 GMT,grpc-status=14,grpc-message=unavailable)
test-server    | 	at io.grpc.stub.ClientCalls.toStatusRuntimeException(ClientCalls.java:271) ~[grpc-stub-1.53.0.jar:1.53.0]
test-server    | 	at io.grpc.stub.ClientCalls.getUnchecked(ClientCalls.java:252) ~[grpc-stub-1.53.0.jar:1.53.0]
test-server    | 	at io.grpc.stub.ClientCalls.blockingUnaryCall(ClientCalls.java:165) ~[grpc-stub-1.53.0.jar:1.53.0]
test-server    | 	at io.temporal.api.workflowservice.v1.WorkflowServiceGrpc$WorkflowServiceBlockingStub.pollWorkflowTaskQueue(WorkflowServiceGrpc.java:3752) ~[temporal-serviceclient-1.19.1.jar:na]
test-server    | 	at io.temporal.internal.worker.WorkflowPollTask.doPoll(WorkflowPollTask.java:140) ~[temporal-sdk-1.19.1.jar:na]
test-server    | 	at io.temporal.internal.worker.WorkflowPollTask.poll(WorkflowPollTask.java:122) ~[temporal-sdk-1.19.1.jar:na]
test-server    | 	at io.temporal.internal.worker.WorkflowPollTask.poll(WorkflowPollTask.java:43) ~[temporal-sdk-1.19.1.jar:na]
test-server    | 	at io.temporal.internal.worker.Poller$PollExecutionTask.run(Poller.java:298) ~[temporal-sdk-1.19.1.jar:na]
test-server    | 	at io.temporal.internal.worker.Poller$PollLoopTask.run(Poller.java:258) ~[temporal-sdk-1.19.1.jar:na]
test-server    | 	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) ~[na:na]
test-server    | 	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) ~[na:na]
test-server    | 	at java.base/java.lang.Thread.run(Unknown Source) ~[na:na]
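For context, the failing calls in the stack traces are pollActivityTaskQueue and pollWorkflowTaskQueue, which are long polls: the worker keeps each request open while it waits for a task, and the Java SDK's long-poll RPC timeout defaults to 70 seconds. Any proxy between the worker and the Temporal frontend therefore has to tolerate roughly that long with no response bytes. A minimal sketch of where that timeout lives in the SDK (the target address is just a placeholder for whatever proxy the workers dial):

import io.temporal.serviceclient.WorkflowServiceStubs;
import io.temporal.serviceclient.WorkflowServiceStubsOptions;

import java.time.Duration;

public class TemporalStubsConfig {

    // Placeholder target: point this at the proxy (nginx/HAProxy) fronting the frontends.
    static WorkflowServiceStubs createServiceStubs() {
        WorkflowServiceStubsOptions options =
                WorkflowServiceStubsOptions.newBuilder()
                        .setTarget("temporal-nginx:7233")
                        // PollWorkflowTaskQueue / PollActivityTaskQueue are long polls.
                        // 70 seconds is already the SDK default; it is spelled out here
                        // only to make explicit how long a poll request may sit idle
                        // before the SDK itself gives up and retries.
                        .setRpcLongPollTimeout(Duration.ofSeconds(70))
                        .build();
        return WorkflowServiceStubs.newServiceStubs(options);
    }
}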

I was getting a different error in Spring Boot when using HAProxy in front of Temporal:

io.grpc.StatusRuntimeException: UNAVAILABLE: HTTP status code 504

But I was able to resolve that one with HAProxy by increasing its timeouts:

global
   log stdout format raw local0
   maxconn 50000

defaults
   timeout connect 10000ms
   timeout client 75s
   timeout server 75s
   timeout http-request 75s
   mode http
   maxconn 3000

frontend stats
   bind *:8404
   http-request use-service prometheus-exporter if { path /metrics }
   stats enable
   stats uri /stats
   stats refresh 10s

frontend www
   mode http
   bind :7233 proto h2
   default_backend servers

backend servers
   mode http
   balance roundrobin
   server f1 temporal-frontend:7237 proto h2
   server f2 temporal-frontend2:7236 proto h2

Specifically, I increased the client and server timeouts from

   timeout client 60000ms
   timeout server 60000ms

to

   timeout client 75s
   timeout server 75s

I think there is probably an equivalent timeout property that needs to be set on the nginx side to solve this there as well (something that lets the poll request stay open past that ~70-second window).
See the repo issue: Nginx upstream timed out (110: Connection timed out) while reading response header from upstream · Issue #1 · tsurdilo/my-temporal-dockercompose · GitHub

Did you try setting keepalive_timeout for nginx?
I will update the haproxy config in that repo to include the client and server timeout settings.

For nginx I increased keepalive_timeout to 600s, but I still see the same issue. Spring Boot is still logging this error:

 io.grpc.StatusRuntimeException: UNKNOWN: HTTP status code 204
commerce-be    | invalid content-type: null
commerce-be    | trailers: Metadata(:status=204,server=nginx/1.22.1,date=Thu, 15 Jun 2023 21:30:07 GMT,grpc-status=14,grpc-message=unavailable)
commerce-be    | 	at io.grpc.stub.ClientCalls.toStatusRuntimeException(ClientCalls.java:271) ~[grpc-stub-1.53.0.jar:1.53.0]
commerce-be    | 	at io.grpc.stub.ClientCalls.getUnchecked(ClientCalls.java:252) ~[grpc-stub-1.53.0.jar:1.53.0]
commerce-be    | 	at io.grpc.stub.ClientCalls.blockingUnaryCall(ClientCalls.java:165) ~[grpc-stub-1.53.0.jar:1.53.0]
commerce-be    | 	at io.temporal.api.workflowservice.v1.WorkflowServiceGrpc$WorkflowServiceBlockingStub.pollWorkflowTaskQueue(WorkflowServiceGrpc.java:3752) ~[temporal-serviceclient-1.19.1.jar:na]
commerce-be    | 	at io.temporal.internal.worker.WorkflowPollTask.doPoll(WorkflowPollTask.java:140) ~[temporal-sdk-1.19.1.jar:na]
commerce-be    | 	at io.temporal.internal.worker.WorkflowPollTask.poll(WorkflowPollTask.java:122) ~[temporal-sdk-1.19.1.jar:na]
commerce-be    | 	at io.temporal.internal.worker.WorkflowPollTask.poll(WorkflowPollTask.java:43) ~[temporal-sdk-1.19.1.jar:na]
commerce-be    | 	at io.temporal.internal.worker.Poller$PollExecutionTask.run(Poller.java:298) ~[temporal-sdk-1.19.1.jar:na]
commerce-be    | 	at io.temporal.internal.worker.Poller$PollLoopTask.run(Poller.java:258) ~[temporal-sdk-1.19.1.jar:na]
commerce-be    | 	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) ~[na:na]
commerce-be    | 	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker

And in the nginx container logs:

temporal-nginx            | 172.23.0.2 - - [15/Jun/2023:21:31:07 +0000] "POST /temporal.api.workflowservice.v1.WorkflowService/PollWorkflowTaskQueue HTTP/2.0" 204 0 "-" "grpc-java-netty/1.53.0"
temporal-nginx            | 2023/06/15 21:31:07 [error] 22#22: *15 upstream timed out (110: Connection timed out) while reading response header from upstream, client: 172.23.0.2, server: , request: "POST /temporal.api.workflowservice.v1.WorkflowService/PollWorkflowTaskQueue HTTP/2.0", upstream: "grpc://172.23.0.16:7236", host: "temporal-nginx:7233"

I tried increasing the other timeouts too:

http {
    grpc_connect_timeout 300;
    proxy_read_timeout 1d;
    proxy_connect_timeout 1d;
    proxy_send_timeout 1d;

    upstream frontend_hosts {
        server temporal-frontend:7237 max_fails=0;
        server temporal-frontend2:7236 max_fails=0;

        keepalive 200;
        keepalive_time 1d;
        keepalive_timeout 300;
        keepalive_requests 100000;
    }

    server {
        listen 7233 http2;
        location / {
            grpc_pass grpc://frontend_hosts;
            proxy_set_header Connection "";
            proxy_http_version 1.1;
            proxy_read_timeout 3600;
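            # Note: proxy_read_timeout applies to proxy_pass, not grpc_pass. For gRPC
            # upstreams the read timeout is grpc_read_timeout (60s by default), which is
            # what the "upstream timed out while reading response header" error above
            # points at. Raising it (and grpc_send_timeout) past the ~70s long-poll
            # window would likely look like:
            # grpc_read_timeout 120s;
            # grpc_send_timeout 120s;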
        }