gRPC message size limit

Hello,

We came across this error recently:

*serviceerror.ResourceExhausted=grpc: received message larger than max (4473689 vs. 4194304)

Looks like gRPC is limited to 4 MB by default.

It seems there is already a feature request to make this limit configurable.

Is this change on the near-term roadmap?

Note that this happened with workflows that had a lot of events. We are already trying to keep activity args and results as small as possible.
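One way to keep args small is to measure the serialized size before scheduling an activity. A stdlib-only sketch (Temporal's default data converter is JSON-based, so `json.Marshal` is only a rough proxy for the wire size, and the 1/2 safety factor is an arbitrary choice of ours):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// grpcDefaultMaxMsg is the default gRPC receive limit from the error above.
const grpcDefaultMaxMsg = 4 * 1024 * 1024 // 4194304 bytes

// argSizeOK returns the JSON-serialized size of arg and whether it stays
// safely under half the limit. The 1/2 safety factor is an assumption:
// history events wrap the raw payload in extra metadata.
func argSizeOK(arg interface{}) (int, bool) {
	b, err := json.Marshal(arg)
	if err != nil {
		return 0, false
	}
	return len(b), len(b) < grpcDefaultMaxMsg/2
}

func main() {
	n, ok := argSizeOK(map[string]string{"orderID": "niko-order-1100679853"})
	fmt.Println(n, ok)
}
```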

Thanks for the help!

Which API call gives this error?

Not an API call; it comes from a worker task. Here is the complete error message:

Failed to process workflow task.%!(EXTRA string=Namespace, string=default, string=TaskQueue, string=niko-worker, string=WorkerID, string=6@9631965fc1f2@, string=WorkflowType, string=ProcessOrder, string=WorkflowID, string=niko-order-1100679853, string=RunID, string=f6aaffcb-a2eb-4bae-9a59-b0192e46f560, string=Error, *serviceerror.ResourceExhausted=grpc: received message larger than max (6776550 vs. 4194304))

We can also see the same error when we scroll down through that workflow's history in the Web UI:

How many parallel activities or child workflows are you trying to execute?

In some steps of the workflow, we create up to 60 futures for activities and wait for them to complete.

We have no child workflows.

We have a separate goroutine listening for signals that processes them sequentially.

So you have 60 activities whose combined input is over 6 MB. I would try to limit the size of their inputs.

Looking into this, I believe the issue is not the 60 parallel activities, but rather the signal-listener goroutine.

The listener is in an infinite loop waiting for signals to process. For each signal, it evaluates changes to a struct and triggers an activity we call "Upsert" that sends the updated version of that struct to an external API. The activity receives the struct as an argument.

I created sample code that simulates this in my local environment:

1 - Add a goroutine with a listener on a signal channel that loops forever. A WaitGroup prevents the main workflow from completing.
2 - The signal listener handler executes an activity, passing a larger struct (e.g. 100 KB) as the argument.
3 - The activity returns nil.

I scripted a loop to send 500 signals. After some of the signals are sent, the worker suddenly stops processing events.

In the UI, the workflow status is Terminated

On the server, we can see this error:

temporal_1 | {"level":"error","ts":"2020-12-07T18:22:59.087Z","msg":"history size exceeds error limit.","service":"history","shard-id":1,"address":"172.18.0.3:7234","shard-item":"0xc0004a5580","component":"history-cache","wf-namespace-id":"b0908e9c-8fbe-49f2-88b5-18c06e448a1e","wf-id":"hello_world_workflowID","wf-run-id":"27e91634-bca7-4f40-81e2-c1450dc55c0a","wf-history-size":52466599,"wf-event-count":3628,"logging-call-at":"workflowExecutionContext.go:1207","stacktrace":"go.temporal.io/server/common/log/loggerimpl.(*loggerImpl).Error\n\t/temporal/common/log/loggerimpl/logger.go:138\ngo.temporal.io/server/service/history.(*workflowExecutionContextImpl).enforceSizeCheck\n\t/temporal/service/history/workflowExecutionContext.go:1207\ngo.temporal.io/server/service/history.(*workflowExecutionContextImpl).updateWorkflowExecutionAsActive\n\t/temporal/service/history/workflowExecutionContext.go:596\ngo.temporal.io/server/service/history.(*historyEngineImpl).updateWorkflowHelper\n\t/temporal/service/history/historyEngine.go:2438\ngo.temporal.io/server/service/history.(*historyEngineImpl).updateWorkflowExecutionWithAction\n\t/temporal/service/history/historyEngine.go:2392\ngo.temporal.io/server/service/history.(*historyEngineImpl).updateWorkflowExecution\n\t/temporal/service/history/historyEngine.go:2462\ngo.temporal.io/server/service/history.(*historyEngineImpl).RecordActivityTaskStarted\n\t/temporal/service/history/historyEngine.go:1308\ngo.temporal.io/server/service/history.(*Handler).RecordActivityTaskStarted\n\t/temporal/service/history/handler.go:333\ngo.temporal.io/server/api/historyservice/v1._HistoryService_RecordActivityTaskStarted_Handler.func1\n\t/temporal/api/historyservice/v1/service.pb.go:947\ngo.temporal.io/server/common/rpc.ServiceErrorInterceptor\n\t/temporal/common/rpc/grpc.go:100\ngo.temporal.io/server/api/historyservice/v1._HistoryService_RecordActivityTaskStarted_Handler\n\t/temporal/api/historyservice/v1/service.pb.go:949\ngoogle.golang.org/grpc.(*Server).processUnaryRPC\n\t/go/pkg/mod/google.golang.org/grpc@v1.33.2/server.go:1210\ngoogle.golang.org/grpc.(*Server).handleStream\n\t/go/pkg/mod/google.golang.org/grpc@v1.33.2/server.go:1533\ngoogle.golang.org/grpc.(*Server).serveStreams.func1.2\n\t/go/pkg/mod/google.golang.org/grpc@v1.33.2/server.go:871"}

The logical solution is to stop passing the 100 KB arg to the activity. The issue is that, in our use case, we really need to send the struct to the activity. Another approach would be to compress the arg, but if the event history keeps growing we will hit the limit again.
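For reference, compressing the arg would look roughly like this with only the standard library (compressArg/decompressArg are illustrative names; the activity would take the []byte and reverse the steps):

```go
package main

import (
	"bytes"
	"compress/gzip"
	"encoding/json"
	"fmt"
)

// compressArg JSON-encodes v and gzips the result before it is handed
// to the activity as a []byte argument.
func compressArg(v interface{}) ([]byte, error) {
	raw, err := json.Marshal(v)
	if err != nil {
		return nil, err
	}
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	if _, err := zw.Write(raw); err != nil {
		return nil, err
	}
	if err := zw.Close(); err != nil { // Close flushes the gzip footer
		return nil, err
	}
	return buf.Bytes(), nil
}

// decompressArg reverses compressArg inside the activity.
func decompressArg(b []byte, v interface{}) error {
	zr, err := gzip.NewReader(bytes.NewReader(b))
	if err != nil {
		return err
	}
	defer zr.Close()
	return json.NewDecoder(zr).Decode(v)
}

func main() {
	in := map[string]string{"status": "upserted"}
	b, _ := compressArg(in)
	var out map[string]string
	_ = decompressArg(b, &out)
	fmt.Println(out["status"])
}
```

This buys headroom on the per-message limit, but as noted above it only delays the history-size problem rather than solving it.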

@maxim glad to send the sample if needed (can't attach it here).

Thanks again.

Pedro Almeida

I'm confused about the issue you are having. Are you saying that the large gRPC payload is no longer an issue?

(500 signals + 500 activity invocations) * 100 KB = 100 MB,

which indeed exceeds the history size limit.

In your case I would recommend calling continue-as-new every 50-100 signals. That way the history size stays bounded and you can process an unlimited number of signals while keeping the 100 KB argument size.