Workflow history in the UI throws an error about the max gRPC message limit

I’m getting an error related to history size in the Temporal UI.

8 RESOURCE_EXHAUSTED: Received message larger than max (12224559 vs. 4194304). method: getHistory, req: {"namespace":"default","execution":{"workflowId":"376cbe08-7d0c-419f-b1c9-3a069bb4df22","runId":"9eb2ead2-69bb-4e8d-986a-0865d941bd18"},"waitForNewEvent":true

A little bit about my workflow implementation:
The workflow itself doesn’t have a large history (about 200 events), but it consists of four steps, each of which starts a child workflow. Each child workflow performs some actions for a batch of applications and has around 9-20 thousand events.

I have two questions regarding the described issue:

  1. I’m wondering why the error occurs for the parent workflow, while the child workflows with much larger histories do not have the issue. Does the parent workflow somehow keep information for all of its children?
  2. Could the history size issue cause any problems with workflow behaviour (e.g. executing tasks, throwing an error, or stopping the flow), or is it just an issue with presenting the history in the UI?

Do you get the same error when running:

tctl workflow observe -wid 376cbe08-7d0c-419f-b1c9-3a069bb4df22 -rid 9eb2ead2-69bb-4e8d-986a-0865d941bd18

Temporal keeps a workflow history for your executions, and there is a 50K event limit (see the "Is there a limit to how long Workflows can run?" section in the docs). It’s recommended not to create very large histories, since they can lead to performance issues. If the 50K event limit is reached, your workflow execution will be terminated. To avoid this, you often need to use ContinueAsNew in your workflow code.

  1. Child invocation events (StartChildWorkflowExecutionInitiated, ChildWorkflowExecutionStarted, ChildWorkflowExecutionCompleted, and possibly Failed/Timeout) are recorded in the parent workflow's history.
    How many child workflows are you invoking in your workflow code?
  2. With very large histories you could run into performance issues with your worker(s).

@tihomir thank you for the quick answer
I don’t get an error when running the tctl workflow observe command. I also get a response without errors when running tctl workflow show for the workflow.
I know about the 50K event limit, and that’s why we divided the execution into child workflows.
The main workflow invokes 4 child workflows and has only 200 events, so I found it surprising that the UI can’t get its history due to the max gRPC message limit.

The main workflow invokes 4 child workflows

What is the size of the inputs to your workflow / child workflows / signals?


This error is most likely to occur when the payloads of events (inputs, outputs, markers, etc.) are large and a single response to a get-history request exceeds gRPC’s default 4 MB limit for incoming messages. The error is unrelated to the total number of history events, because the gRPC limit applies per incoming message (e.g. a response containing 100 events whose combined size is over 4 MB).
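To illustrate why event count and message size are independent, here is a minimal sketch in plain Python (not Temporal SDK code). The event dictionaries and the JSON-based size estimate are assumptions for illustration only; real history events are protobuf-encoded, so actual sizes will differ.

```python
import json

# gRPC's default limit for incoming messages; matches the 4194304 in the error.
GRPC_DEFAULT_MAX_BYTES = 4 * 1024 * 1024


def estimated_size(events):
    """Rough size of a batch of events: sum of JSON-encoded sizes in bytes."""
    return sum(len(json.dumps(e).encode("utf-8")) for e in events)


def fits_in_one_message(events, limit=GRPC_DEFAULT_MAX_BYTES):
    """True if the estimated batch size stays within the message limit."""
    return estimated_size(events) <= limit


# Hypothetical batch: only 100 events, each carrying a ~50 KB payload.
few_events_big_payloads = [
    {"eventId": i, "payload": "x" * 50_000} for i in range(100)
]
# Hypothetical batch: 10,000 events with tiny payloads.
many_events_small_payloads = [
    {"eventId": i, "payload": "ok"} for i in range(10_000)
]

print(fits_in_one_message(few_events_big_payloads))     # False
print(fits_in_one_message(many_events_small_payloads))  # True
```

The first batch has few events but exceeds the limit, while the second has many events and fits easily, which is the same pattern as a 200-event parent failing while 20K-event children load fine.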

You can increase the max by setting TEMPORAL_GRPC_MAX_MESSAGE_LENGTH; see the temporalio/web (Temporal Web UI) repository on GitHub. Please keep in mind that in this case the UI may become slower because of the larger network load.
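As a rough way to pick a value, the sketch below takes the message size reported in the error and rounds it up with some headroom. It assumes the variable takes a byte count (as the 4194304 default suggests); the helper name and the 1.5x headroom factor are hypothetical choices, not a recommendation from the Temporal docs.

```python
import math

MIB = 1024 * 1024


def suggested_max_message_length(observed_bytes, headroom=1.5):
    """Scale the observed size by a headroom factor, rounded up to whole MiB."""
    return math.ceil(observed_bytes * headroom / MIB) * MIB


# Size taken from the error: "Received message larger than max (12224559 vs. 4194304)"
observed = 12224559
limit = suggested_max_message_length(observed)

# Line that could be placed in the web UI's environment configuration.
print(f"TEMPORAL_GRPC_MAX_MESSAGE_LENGTH={limit}")  # TEMPORAL_GRPC_MAX_MESSAGE_LENGTH=18874368
```

Since histories can keep growing while a workflow runs, a value derived this way is only a starting point; reducing payload sizes addresses the cause rather than the symptom.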


Thank you for your answers. In my case, it seems that the large history size is actually due to the large size of the inputs to the child workflows and of their results.
