What is the best way to exchange large amounts of data between Activities without running into "Complete result exceeds size limit" error?

I am trying to pass data between multiple Temporal Activities within a Workflow, using the Go SDK. At the point where the activity fails, the data being passed serializes to a JSON object of roughly 2-5MB.

My Activity fails with the following NonRetryableFailure error:

Complete result exceeds size limit.

Additionally, at the workflow level, the following error is returned: ScheduleActivityTaskCommandAttributes.Input exceeds size limit.

Upon reading Find cause of "Complete result exceeds size limit" error, this looks to be the same or a similar issue - I may be hitting the 2MB BlobSizeLimitError.

A couple of questions here:

What is happening under the hood in Temporal, such that I would hit the 2MB BlobSizeLimit if I am passing in and returning a large object into the workflow.ExecuteActivity() function?

And secondly, what would you recommend that I do to resolve this error, if I need to pass relatively large amounts of data (~2MB to 20MB in size when serialized to json) between different Temporal Activities?

The code in the Temporal Workflow of the Activity that encounters the Complete result exceeds size limit. error is below.

var res *QueryResponse
if err := workflow.ExecuteActivity(ctx, RunQuery).Get(ctx, &res); err != nil {
    return err
}

What is happening under the hood in Temporal, such that I would hit the 2MB BlobSizeLimit if I am passing in and returning a large object into the workflow.ExecuteActivity() function?

Activity inputs and results are recorded as events in the workflow execution history, and Temporal enforces a size limit on each payload (the 2MB BlobSizeLimitError you found). When a payload exceeds that limit, the workflow is going to be blocked or the activity is going to fail with a non-retryable error, which is what you are seeing on both the activity result and the activity input.
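One way to see whether a result is approaching that limit is to measure its serialized size before returning it from the activity. A minimal sketch - the 2MB constant and the sample data are assumptions for illustration; check your server configuration for the actual limit:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// blobSizeLimit mirrors Temporal's default 2MB payload limit
// (assumption: verify against your server configuration).
const blobSizeLimit = 2 * 1024 * 1024

// payloadSize returns the JSON-serialized size of v in bytes, which
// approximates the payload size Temporal would record in history.
func payloadSize(v interface{}) (int, error) {
	b, err := json.Marshal(v)
	if err != nil {
		return 0, err
	}
	return len(b), nil
}

func main() {
	// A stand-in for a large query result.
	big := make([]string, 100000)
	for i := range big {
		big[i] = "row-data"
	}
	n, err := payloadSize(big)
	if err != nil {
		panic(err)
	}
	fmt.Printf("serialized size: %d bytes, over limit: %v\n", n, n > blobSizeLimit)
}
```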

And secondly, what would you recommend that I do to resolve this error, if I need to pass relatively large amounts of data (~2MB to 20MB in size when serialized to json) between different Temporal Activities?

There are two approaches:

  1. Store large objects in some external blob store like S3 and pass references as inputs and outputs of activities.
  2. Cache large objects in the process memory or on local disk. Use the Go SDK session feature to route activities to the same host that keeps the cache. In this case the workflow has to account for the situation when the host goes down and the whole sequence has to be redone on another host. See the fileprocessing sample, which demonstrates this pattern.

I ran into a similar issue where I have to grab a fairly large dataset (~15MB) at the start of a workflow with a db query activity, whose result I couldn't pass back.
To work around this, I used Redis to cache the data with a sensible TTL, and I passed the Redis key back to the workflow so that other activities could use the data.

Hope that helps :sunglasses:


Thank you for this advice. I will try these items.