Get huge chunks of data from S3 / DynamoDB without using an activity, for use inside a workflow for analysis

Inside a workflow, getData brings in a huge load of data directly from S3 using a reference; we transform this data into the final output payload, and saveData sends that payload directly back to S3.

Currently getData and saveData are activities, and the transformation is a piece of code that runs as part of the workflow.

Is there a workaround to bypass Temporal's determinism check so we can run these data functions not as activities but as helper functions? We don't want to pollute and fill the Temporal Cloud database with the huge payloads flowing in and out of the activities.

Predefining these transform pieces of code is not possible.

Why not pass the S3 key to the activity, download the data inside the activity, run the transform there, and save the result back to S3? That way the only data persisted in the event history is the S3 key.
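To illustrate the suggestion above (sometimes called the claim-check pattern): only small references cross the workflow/activity boundary, while the activity does all the heavy I/O. A minimal, self-contained sketch in Java, where an in-memory Map stands in for S3 and the names (transformAndStore, the uppercase "transform") are hypothetical placeholders since the original code isn't shown; in real code the method would be a Temporal @ActivityMethod using an actual S3 client:

```java
import java.util.HashMap;
import java.util.Map;

public class TransformActivitySketch {
    // Fake S3 bucket for the sketch; real code would use an AWS S3 client.
    static final Map<String, String> s3 = new HashMap<>();

    // Activity input and output are small S3 keys, not the payload itself,
    // so Temporal's event history only ever records the keys.
    static String transformAndStore(String inputKey) {
        String data = s3.get(inputKey);       // "getData": download by reference
        String result = data.toUpperCase();   // placeholder for the real transform
        String outputKey = inputKey + "-output";
        s3.put(outputKey, result);            // "saveData": upload the result
        return outputKey;                     // only the key goes back to the workflow
    }

    public static void main(String[] args) {
        s3.put("input-1", "huge payload");
        String outKey = transformAndStore("input-1");
        System.out.println(outKey + " -> " + s3.get(outKey));
    }
}
```

With this shape the workflow stays deterministic for free: it only shuffles keys around, and no large payload is replayed from history.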

I think you want something like the file-processing sample: samples-java/core/src/main/java/io/temporal/samples/fileprocessing at main · temporalio/samples-java · GitHub