I recently experimented with converting high-workload queued jobs to Temporal workflows and activities.
One in particular involves processing very large CSV files by splitting the work across multiple activities.
The issue on my end is that the activity eventually re-attempts, but I am unsure what the cause could be.
I have posted the error below, but if my colleague runs the exact same code, the file processes correctly and the activities never retry in the first place.
We both use the same number of workers (1) and num_workers: 10 in .rr.yaml.
We both run the same Docker setup via docker compose, which defines the containers needed to run the application.
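For reference, the worker-count setting lives under the temporal section of the RoadRunner config; ours looks roughly like this (the address value is illustrative, not our actual endpoint):

```yaml
temporal:
  address: temporal:7233   # illustrative; whatever your Temporal frontend is
  activities:
    num_workers: 10        # the setting mentioned above
```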
Has anyone hit a similar issue? It seems like something to do with my setup; any help would be very much appreciated.
Not sure this is Temporal specific; it looks like it could be a resource utilization issue on your worker when processing the large file. Maybe it's running out of memory or disk space?
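If it is memory, it's worth checking whether the worker container has an explicit cap that's lower than the host's resources; with docker compose, a limit like this (service name and value are illustrative) would kill or stall a large-file run well before the host itself runs out:

```yaml
services:
  roadrunner:        # illustrative service name for the worker container
    mem_limit: 2g    # container is OOM-killed past this, regardless of host RAM
```

Two machines running "the same" compose file can still differ here if one has an override file or different Docker Desktop resource settings.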
One in particular involves processing very large CSV files by splitting them in multiple activities.
Can you give more details on how you have implemented this use case?
There’s something that is not working correctly on my machine specifically; the other machine the code is tested on actually has fewer resources than the one I am working on.
With 64GB RAM / 4TB disk (25GB RAM / 200GB disk to Docker):
Current usage: ~8GB RAM, ~1% CPU, 89GB storage
There’s a chance it is not Temporal specific. The CSV is essentially split into chunks, and indexes are kept for which rows each chunk should process.
The processing itself is the same as it was with the queue system we used before, except that with that queue system the job would complete and the file would be processed.
We don’t actually split the file itself. Instead, we create lightweight “chunk descriptors” that tell each activity which rows to process from the original file.
Step-by-Step Process
Initialization (happens once):
Read the CSV file to count total rows and identify any rows to skip
Create chunk metadata - just row numbers, not actual data
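A minimal sketch of that initialization step in Python (function and field names are illustrative, not our actual code):

```python
import csv


def build_chunk_descriptors(path, chunk_size=1000):
    """Scan the CSV once and emit lightweight chunk descriptors.

    Each descriptor holds only row numbers, never row data, so the
    payload handed to each activity stays tiny.
    """
    with open(path, newline="") as f:
        reader = csv.reader(f)
        next(reader)  # skip the header row
        total_rows = sum(1 for _ in reader)  # count data rows without storing them

    # One descriptor per chunk: half-open row ranges [start_row, end_row)
    return [
        {"start_row": start, "end_row": min(start + chunk_size, total_rows)}
        for start in range(0, total_rows, chunk_size)
    ]
```

Each descriptor can then be passed as the input to one activity invocation.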
We’re not copying the CSV data 40 times. Each activity streams directly from the original file and only reads its assigned rows.
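A sketch of what the activity-side reader might look like, streaming only its assigned slice (names are hypothetical; itertools.islice keeps memory flat regardless of file size):

```python
import csv
from itertools import islice


def process_chunk(path, start_row, end_row):
    """Stream only the assigned rows [start_row, end_row) from the original CSV.

    islice advances the reader lazily, so rows outside the assigned range
    are skipped without ever being materialized as a list.
    """
    processed = 0
    with open(path, newline="") as f:
        reader = csv.reader(f)
        next(reader)  # skip the header row
        for row in islice(reader, start_row, end_row):
            # ... real per-row work would happen here ...
            processed += 1
    return processed
```

Because every activity re-opens and scans the file from the start, later chunks pay a small skip cost, but no chunk ever holds more than one row in memory at a time.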
I am reviewing my setup again, but since we use Docker and the same OS, it should be essentially the same, I believe.
It looks a lot more like an issue with my environment than with the code, but I am unsure what exactly. If I find the answer, I will post it here in case anyone runs into a similar issue.