We use Temporal as an orchestration platform for running workflows that consist of many small activities (i.e. we do not have long-running or CPU-intensive activities, if that matters).
In one stage of the workflow, we receive a bunch of URLs and run a scraping activity against each of them.
To speed up execution, I used asyncio.gather over an async function that looks roughly like this, only ever doing ~20 activities total:
async def process_foobar(input):
    ...
    res = await workflow.execute_activity(
        func, input,
        start_to_close_timeout=start_to_close, retry_policy=retry,
    )
    return res
I then fan out in my workflow using asyncio.gather.
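A minimal sketch of that fan-out, where urls stands in for the list we receive in that stage (asyncio.gather is supported in Temporal's deterministic workflow event loop):

    # fan out one activity per URL and wait for all of them
    results = await asyncio.gather(*(process_foobar(u) for u in urls))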
However, I have had unreliable results and have been unable to root-cause or debug them. The activities either never seem to get started, or they time out after far longer than the timeout I have configured for them.
I have set a large limit on concurrent activities on the worker and played around with those settings, but I am wondering whether there is simply a prescriptive method for doing this.
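For reference, the worker settings I have been tuning look roughly like this (the task queue name, the registered workflow/activity, and the concurrency value are placeholders, not my exact config):

    from temporalio.worker import Worker

    worker = Worker(
        client,
        task_queue="scrape-task-queue",   # placeholder name
        workflows=[ScrapeWorkflow],       # placeholder workflow
        activities=[scrape_url],          # placeholder activity
        max_concurrent_activities=100,    # the knob I have been adjusting
    )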
Is there something around activities being scheduled but somehow not being picked up by a worker, and then hitting a timeout even though they were never actually allocated/executing during that time?
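My (possibly wrong) understanding is that the relevant knobs here would be the schedule-to-start vs start-to-close timeouts on the activity call, i.e. something like the following (the timedelta values are illustrative, not what I run):

    res = await workflow.execute_activity(
        func, input,
        # fail fast if no worker picks the task up from the queue
        schedule_to_start_timeout=timedelta(seconds=30),
        # bound only the time spent actually executing on a worker
        start_to_close_timeout=timedelta(minutes=2),
        retry_policy=retry,
    )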
I am okay with resilient failures here; I would even prefer hedging, since scraping can be unreliable. Any insight would be helpful.
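By hedging I mean something like the sketch below: start two attempts for the same URL, take whichever finishes first, and cancel the other. This assumes asyncio.wait and task cancellation behave inside workflow code the way asyncio.gather does; scrape_url and the timeout are placeholders.

    # hypothetical hedged scrape: two concurrent attempts, first result wins
    t1 = asyncio.create_task(workflow.execute_activity(
        scrape_url, url, start_to_close_timeout=timedelta(minutes=1)))
    t2 = asyncio.create_task(workflow.execute_activity(
        scrape_url, url, start_to_close_timeout=timedelta(minutes=1)))
    done, pending = await asyncio.wait({t1, t2}, return_when=asyncio.FIRST_COMPLETED)
    for t in pending:
        t.cancel()  # drop the slower attempt
    result = done.pop().result()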