Best practices for running Gemini Batch API calls via Temporal

LeoW · July 25, 2025, 4:22pm

Hi Everyone,

I am using Temporal to query Gemini through its Batch API under a bursty load: often several thousand workflows/activities are started simultaneously, followed by long periods of near-zero activity.

I am wondering what the best approach is to stay within the Gemini Batch API limit of 100 concurrent batch requests. Because the limit is on concurrency rather than on requests per second, ‘MaxTaskQueueActivitiesPerSecond’ does not seem like a workable solution, from what I understand about it (I might be wrong?). We currently start one workflow per user action (in our case, e.g., a document upload), and each workflow then starts different child workflows and activities that call the Gemini (or other model provider) APIs. Based on this discussion, it seems that all Gemini API calls should in any case be made from activities.
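To make the concern about a per-second limit concrete, here is a rough back-of-the-envelope sketch in Python. It uses Little's law (average number in flight = start rate × average duration) and assumes a steady start rate and a fixed duration per batch request. The function names and the numbers (5 starts per second, 600 seconds per batch request) are made up for illustration and are not measured values.

```python
def steady_state_in_flight(starts_per_second: float, duration_seconds: float) -> float:
    """Little's law: average number in flight = start rate * time each item stays in flight."""
    return starts_per_second * duration_seconds


def max_start_rate_for_cap(concurrency_cap: int, worst_case_duration_seconds: float) -> float:
    """Highest steady start rate that keeps the in-flight count at or below the cap."""
    return concurrency_cap / worst_case_duration_seconds


if __name__ == "__main__":
    # Hypothetical numbers: 5 activity starts per second, each batch request takes 600 s.
    print(steady_state_in_flight(5, 600))
    # Start rate needed to stay under 100 concurrent requests at that duration.
    print(max_start_rate_for_cap(100, 600))
```

If the numbers were anywhere near this, a per-second limit would have to be set very low and tuned to the worst-case duration, which is why it looks like the wrong tool for a concurrency cap.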

I therefore currently see two ways of implementing this:

1. Put the batch requests into an activity and then limit the number of concurrent activities in the worker (e.g. via a slot supplier). I can do this by scheduling the activities on a dedicated task queue and creating a worker that listens on that task queue.
   However, my concern here is whether this limit is applied across all workers, or whether the API rate limits could still be exceeded once several workers are running. My second question is how I need to tune the different workers to ensure the best overall performance (any direction would be helpful).
   In this setup, would activities simply be queued, so that I need to ensure a sufficiently high timeout for each activity?
2. Use either a batch sliding window technique or a semaphore workflow to track the number of active queries and coordinate them globally. This seems like the more complicated solution but perhaps the more scalable one, depending on how well the first works.
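To show the behaviour I am after, here is a minimal in-process sketch in plain Python asyncio. It is not Temporal code and does not use any Temporal SDK API: `submit_batch` is a hypothetical stand-in for the activity that calls the Batch API, and `asyncio.Semaphore` stands in for whatever enforces the cap (a worker-level concurrency limit or a semaphore workflow). A burst of jobs is submitted at once, and the semaphore keeps at most `limit` of them in flight while the rest wait.

```python
import asyncio


class InFlightTracker:
    """Records how many jobs are in flight and the highest value seen."""

    def __init__(self) -> None:
        self.current = 0
        self.peak = 0

    def enter(self) -> None:
        self.current += 1
        self.peak = max(self.peak, self.current)

    def exit(self) -> None:
        self.current -= 1


async def submit_batch(job_id: int, limiter: asyncio.Semaphore, tracker: InFlightTracker) -> int:
    """Hypothetical stand-in for one batch request; waits for a free slot first."""
    async with limiter:
        tracker.enter()
        try:
            await asyncio.sleep(0.001)  # placeholder for the real API call
            return job_id
        finally:
            tracker.exit()


async def run_burst(num_jobs: int, limit: int) -> tuple[int, int]:
    """Start all jobs at once and return (completed count, peak in flight)."""
    limiter = asyncio.Semaphore(limit)
    tracker = InFlightTracker()
    results = await asyncio.gather(
        *(submit_batch(i, limiter, tracker) for i in range(num_jobs))
    )
    return len(results), tracker.peak


def simulate(num_jobs: int = 1000, limit: int = 100) -> tuple[int, int]:
    return asyncio.run(run_burst(num_jobs, limit))


if __name__ == "__main__":
    completed, peak = simulate()
    print(f"completed={completed}, peak in flight={peak}")
```

A semaphore like this only limits a single process, which is exactly my concern with the first option: with several workers, each one would hold its own limit unless something coordinates them globally.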

I wanted to ask for advice on which implementation is generally recommended by others who already have more Temporal experience.

Thank you in advance!
