Of the options given, I think option 1 makes the most sense.
But you’ve already reserved your slot for this call via `wait_for_rate_limiter_access`. However, if you don’t want to keep that slot reserved during retry backoff, you need custom retry logic, which means you can’t really use Temporal’s built-in retry. Instead, set `maximum_attempts` to `1` in the `retry_policy` of the call and handle retries manually, doing whatever you need to do between attempts. Note that if you implement manual retries with a loop, each activity call adds to workflow history (not the case with the less flexible built-in Temporal retry).
You can put this in a `finally`. Just make sure you don’t swallow the outer exception by raising from inside the `finally`. Also be aware that if a workflow is terminated, the workflow code won’t run anymore, so `notify_rate_limiter_request_completion` may never get called after `wait_for_rate_limiter_access` does. Termination only occurs via a manual call or a workflow timeout (neither is set by default), so it is avoidable.
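A minimal sketch of that `try`/`finally` shape, assuming the two rate-limiter activities from your workflow plus a hypothetical downstream activity named `call_rate_limited_api`:

```python
from datetime import timedelta

from temporalio import workflow


@workflow.defn
class RateLimitedCallWorkflow:
    @workflow.run
    async def run(self, request: str) -> str:
        # Reserve a slot before doing the real work.
        await workflow.execute_activity(
            "wait_for_rate_limiter_access",
            start_to_close_timeout=timedelta(minutes=5),
        )
        try:
            return await workflow.execute_activity(
                "call_rate_limited_api",  # hypothetical downstream activity
                request,
                start_to_close_timeout=timedelta(minutes=1),
            )
        finally:
            # Runs on success or failure. Keep this from raising, or it will
            # mask the exception from the call above.
            await workflow.execute_activity(
                "notify_rate_limiter_request_completion",
                start_to_close_timeout=timedelta(minutes=1),
            )
```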
How long do you expect `wait_for_rate_limiter_access` to reasonably take? Putting the rate limiting near where the actual execution happens (i.e. at the top of the activity) has value. What do you mean by “blocking the worker”? The worker will stop picking up new activity tasks once `max_concurrent_activities` is reached, but having thousands of suspended `async def` activities is only a memory concern, so just set that number to as high a backlog as you’re willing to hold. Granted, you will want to heartbeat while waiting, which can have a cloud cost. You could even have a task queue per thing you’re rate limiting on (sounds like a domain here) and set max concurrent only as high as the likely rate limit. Granted, max concurrent is per worker and you may run many workers for high availability. But still, having max concurrent (the number of in-flight activities) near or not too far above your rate limit should be fine.
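As a rough sketch of that worker-side approach (the queue name, the cap of 200, and the rate-limiter check are all hypothetical placeholders):

```python
import asyncio

from temporalio import activity
from temporalio.client import Client
from temporalio.worker import Worker


@activity.defn
async def wait_for_rate_limiter_access(domain: str) -> None:
    """Wait until the rate limiter for this domain grants a slot."""
    while not await _try_acquire_slot(domain):
        # Heartbeat so the server knows this long wait is still alive;
        # pair this with a heartbeat_timeout on the activity options.
        activity.heartbeat()
        await asyncio.sleep(1)


async def _try_acquire_slot(domain: str) -> bool:
    # Hypothetical: ask your external rate limiter for a slot.
    return True


async def main() -> None:
    client = await Client.connect("localhost:7233")
    worker = Worker(
        client,
        task_queue="my-domain-rate-limited",  # hypothetical per-domain queue
        activities=[wait_for_rate_limiter_access],
        # Cap in-flight async activities somewhere near the expected rate
        # limit; note this cap is per worker process, not global.
        max_concurrent_activities=200,
    )
    await worker.run()


if __name__ == "__main__":
    asyncio.run(main())
```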
But option 1 also makes sense for slower rate limits.