Hey Temporal community,
I’m currently working on a workflow that involves performing HTTP requests to multiple websites, each with its own rate limit. I’m facing a challenge in implementing a rate limiter service to ensure that the requests are made within the allowed limits, even in the presence of retries. Here’s a simplified representation of my workflow code:
```python
# workflow definition (simplified pseudocode)
activity1()
activity2()
send_http_request_to_site_activity('example.com')
activity_foo()
activity_bar()
```
During the execution of the `send_http_request_to_site_activity` activity, I need to check with the rate limiter service whether it's okay to proceed or whether I should wait. However, I want to avoid blocking the worker, since there could be a large number of requests in the execution queue. Additionally, I need to ensure that the rate limiter is informed when the request is completed, so it can allow the next HTTP request for that domain to be executed in turn.
I have more than 20,000 different websites to which I need to send HTTP requests, each with its own rate limit. (I use the Python SDK.)
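To make the per-domain constraint concrete, here is a minimal sketch of the kind of state the rate limiter service would need: one independent bucket per domain, with a non-blocking check so one saturated domain never delays the others. All names here (`DomainRateLimiter`, `try_acquire`, `wait_time`) are my own illustrations, not Temporal APIs:

```python
import time
from collections import defaultdict

class DomainRateLimiter:
    """One token bucket per domain, so 20,000+ domains stay independent.

    `rate` is requests per second per domain; `clock` is injectable so the
    behavior can be tested without sleeping. (Illustrative sketch only.)
    """

    def __init__(self, rate: float = 1.0, clock=time.monotonic):
        self.rate = rate
        self.clock = clock
        # earliest time the next request may be sent, per domain
        self._next_allowed = defaultdict(float)

    def try_acquire(self, domain: str) -> bool:
        """Non-blocking check: True if a request to `domain` may go out now."""
        now = self.clock()
        if now >= self._next_allowed[domain]:
            self._next_allowed[domain] = now + 1.0 / self.rate
            return True
        return False

    def wait_time(self, domain: str) -> float:
        """Seconds until the next request to `domain` is allowed (0 if now)."""
        return max(0.0, self._next_allowed[domain] - self.clock())
```

Because the check is non-blocking, a caller that gets `False` can ask for `wait_time(domain)` and schedule itself later instead of occupying a worker slot.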
I’ve attempted three approaches, but encountered some issues:
Using Asynchronous Activity Completion: In this approach, I modified the workflow code as follows:
```python
# workflow definition (simplified pseudocode)
activity1()
activity2()
wait_for_rate_limiter_access('example.com')            # uses Asynchronous Activity Completion; waits until the rate limiter service allows access
send_http_request_to_site_activity('example.com')
notify_rate_limiter_request_completion('example.com')  # notifies the rate limiter that the request is completed
activity_foo()
activity_bar()
```
This approach has two problems around retries. First, if `send_http_request_to_site_activity` fails and is retried, the `wait_for_rate_limiter_access` activity is not triggered again before the subsequent attempts, so retries bypass the rate limiter. Second, if a failure occurs between the request and the notification step, the `notify_rate_limiter_request_completion` activity is never executed, so the rate limiter never learns that the request finished.
Using Signals: I explored using signals to handle the rate limiter interaction, but I ran into the same issues as in the first approach.
Using an HTTP Proxy: As an alternative, I tried routing requests through an HTTP proxy that blocks each request according to the rate limit of its domain. However, this approach introduced new challenges. The proxy blocks the worker while a request waits, which makes it hard for the rate limiter service to handle a large number of pending workflows (e.g., more than 80,000). For example, I can't send more than 1 request per second to any single domain, but I can send more than 1,000 requests per second across different domains. Because the proxy blocks the worker, it only ever sees as many queued requests as there are worker processes: with 100 processes, only 100 requests reach the rate limiter at a time, and if all 100 happen to target a single domain, we face a long blocking time, since the rate limiter cannot see the requests for other domains still waiting to run. Additionally, since the proxy centralizes all HTTP traffic on a single server, bandwidth limits and distribution across multiple worker servers became problematic.
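The head-of-line blocking described above goes away if the rate limiter can see every pending request grouped by domain and dispatch round-robin across domains, instead of only seeing the handful of requests currently blocked in worker processes. A minimal sketch of that queue structure (names are illustrative, not a Temporal API):

```python
from collections import deque, OrderedDict

class PerDomainQueue:
    """One FIFO queue per domain, dispatched round-robin across domains.

    Unlike a blocking proxy that only sees as many requests as there are
    stalled workers, this structure sees every pending request, so one
    busy domain cannot starve the others. (Illustrative sketch only.)
    """

    def __init__(self):
        self._queues = OrderedDict()  # domain -> deque of pending requests

    def enqueue(self, domain: str, request) -> None:
        self._queues.setdefault(domain, deque()).append(request)

    def next_batch(self):
        """Take at most one request per domain, in round-robin order."""
        batch = []
        for domain in list(self._queues):
            q = self._queues[domain]
            batch.append((domain, q.popleft()))
            if not q:
                del self._queues[domain]
        return batch
```

With 100 pending requests for one domain and one each for 999 others, a batch still contains 1,000 requests spanning every domain, so overall throughput stays near the cross-domain limit.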
I would greatly appreciate your insights and suggestions on how to overcome these challenges and effectively implement a rate limiter service for my workflow activities.