Dynamic rate limiting

I’m designing a web crawler, and one feature I’d like to implement is a per-domain rate limit that’s separate from the existing workflow or activity rate limits and is configurable dynamically.

I realise I could deploy a new worker for the domain; however, as a single worker can make many thousands of outbound HTTP requests, I’m looking for alternatives to this.

In the past I’ve used a throttler as middleware within the HTTP transport to rate limit per domain; perhaps this could be implemented via WorkflowInterceptors?

Is there an ideal “Temporal” approach here? I think I’m looking for a per-domain task queue rate limit but I’m not sure if it’s possible or how to achieve it using the Go SDK.

Thanks, Jon

Out of the box, Temporal provides rate limiting of a task queue. If a specific activity needs to be throttled, it should listen on a separate task queue.
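For reference, a minimal Go SDK sketch of task queue rate limiting (the task queue name and rate are illustrative):

```go
package main

import (
	"log"

	"go.temporal.io/sdk/client"
	"go.temporal.io/sdk/worker"
)

func main() {
	// Connect to the Temporal service with default local options.
	c, err := client.Dial(client.Options{})
	if err != nil {
		log.Fatalln("unable to create Temporal client:", err)
	}
	defer c.Close()

	// Rate-limit activity dispatch for this task queue. The limit is enforced
	// by the server across all workers polling the same queue.
	w := worker.New(c, "crawl-example-com", worker.Options{
		TaskQueueActivitiesPerSecond: 10, // illustrative value
	})
	// w.RegisterWorkflow(...) and w.RegisterActivity(...) would go here.

	if err := w.Run(worker.InterruptCh()); err != nil {
		log.Fatalln("worker exited:", err)
	}
}
```

Because the limit applies to the task queue as a whole, it acts as a cap across every worker polling that queue.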

I’m not sure if task queue rate limiting fits your use case. The missing piece of information is how many domains your crawler is expected to process. If the number of domains is bounded to something like 1k, then the task queue approach is the best one. If you need to process millions of web domains, then having millions of activity workers would be overkill.

For a very large number of domains, I would make each activity throttle its requests and limit the number of activities per domain through workflow code. I would consider having a workflow execution (instance) per domain in this case.

Hi Maxim, to clarify my requirements: I need to crawl a low number of domains (e.g. below your 1k bound), and the domains need to be determined at runtime.

Could I use a rate limiter such as uber-go/ratelimit (a blocking leaky-bucket implementation for Go) to block in the main per-domain workflow (the spider or crawler) without breaking workflow determinism?

This could ensure that only a bounded number of child workflows are started without requiring changes to the deployment of workflow executors, and Temporal’s task queue rate limit will ensure that the true maximum number of activities (e.g. outbound HTTP requests) can’t be exceeded.

Will this cause issues with replay? I’m assuming Temporal’s SDK still steps through the activity calls during replay (returning results from the event history), so the rate limit would apply while replaying.

Could “continue as new” help here, limiting the impact of replaying to the number of events in that invocation of the workflow?
I’ll need to use “continue as new” anyway to avoid going over the 50k event limit, as domains can have many more than 50k URLs.
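For context, the shape I have in mind is roughly the following sketch, with a hypothetical CrawlBatchWorkflow that processes a bounded batch of URLs per run and continues as new with the rest of the frontier:

```go
package crawl

import (
	"go.temporal.io/sdk/workflow"
)

// CrawlBatchWorkflow is a hypothetical crawler workflow that processes a
// bounded batch of URLs per run and then continues-as-new with the rest of
// the frontier, so each run's history stays well under the event limit.
func CrawlBatchWorkflow(ctx workflow.Context, frontier []string) error {
	const batchSize = 1000 // illustrative

	batch := frontier
	if len(batch) > batchSize {
		batch = frontier[:batchSize]
	}

	// ... execute crawl activities for `batch` here ...

	remaining := frontier[len(batch):]
	if len(remaining) == 0 {
		return nil
	}
	// Start a fresh run with an empty history, carrying the remaining URLs forward.
	return workflow.NewContinueAsNewError(ctx, CrawlBatchWorkflow, remaining)
}
```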

Thanks, Jon.

I think yes, more reading required.

Could I use a rate limiter such as uber-go/ratelimit (a blocking leaky-bucket implementation for Go) to block in the main per-domain workflow (the spider or crawler) without breaking workflow determinism?

I don’t think it is going to work, as such a library probably uses system time and native Go goroutine sleep instead of workflow.Sleep.
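A deterministic alternative is to do the throttling in workflow code with workflow.Sleep. A minimal sketch, assuming a hypothetical CrawlURL activity and an illustrative one-request-per-second budget:

```go
package crawl

import (
	"time"

	"go.temporal.io/sdk/workflow"
)

// CrawlURL is a hypothetical activity that fetches a single URL.
func CrawlURL(url string) error { return nil }

// ThrottledCrawlWorkflow spaces out activity starts with workflow.Sleep, which
// is recorded as a timer in history and is therefore deterministic on replay,
// unlike a wall-clock leaky-bucket limiter.
func ThrottledCrawlWorkflow(ctx workflow.Context, urls []string) error {
	ao := workflow.ActivityOptions{StartToCloseTimeout: time.Minute}
	ctx = workflow.WithActivityOptions(ctx, ao)

	const perRequestDelay = time.Second // crude 1 request/second budget, illustrative

	for _, u := range urls {
		if err := workflow.ExecuteActivity(ctx, CrawlURL, u).Get(ctx, nil); err != nil {
			return err
		}
		if err := workflow.Sleep(ctx, perRequestDelay); err != nil {
			return err
		}
	}
	return nil
}
```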

I need to crawl a low number of domains (e.g. below your 1k bound), and the domains need to be determined at runtime.

Then I would create a workflow per domain and let it start, from an activity, a worker that polls a task queue dedicated to that domain. That worker would use task queue rate limiting. The actual crawl activity invocations can be done from a child workflow that calls continue-as-new periodically.
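One way to sketch the worker-from-an-activity part (the activity names, per-domain task queue naming, and rate are all illustrative): a long-running activity hosts a rate-limited worker for the domain’s task queue and shuts it down when the calling workflow cancels it.

```go
package crawl

import (
	"context"
	"time"

	"go.temporal.io/sdk/activity"
	"go.temporal.io/sdk/client"
	"go.temporal.io/sdk/worker"
)

// CrawlURL is a hypothetical activity that fetches one URL for the domain.
func CrawlURL(ctx context.Context, url string) error { return nil }

// RunDomainWorker is a hypothetical activity that runs a rate-limited worker
// for a single domain's task queue until the activity is cancelled.
func RunDomainWorker(ctx context.Context, domain string) error {
	c, err := client.Dial(client.Options{})
	if err != nil {
		return err
	}
	defer c.Close()

	w := worker.New(c, "crawl-"+domain, worker.Options{
		TaskQueueActivitiesPerSecond: 10, // per-domain limit, illustrative value
	})
	w.RegisterActivity(CrawlURL)

	if err := w.Start(); err != nil {
		return err
	}
	defer w.Stop()

	// Heartbeat so Temporal knows the activity (and its embedded worker) is
	// alive, and stop the worker when the calling workflow cancels the activity.
	ticker := time.NewTicker(30 * time.Second)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			return ctx.Err()
		case <-ticker.C:
			activity.RecordHeartbeat(ctx)
		}
	}
}
```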

Understood, I think I could cheat and use workflow.SideEffect to wrap the calls to the limiter, a crude approach which should not impact replay.

I like this approach. It’s more complex in a GitOps deployment model (which we have), but we can automate the Git operations from within a workflow and then poll to check that the new worker is available before continuing.

Thanks for your help as ever 🙂

Understood, I think I could cheat and use workflow.SideEffect to wrap the calls to the limiter, a crude approach which should not impact replay.

Don’t do this as it is going to trip the deadlock detector.

I like this approach. It’s more complex in a GitOps deployment model (which we have), but we can automate the Git operations from within a workflow and then poll to check that the new worker is available before continuing.

When I say “start a worker per domain” I don’t mean that you have to start a worker process per domain. You can have many workers in a single process.

Understood.

Interesting, for some reason I thought it wasn’t desirable to have more than one worker per process.

To clarify, do you mean it’s safe to start a Worker instance from within a Workflow, calling wf.Run() with a workflow.Context to handle interrupts, with the required activities registered with it at runtime?

To clarify, do you mean it’s safe to start a Worker instance from within a Workflow, calling wf.Run() with a workflow.Context to handle interrupts, with the required activities registered with it at runtime?

No, as the worker implementation is not “workflow safe”. You want to start a worker from an activity instead.

Interesting, this is extremely flexible.

My assumptions:

  1. The activity should be called asynchronously?
  2. It’s the responsibility of the workflow that calls the activity to signal to the worker inside the activity to stop (during cancellation or when work is complete)?
  3. The activity should send heartbeats to indicate to Temporal that it is running and not stuck/deadlocked?
  4. That Temporal will handle restarting the worker etc if the activity is rescheduled/redeployed elsewhere?

  1. The Temporal Go SDK always calls activities asynchronously; ExecuteActivity returns a Future.
  2. I would recommend using the Session feature to route multiple activities to the same process. This way you can have one activity start a worker and another activity stop it (see the sketch after this list).
  3. The session itself is already heartbeating internally, so you can get a notification through the session context if the process is down.
  4. If you use a session, then the workflow has to recreate the session.
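
A minimal sketch of the session-based variant, assuming hypothetical StartDomainWorker and StopDomainWorker activities (workers hosting session activities need EnableSessionWorker set in worker.Options):

```go
package crawl

import (
	"time"

	"go.temporal.io/sdk/workflow"
)

// StartDomainWorker and StopDomainWorker are hypothetical activities that
// start and stop a worker for the domain's task queue on the host they run on.
func StartDomainWorker(domain string) error { return nil }
func StopDomainWorker(domain string) error  { return nil }

// PerDomainCrawlWorkflow uses the Session feature so that both activities are
// routed to the same worker process.
func PerDomainCrawlWorkflow(ctx workflow.Context, domain string) error {
	so := &workflow.SessionOptions{
		CreationTimeout:  time.Minute,
		ExecutionTimeout: 24 * time.Hour, // illustrative crawl duration
	}
	sessionCtx, err := workflow.CreateSession(ctx, so)
	if err != nil {
		return err
	}
	defer workflow.CompleteSession(sessionCtx)

	ao := workflow.ActivityOptions{
		StartToCloseTimeout: 24 * time.Hour,
		HeartbeatTimeout:    time.Minute,
	}
	sessionCtx = workflow.WithActivityOptions(sessionCtx, ao)

	// Start a worker for this domain's task queue on the session host.
	if err := workflow.ExecuteActivity(sessionCtx, StartDomainWorker, domain).Get(sessionCtx, nil); err != nil {
		return err
	}

	// ... drive the crawl here, e.g. via a child workflow that continues-as-new ...

	// Stop the worker on the same host once the crawl is done.
	return workflow.ExecuteActivity(sessionCtx, StopDomainWorker, domain).Get(sessionCtx, nil)
}
```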

Thanks @maxim, that’s all clear and really helpful.
