What benefits does Temporal gain by using RPC over REST?
What was the dealbreaker?
The dealbreaker was the request multiplexing over a single connection that gRPC (and TChannel in the case of Cadence) provides out of the box. As Temporal workers rely on long poll operations to listen on task queues, multiplexing is essential to keep the number of TCP connections under control.
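To make that concrete, here is a minimal sketch (not from the original discussion) of the property being relied on: a single gRPC channel is typically backed by one TCP connection, and HTTP/2 multiplexing lets many concurrent calls, such as long polls, share it. The host and port are illustrative placeholders.

```java
import io.grpc.ManagedChannel;
import io.grpc.ManagedChannelBuilder;

public class SharedChannelSketch {
    public static void main(String[] args) {
        // One channel ~= one TCP connection in the common case.
        ManagedChannel channel = ManagedChannelBuilder
                .forAddress("temporal-frontend", 7233) // assumed host/port
                .usePlaintext()
                .build();

        // Every stub created from this channel shares the same connection;
        // each in-flight RPC (e.g. a long poll) is a separate HTTP/2 stream,
        // not a new socket, so the TCP connection count stays constant.
        // For example: WorkflowServiceGrpc.newBlockingStub(channel).

        channel.shutdown();
    }
}
```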
But that doesn’t mean REST cannot be added to Temporal. We are considering using gRPC’s REST support to expose some REST APIs.
Can you elaborate on what exactly you’re asking about? Is this regarding the gRPC migration or the architecture choice in general?
Just to mention that RPC doesn’t really contradict REST. I’m pretty sure you can implement REST on top of RPC.
Hey Maxim & Ryland,
Thanks for the quick reply. It’s about the general architecture choice. The gRPC migration seems like the right direction TBH, although my personal knowledge and experience with RPC is limited to theory and some experimentation. The reason I asked is that RPC is difficult to troubleshoot in general. We reduced our RPC usage due to regular segfaults, although that was PHP-based gRPC.
OK, here’s the full story.
We have a lot of workflows in our ecosystem. At first glance I’m tempted; I’d like to use it, if the features Temporal promises hold true in practice. Before I make the proposal to my team I want to understand how it works, what the bottlenecks are, and most importantly how to mitigate issues once it’s up and running. I’m planning to do some testing to address some of these questions. Some of them are answered in Maxim’s and Samar’s architecture presentations, but some parts are still unclear to me, especially how the system works under the hood, as I couldn’t find any in-depth architecture doc. All my observations so far are from the Temporal Java examples.
Scenario #1
Let’s say I have a workflow with 5 activities (sequential). Workflow code and activity code are packaged in one application. Activities are very simple: fetching some data (a few MB) from external storage, doing some calculation, then saving it back. I deployed the app (in k8s) with 3 workers and autoscaling set up to 5 workers. The workflow is triggered by another application upon receiving a message from Kafka.
- How do these workers get the task?
- Do they listen to the Temporal service? Or is it triggered by the Temporal service?
- While the workers are running, are they keeping an open connection?
- What type of connection is it & how are they managed? I’m trying to find out when it will run out of memory/ports/CPU or crash.
Now let’s say my workflow is triggered 5 times per second. Each workflow takes 1 second (0.2s per activity) to complete, assuming the happy path. With the initial setup, 3 active workers will be able to process 3 requests (out of 5) in the 1st second. Ignoring a few micro/milliseconds,
- What happens to the other 2 requests? My understanding is they will wait for one of the 3 running workers to be free, right?
- How do I use those 2 available but not yet running worker instances (autoscale: 5) that are ready to be scaled up?
- It seems like activities (part of the workflow mentioned above) must be accompanied by workflow code, right? I’ve read that activities can be deployed independently; I guess that’s a different use case than this.
- I assume activity methods are triggered by Temporal via RPC, and I assume that’s where it uses the long polling. Again, trying to find out when it will run out of memory/ports/CPU or crash. Correct me if I’m wrong.
- Worker registration and the workflow & activity stubs use class literals (Classname.class), which also suggests that those need to be deployed together. The same goes for child workflows. Correct me if I’m wrong, and what is the solution if they need to be scaled independently?
Scenario #2
A similar workflow as before, except activity 3 waits for an external signal. To send a signal from an external service, it needs to know the identity of the specific workflow, right? This seems like tight coupling, even though it’s an asynchronous signal. It would be ideal if the external system just sent a message to a message bus/queue and the workflow consumed it to proceed. What’s the recommended pattern in this case?
Some general questions & concerns:
- I installed Temporal using the Temporal Helm chart. It comes with a worker instance; is this just a default worker? Or is it different from the worker a client implements, e.g. the file processing example worker?
- Scalability: my initial estimation is 1000 different workflows excluding child workflows, each workflow can have ~15 activities, and each workflow will handle on average 1000 requests per second. Activities could run from seconds to a few hours. How realistic is this with the current state of Temporal?
- Multi data center setup and communication: not much documentation is available, especially k8s production best practices.
- I would like to see the key differences from other solutions out there. I read the Y Combinator discussion around it, but it would be nice to have a concrete comparison chart.
- Any comment/suggestion on overall load estimation & impact on the cluster? (I’m planning to deploy in a shared k8s cluster.)
- Is there a possibility of creating some resources to deep dive into the Temporal architecture?
Finally
I understand that a lot of this depends on how I design my overall workflow, but I also want to see how feasible it is with a naive approach! Unfortunately we don’t use Cassandra, and my proposal has to be solid enough to get approval, as at the moment that’s the only scalable option Temporal offers. I’m not sure MySQL can perform like Cassandra.
Thanks
Make sure not to pass multi-megabyte payloads as activity arguments and results.
- How do these workers get the task?
- Do they listen to the Temporal service? Or is it triggered by the Temporal service?
Workers listen on task queues using long polls against the service gRPC interface. Activity workers call PollActivityTaskQueue, and workflow workers call PollWorkflowTaskQueue.
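For reference, here is a rough sketch of how this looks from the Java SDK: you never call the Poll* APIs yourself; registering implementations and starting the factory is what kicks off the long polls. MyWorkflowImpl, MyActivitiesImpl, and the task queue name are placeholders, and the exact stub-creation call may differ by SDK version.

```java
import io.temporal.client.WorkflowClient;
import io.temporal.serviceclient.WorkflowServiceStubs;
import io.temporal.worker.Worker;
import io.temporal.worker.WorkerFactory;

public class WorkerStarter {
    public static void main(String[] args) {
        WorkflowServiceStubs service = WorkflowServiceStubs.newInstance();
        WorkflowClient client = WorkflowClient.newInstance(service);
        WorkerFactory factory = WorkerFactory.newInstance(client);

        Worker worker = factory.newWorker("MY_TASK_QUEUE"); // placeholder queue name
        worker.registerWorkflowImplementationTypes(MyWorkflowImpl.class); // placeholder types
        worker.registerActivitiesImplementations(new MyActivitiesImpl());

        // start() is what begins the PollWorkflowTaskQueue /
        // PollActivityTaskQueue long polls under the hood.
        factory.start();
    }
}
```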
- While the workers are running, are they keeping an open connection?
I believe gRPC keeps an open connection from the worker to the service that multiple long polls are multiplexed on.
- What type of connection is it & how are they managed? I’m trying to find out when it will run out of memory/ports/CPU or crash.
They are gRPC HTTP/2 connections, so any gRPC troubleshooting techniques apply.
- What happens to the other 2 requests? My understanding is they will wait for one of the 3 running workers to be free, right?
Workers can process multiple requests in parallel, so for 5 requests per second a single worker should be enough. It is possible to limit the number of parallel requests that a worker can handle through configuration, so you can recreate your scenario by limiting each worker to one parallel activity request at a time. In this case requests (activity tasks) are going to accumulate in the task queue, forming a backlog. Then they are either going to time out, or more worker capacity is added and the backlog is drained.
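As a minimal sketch (continuing the WorkerStarter example above; the option name is from the Java SDK’s WorkerOptions builder), limiting a worker to one parallel activity at a time would look roughly like this:

```java
import io.temporal.worker.Worker;
import io.temporal.worker.WorkerOptions;

// Continues the WorkerStarter sketch above; "factory" is the WorkerFactory.
WorkerOptions options = WorkerOptions.newBuilder()
        .setMaxConcurrentActivityExecutionSize(1) // one activity task at a time
        .build();
Worker worker = factory.newWorker("MY_TASK_QUEUE", options);
```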
- How do I use those 2 available but not yet running worker instances (autoscale: 5) that are ready to be scaled up?
You can use the ScheduleToStart (queue time) latency metric to make an autoscaling decision. When there is no backlog its value is very low; as the backlog grows, it increases.
- It seems like activities (part of the workflow mentioned above) must be accompanied by workflow code, right? I’ve read that activities can be deployed independently; I guess that’s a different use case than this.
“Must” is not really correct; the runtime topology is your choice. You can deploy all activities and workflows together as a monolith, but you can also deploy each activity type as a separate pool of workers. You can even route requests to specific workers (which may be useful if some data is cached in the worker process).
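A rough sketch of the split topology, assuming a hypothetical FileActivities interface: the workflow routes its activity invocations to a dedicated task queue, and a separate process hosts a worker that registers only the activity implementations on that queue.

```java
import io.temporal.activity.ActivityOptions;
import io.temporal.workflow.Workflow;
import java.time.Duration;

// Inside the workflow implementation: target the activity-only worker pool
// by task queue name. A separate deployment would run a worker that calls
// only registerActivitiesImplementations(...) for "FILE_ACTIVITY_QUEUE".
ActivityOptions options = ActivityOptions.newBuilder()
        .setTaskQueue("FILE_ACTIVITY_QUEUE") // assumed queue name
        .setStartToCloseTimeout(Duration.ofMinutes(5))
        .build();
FileActivities activities = Workflow.newActivityStub(FileActivities.class, options);
```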
- I assume activity methods are triggered by Temporal via RPC, and I assume that’s where it uses the long polling. Again, trying to find out when it will run out of memory/ports/CPU or crash. Correct me if I’m wrong.
Activity methods are not triggered via RPC. The worker gets activity tasks by long polling the service endpoint and then invokes the appropriate activity handler when it gets a task. The main reason for CPU/memory issues is having resource-intensive activities and not configuring rate and parallel execution limits correctly.
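For illustration, the kinds of limits being referred to look roughly like this in the Java SDK’s WorkerOptions builder (the numbers are arbitrary):

```java
import io.temporal.worker.WorkerOptions;

WorkerOptions options = WorkerOptions.newBuilder()
        // Cap on activities executing in parallel inside this worker process.
        .setMaxConcurrentActivityExecutionSize(20)
        // Rate limit applied across all workers polling the task queue.
        .setMaxTaskQueueActivitiesPerSecond(100)
        .build();
```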
- Worker registration and the workflow & activity stubs use class literals (Classname.class), which also suggests that those need to be deployed together. The same goes for child workflows. Correct me if I’m wrong, and what is the solution if they need to be scaled independently?
I’m not sure how a compile-time dependency relates to scaling. When the strongly typed API is used to invoke an activity or a child workflow, the only compile-time dependency is the interface that the activity implements. The implementation doesn’t need to be available to the process that hosts the calling workflow.
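A minimal sketch of that separation, using a hypothetical ReportActivities interface: only the annotated interface needs to be on the calling workflow’s classpath.

```java
import io.temporal.activity.ActivityInterface;

// Shared between the workflow worker and the activity worker. The class
// implementing this interface only needs to exist in the activity worker's
// deployment.
@ActivityInterface
public interface ReportActivities {
    String generateReport(String input);
}

// In the workflow, only the interface is referenced:
// ReportActivities activities =
//         Workflow.newActivityStub(ReportActivities.class, activityOptions);
```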
If you don’t want to have a compile-time dependency at all, it is possible to call activities and child workflows by their string names using an untyped interface.
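A sketch of the untyped style, with a placeholder activity name:

```java
import io.temporal.activity.ActivityOptions;
import io.temporal.workflow.ActivityStub;
import io.temporal.workflow.Workflow;
import java.time.Duration;

// Inside the workflow: invoke an activity purely by its string name,
// with no compile-time dependency on the interface or implementation.
ActivityOptions options = ActivityOptions.newBuilder()
        .setStartToCloseTimeout(Duration.ofMinutes(5))
        .build();
ActivityStub stub = Workflow.newUntypedActivityStub(options);
String result = stub.execute("GenerateReport", String.class, "data"); // assumed name
```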
The identity of a specific workflow is expected to be a business-level identifier that is assigned when the workflow is started. So it shouldn’t be a problem to construct this identifier from a Kafka message.
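As a sketch, assuming an order ID carried in the Kafka message and a hypothetical signal name, the consumer side could look like this:

```java
import io.temporal.client.WorkflowClient;
import io.temporal.client.WorkflowStub;

public class SignalForwarder {
    private final WorkflowClient client;

    public SignalForwarder(WorkflowClient client) {
        this.client = client;
    }

    // Called by the Kafka consumer for each message it receives.
    public void onMessage(String orderId, String payload) {
        // The workflow ID was constructed from the same business key
        // when the workflow was started, e.g. "order-" + orderId.
        WorkflowStub workflow = client.newUntypedWorkflowStub("order-" + orderId);
        workflow.signal("externalEvent", payload); // assumed signal name
    }
}
```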
We do have long-term plans to add native integration with Kafka and other messaging systems, as well as to add one-to-many notifications to Temporal.
Some general questions & concerns:
- I installed Temporal using the Temporal Helm chart. It comes with a worker instance; is this just a default worker? Or is it different from the worker a client implements, e.g. the file processing example worker?
This is an internal Temporal component that implements various system-level workflows (we eat our own dog food). Your workers are not related to it in any way.
- Scalability: my initial estimation is 1000 different workflows excluding child workflows, each workflow can have ~15 activities, and each workflow will handle on average 1000 requests per second. Activities could run from seconds to a few hours. How realistic is this with the current state of Temporal?
Do you mean 1000 workflow types, with each type started 1000 times per second? Are you asking about 1 million workflow starts per second with 15 million activity executions per second?
While I believe it should be possible to create a Temporal-based infrastructure to support such rates, it is far beyond anything we have tested.
- Multi data center setup and communication: not much documentation is available, especially k8s production best practices.
Agreed. We are working on the documentation.
- I would like to see the key differences from other solutions out there. I read the Y Combinator discussion around it, but it would be nice to have a concrete comparison chart.
Agreed. We are working on it.
- Any comment/suggestion on overall load estimation & impact on the cluster? (I’m planning to deploy in a shared k8s cluster.)
The bottleneck is usually the DB. As each use case is different, we recommend running your own load tests to find the right configuration.
- Is there a possibility of creating some resources to deep dive into the Temporal architecture?
Yes, we are going to create them.
If any of the documentation you promised is now available, please share the links.
Or if you could link to a ticket tracking progress, that would be just as helpful. It is not always clear to future readers whether they should start looking for the new documentation, or whether they should keep waiting.
Hey Gordon, sorry for the late response. We track this stuff internally, and there isn’t really a convenient or practical way to share it. I would say that our docs should be considered the best source of truth, and the search functionality works pretty well. We plan to continuously evolve and improve the docs and will try to be very communicative about the changes we make. That being said, I completely understand your feedback/need and will keep it in mind.