Hi!
We are moving into production with Temporal and everything that comes with it. We are running a hybrid cloud solution with Nomad and Consul. The whole Temporal stack will be running on AWS.
Initially we are moving the current solution over to Temporal, which means we need to spawn ~2.6 million workflows to set everything up. We can spread this over a period of 1-2 weeks.
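
To put the migration in perspective, here is a rough calculation of the average start rate it implies. It assumes the ~2.6 million starts are spread evenly around the clock, which may not match how we actually run it; the function name is just for illustration.

```python
# Back-of-the-envelope start rate for the initial migration.
# Assumption: starts are spread evenly over the whole window, 24 hours a day.
TOTAL_WORKFLOWS = 2_600_000  # "~2.6 million" from the post


def starts_per_minute(total: int, days: float) -> float:
    """Average workflow starts per minute if `total` is spread evenly over `days`."""
    return total / (days * 24 * 60)


for days in (7, 14):
    per_min = starts_per_minute(TOTAL_WORKFLOWS, days)
    print(f"{days:>2} days: {per_min:6.1f} starts/min ({per_min / 60:.2f} starts/s)")
```

That works out to roughly 130 to 260 workflow starts per minute for the duration of the migration.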
Expected Load
We plan to scale up our usage of workflows, and most new development will spawn a lot of new workflows. For the coming year, however, this is the expected load.
The CPU load is not defined in the data because Nomad only sets a “minimum” CPU needed to run the allocation. Memory looks to be the bottleneck, not the CPU.
The AWS servers we are using have the following specs:
Memory: 62,851 MiB,
CPU: 49,600 MHz
The current load in the AWS Cluster:
102.41 GiB of 185.45 GiB,
23.25 GHz of 148.8 GHz
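
As a quick check on the "memory is the bottleneck" observation, the utilization implied by the numbers above can be computed like this. It is only a sketch based on the cluster totals quoted here, not on per-node data.

```python
# Cluster utilization from the figures quoted above.
MEM_USED_GIB, MEM_TOTAL_GIB = 102.41, 185.45
CPU_USED_GHZ, CPU_TOTAL_GHZ = 23.25, 148.8


def utilization(used: float, total: float) -> float:
    """Fraction of a resource that is currently allocated."""
    return used / total


mem_util = utilization(MEM_USED_GIB, MEM_TOTAL_GIB)
cpu_util = utilization(CPU_USED_GHZ, CPU_TOTAL_GHZ)
mem_free_gib = MEM_TOTAL_GIB - MEM_USED_GIB

print(f"memory: {mem_util:.1%} used, {mem_free_gib:.2f} GiB free")
print(f"cpu:    {cpu_util:.1%} used")
```

Memory is a bit over half allocated while CPU is at about a sixth, which is why memory is the resource we are watching.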
Totals
Metric | Value |
---|---|
Long workflows per month | 15 000 |
Short Activation Workflows per month | 76 800 |
Short other workflows per month | 15 750 |
Total New workflows per month | 107 550 |
New workflows per minute | ~ 2.49 |
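
For reference, this is how the per-minute figure is derived from the monthly totals. The 30-day month is an assumption; the dictionary keys are just labels for the rows above.

```python
# Steady-state start rate derived from the monthly totals in the table.
MONTHLY_WORKFLOWS = {
    "long": 15_000,
    "short_activation": 76_800,
    "short_other": 15_750,
}
MINUTES_PER_MONTH = 30 * 24 * 60  # assumption: 30-day month

total_per_month = sum(MONTHLY_WORKFLOWS.values())
per_minute = total_per_month / MINUTES_PER_MONTH

print(f"total new workflows per month: {total_per_month}")
print(f"new workflows per minute:      {per_minute:.2f}")
```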
Secondary actions: deactivations, shelves, etc.
Action | Workflows / action | Actions / month | Workflows / month |
---|---|---|---|
Suspend | 5 | 230 | 1 150 |
Resume | 5 | 160 | 800 |
Terminate | 5 | 2 760 | 13 800 |
Total | | | 15 750 |
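
The "Short other workflows per month" figure is the sum of this table. A small sketch of the arithmetic, with the values copied from the rows above:

```python
# (workflows per action, actions per month), copied from the table above.
SECONDARY_ACTIONS = {
    "Suspend": (5, 230),
    "Resume": (5, 160),
    "Terminate": (5, 2_760),
}


def workflows_per_month(actions: dict) -> dict:
    """Multiply workflows-per-action by actions-per-month for each action."""
    return {name: per_action * count for name, (per_action, count) in actions.items()}


per_action_totals = workflows_per_month(SECONDARY_ACTIONS)
secondary_total = sum(per_action_totals.values())

for name, total in per_action_totals.items():
    print(f"{name:<10} {total:>6}")
print(f"{'Total':<10} {secondary_total:>6}")
```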
The current infrastructure
Cassandra
Node | Memory | CPU | Storage | Server |
---|---|---|---|---|
node1 | 8G, 6gig heap, 2gig newsize | - | AWS Elastic | AWS XL |
node2 | 8G, 6gig heap, 2gig newsize | - | AWS Elastic | AWS XL |
node3 | 8G, 6gig heap, 2gig newsize | - | AWS Elastic | AWS XL |
Elastic
Node | Memory | CPU | Storage | Server |
---|---|---|---|---|
node1 | 8G, `-Xms6144m -Xmx6144m` | - | AWS Elastic | AWS XL |
node2 | 8G, `-Xms6144m -Xmx6144m` | - | AWS Elastic | AWS XL |
node3 | 8G, `-Xms6144m -Xmx6144m` | - | AWS Elastic | AWS XL |
Temporal
| Service | Replicas | Memory | CPU | Server |
|---|---|---|---|---|
| frontend | 2 | 512M | - | any |
| history | 2 | 1536M | - | any |
| matching | 2 | 512M | - | any |
| worker | 2 | 256M | - | any |
| web | 1 | 256M | - | any |
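
To give a sense of the total memory this setup reserves, here is a rough sum over the three tables. It assumes "M" means MiB and "8G" means 8 GiB per node, and it says nothing about whether this is already included in the current cluster load quoted earlier.

```python
# Memory reserved by the stack, summed from the tables above.
# Assumptions: "M" is MiB, "8G" is 8 GiB per node.
TEMPORAL_SERVICES = {  # service: (replicas, MiB per replica)
    "frontend": (2, 512),
    "history": (2, 1536),
    "matching": (2, 512),
    "worker": (2, 256),
    "web": (1, 256),
}
DATASTORE_NODES = {  # store: (nodes, GiB per node)
    "cassandra": (3, 8),
    "elasticsearch": (3, 8),
}

temporal_mib = sum(replicas * mib for replicas, mib in TEMPORAL_SERVICES.values())
datastore_gib = sum(nodes * gib for nodes, gib in DATASTORE_NODES.values())
total_gib = temporal_mib / 1024 + datastore_gib

print(f"Temporal services: {temporal_mib} MiB ({temporal_mib / 1024:.2f} GiB)")
print(f"Cassandra + Elasticsearch: {datastore_gib} GiB")
print(f"Total: {total_gib:.2f} GiB")
```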
Do you have any specific recommendations for a setup like this?