I expect to have a lot of tasks in my service. How should I properly run the code that processes them?
Is it just a normal loop that executes an activity?
For example, say there are 10,000 jobs at a given moment. How can I tell that the system cannot keep up with this stream of jobs and needs an additional cluster?
Where can these metrics be viewed and measured, and what is the correct way to run the processing of the task list?
Tuning your cluster, persistence, and your application workers (the processes that execute your workflow code) is very important.
Typically you would stand up a cluster, deploy your workers, and then load test to measure your overall performance against your defined criteria. The results of your load testing can then lead to changes on the server side (static/dynamic config, number of server roles used, number of history shards defined, etc.), on the persistence side (scaling, if necessary), and on your application side (scaling / tuning your workers, etc.). So this is an iterative process you would have to go through when self-deploying.
To help you fine-tune things, Temporal server emits a ton of metrics. The first step is to expose those server metrics through config (let us know how you are deploying the server so we can provide more details).
The metrics format (Prometheus or StatsD) is also set through configuration. You would then need to stand up, for example, Prometheus to scrape the server metrics endpoint(s), and something like Grafana to view and alert on these metrics.
On the SDK side you would also need to enable SDK metrics (see Go sample here), update your Prometheus scrape config accordingly, and then run queries in Grafana for those as well. The docs have a worker tuning guide that you can reference when you get there, and there is a nice post about task poller count here that's helpful.
Another thing to do is to monitor your deployment's memory and CPU utilization, like you would with any of your prod applications.
Hope this points you in the right direction. Let us know if you have more specific questions.
Thank you!
I am in the early stages of learning and practicing. I have deployed the Docker image on a local machine and would like to gain experience before deploying a cluster.
Is it possible to do everything you described in terms of metrics on a local machine, with Temporal running in Docker?