Tuning your cluster, persistence, and your application workers (the processes that execute your workflow code) is important.
Typically you would stand up a cluster, deploy your workers, and then load test to measure overall performance against your defined criteria. The load test results can then lead to changes on the server side (static/dynamic config, number of server roles used, number of history shards defined, etc.), on the persistence side (scaling if necessary), and on the application side (scaling/tuning your workers, etc.). So this is an iterative process you have to go through when self-deploying.
To help you fine-tune things, Temporal server emits a ton of metrics. The first step is to expose those server metrics through config (let us know how you are deploying the server so we can provide more details).
Again through configuration, you can set the metrics format (Prometheus or statsd). You would then need to stand up, let's say, Prometheus to scrape the server metrics endpoint(s), and for example Grafana to view and alert on these metrics.
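Before wiring up Prometheus and Grafana, it can help to sanity-check that a metrics endpoint is really emitting samples. Below is a rough Go sketch of a simplified parser for the Prometheus text format that sums sample values per metric name. The function name `sumByName` and the `example_*` metric names are made up for illustration and are not actual Temporal metric names. In practice you would feed it the response body from an HTTP GET against whatever metrics address you configured.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// sumByName parses Prometheus text-format output and sums sample values
// per metric name across all label sets. This is a simplified parser:
// comment and blank lines are skipped, optional timestamps are ignored,
// and lines without a parseable value are dropped.
func sumByName(body string) map[string]float64 {
	totals := make(map[string]float64)
	for _, raw := range strings.Split(body, "\n") {
		line := strings.TrimSpace(raw)
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		// The metric name ends at the first '{' (labels) or space.
		name, rest := line, ""
		if i := strings.IndexAny(line, "{ "); i >= 0 {
			name, rest = line[:i], line[i:]
		}
		// Drop the label set, if any; the value follows the closing '}'.
		if j := strings.LastIndex(rest, "}"); j >= 0 {
			rest = rest[j+1:]
		}
		fields := strings.Fields(rest)
		if len(fields) == 0 {
			continue
		}
		v, err := strconv.ParseFloat(fields[0], 64)
		if err != nil {
			continue
		}
		totals[name] += v
	}
	return totals
}

func main() {
	// Hypothetical scrape output, not real Temporal metrics.
	sample := `# TYPE example_requests_total counter
example_requests_total{operation="StartWorkflow"} 12
example_requests_total{operation="PollTask"} 30
example_up 1
`
	for name, total := range sumByName(sample) {
		fmt.Printf("%s = %g\n", name, total)
	}
}
```

This is only a quick check that metrics are flowing; for real dashboards and alerting you would still rely on Prometheus queries in Grafana.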
On the SDK side you would also need to enable SDK metrics (see Go sample here), update your Prometheus scrape config accordingly, and then run queries in Grafana for those as well. The docs have a worker tuning guide that you can reference when you get there, and there is a nice post about task poller count here that's helpful.
Another thing to do is to monitor your deployment's memory and CPU utilization, like you would with any of your prod applications.
Hope this points you in the right direction. Let us know if you have more specific questions.