I am new to Temporal. In my POC, I’ve deployed a Temporal cluster on my Kubernetes cluster using the Helm charts from https://github.com/temporalio/helm-charts.git. I have also enabled scrape endpoints in my client code (both starter and worker, in Java).
Is there a way to distinguish metrics generated by the Temporal server from those generated by the client? Also, is there a list of all available metrics with a brief description of what each one measures?
@tihomir Is there a one-line description for each of these metrics? The list is huge, and we need to understand which ones are worth monitoring. Thanks.
That’s something that we should be adding to our docs soon.
To get a feel for which ones might be useful to you (including Grafana queries), see the SDK and server dashboard definitions.
This dashboard is also set up out of the box in the background checks learning path demo.
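Until the docs list is available, a rough way to tell the two groups apart is by metric name. The sketch below is only an illustration: it assumes SDK metrics are exported with a `temporal_` prefix and that everything else on the endpoint comes from somewhere else (server, JVM, and so on). Please verify that assumption against your own scrape output. The class name `MetricOrigin`, its helper methods, and the sample lines are all hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeSet;

class MetricOrigin {
    // Assumption: SDK metrics carry this prefix. Check your own /metrics output.
    static final String SDK_PREFIX = "temporal_";

    // Labels a metric name as "sdk" or "other" based on the assumed prefix.
    static String classify(String metricName) {
        return metricName.startsWith(SDK_PREFIX) ? "sdk" : "other";
    }

    // Pulls the metric name out of one line of Prometheus text exposition format.
    // Returns null for blank lines and for comment lines (# HELP, # TYPE).
    static String metricName(String line) {
        String trimmed = line.trim();
        if (trimmed.isEmpty() || trimmed.startsWith("#")) {
            return null;
        }
        int end = 0;
        while (end < trimmed.length()) {
            char c = trimmed.charAt(end);
            if (c == '{' || Character.isWhitespace(c)) {
                break; // the name ends where labels or the value begin
            }
            end++;
        }
        return trimmed.substring(0, end);
    }

    // Groups the distinct metric names found in a scrape body by assumed origin.
    static Map<String, TreeSet<String>> group(String scrapeText) {
        Map<String, TreeSet<String>> groups = new LinkedHashMap<>();
        groups.put("sdk", new TreeSet<>());
        groups.put("other", new TreeSet<>());
        for (String line : scrapeText.split("\n")) {
            String name = metricName(line);
            if (name != null) {
                groups.get(classify(name)).add(name);
            }
        }
        return groups;
    }

    public static void main(String[] args) {
        // Illustrative sample lines, not real output from any cluster.
        String sample = String.join("\n",
            "# TYPE temporal_request_total counter",
            "temporal_request_total{operation=\"StartWorkflowExecution\"} 3.0",
            "service_requests{operation=\"StartWorkflowExecution\"} 3",
            "persistence_requests{operation=\"CreateWorkflowExecution\"} 3");
        System.out.println(group(sample));
    }
}
```

In practice the scrape target is usually the more reliable signal: Prometheus attaches `job` and `instance` labels to every scraped series, so giving the server pods and your Java starter/worker separate scrape jobs lets you filter on `job` in Grafana without relying on name prefixes.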
Thanks for letting us know that this is coming to the docs soon. Several of my peers have been chatting about this and hunting through the code for what each metric means. I think a simple docs page listing all metrics with their meanings (and units) would be WONDERFUL. Please take this as a gentle “bump” that many Temporal users would love this documentation! (We can find what we need in the code today, but it is always a hunt.)
@dpincas totally agree, the team is looking into that and making sure the SDKs are aligned as well.
We are also looking into providing better out-of-the-box dashboards for SDK and server metrics.
Will make sure to post updates here.