What metrics does Temporal expose out of the box, and how do I consume them in Prometheus?

madhu · August 3, 2020, 11:17am

Hi, there is not much documentation around metrics, and the statsd setup, which was removed, also seemed very complex.

I want to understand:
a) What metrics does Temporal expose by default?
b) Are the metrics namespace-specific?
c) Can I get queue/task list specific metrics?
d) How do I consume them in Prometheus?
e) If I am to develop custom metrics, what’s the best way? Should those be activities in workflows, or interceptors?

samar · August 4, 2020, 5:59pm

Hey @madhu,
Yes, this is an area where we lack public documentation at the moment, but it is pretty high on our list of priorities and we plan to address it soon.

Temporal server reports a wide variety of metrics to help operators get visibility into the cluster and set up alerts. We use tally for reporting metrics from the application, and it supports multiple backends such as Prometheus, StatsD, and M3. We generally recommend running Temporal with the Prometheus backend, and we plan to provide PromQL dashboards to the community very soon. Here is a dashboard repo which we started recently. We are iterating on it pretty heavily and it is not ready for production use yet, but you can definitely use it as a reference to build your own dashboards.

All the metrics emitted by the server are listed in defs.go. So if you see that something is missing from the dashboards, you can use defs.go as a reference.

We have provided a development config which shows how to run the server with Prometheus as the backend. You can also check out our helm chart, which has a section on how to run Temporal with Prometheus as the metrics backend.
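
For orientation, here is a minimal sketch of what the metrics section of a server config can look like when Prometheus is the backend. The key names and the listen address are assumptions based on the general shape of the development config, so verify them against the config file shipped with your server version.

```yaml
# Sketch of the metrics section of a Temporal server config file.
# Key names and the address are assumptions; check your version's config.
global:
  metrics:
    prometheus:
      timerType: "histogram"           # report timers as Prometheus histograms
      listenAddress: "127.0.0.1:8000"  # address where /metrics is served
```

With a section like this in place, Prometheus can scrape the server at the configured address under the `/metrics` path.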

madhu · August 4, 2020, 6:40pm

Thanks much @samar, I will check these links and get back. Really appreciate it.

Sandeep_Paul · September 7, 2020, 4:05pm

Hi @samar,
In the context of the .29 helm charts, the config has:

datasources:
- name: TemporalMetrics
  type: prometheus
  url: http://{{ .Release.Name }}-prometheus-server
  access: proxy
  isDefault: true

What does “url: http://{{ .Release.Name }}-prometheus-server” stand for, and how is it resolved?

I am asking because I have installed the Temporal helm charts (tag .29) in namespace X, and we have the Prometheus operator and Grafana in another namespace, Y.
I’m not getting the datasources in the Grafana UI.

Please let me know if I am missing something, or how I should proceed to use the existing Prometheus operator.

regards
Sandeep

madhu · September 7, 2020, 6:21pm

Basically, your question is how to configure “bring your own Prometheus”.
I am not too sure, but each component/role has a prometheus section, right? Would you not be able to provide the Prometheus endpoint there?

  frontend:
    # replicaCount: 1
    service:
      type: ClusterIP
      port: 7233
    metrics:
      annotations:
        enabled: true
      serviceMonitor: {}
      # enabled: false
      prometheus: {}  # HERE GOES YOUR STUFF??

derek · September 9, 2020, 2:56am

Hi @Sandeep_Paul - the configs you are talking about are for connecting the Grafana deployed by our helm chart to the Prometheus deployed by our helm chart.

If you want to bring your own Prometheus and your own Grafana, you can make use of our dashboards, but you’ll use your existing Prometheus as the datasource and configure it to scrape metrics in the namespace where Temporal is deployed.
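
As a rough sketch, a Grafana datasource provisioning entry pointing at an existing Prometheus could look like the following. The service name, namespace, and port in the URL are hypothetical placeholders; substitute the address of your own Prometheus service.

```yaml
# Grafana datasource provisioning file (sketch).
apiVersion: 1
datasources:
  - name: TemporalMetrics
    type: prometheus
    # Hypothetical in-cluster address: <service>.<namespace>.svc:<port>
    url: http://prometheus-operated.monitoring.svc:9090
    access: proxy
    isDefault: false
```

The datasource name matches the one used in the chart's config, on the assumption that imported dashboards reference it by that name.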

Depending on how you have that set up, things to check include: Prometheus is configured to scrape metrics in your Temporal namespace, your RBAC settings allow Prometheus to scrape metrics in that namespace, and any annotations you need on the Temporal deployments are set.

With the Prometheus operator, there’s a note in our default values file about setting serviceMonitor to enabled, which should create the required resources for you.

When using your own Prometheus/Grafana, also install the helm chart with prometheus.enabled=false and grafana.enabled=false.
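
Put together, a values override along these lines is one way to express that. The `prometheus.enabled` and `grafana.enabled` keys are the ones mentioned above; the `server.metrics.serviceMonitor.enabled` path is an assumption about where the chart keeps that setting, so confirm it against the default values file for your chart version.

```yaml
# Values override for the Temporal helm chart (sketch).
server:
  metrics:
    serviceMonitor:
      enabled: true   # assumed key path; only relevant with the Prometheus operator
prometheus:
  enabled: false      # do not deploy the chart's bundled Prometheus
grafana:
  enabled: false      # do not deploy the chart's bundled Grafana
```

Saved to a file, this can be passed to `helm install` with the `-f` flag instead of individual `--set` arguments.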

sonyantony · September 29, 2021, 11:20pm

Are Prometheus scrape endpoints enabled by default on the Temporal server in both docker-compose and the helm charts?
If yes, what is the URL for scraping?
If no, how can we enable it?

tihomir · September 30, 2021, 3:18pm

Prometheus is not set up by default in the docker compose files.
To add it:

Enable it in temporal->environment by adding:

- PROMETHEUS_ENDPOINT=0.0.0.0:8000

Add port 8000 in temporal->ports by adding:

- 8000:8000

Check the metrics on:

http://localhost:8000/metrics
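
Combined, the steps above amount to something like the sketch below, followed by a matching Prometheus scrape job. The job name and the `localhost:8000` target are assumptions for a Prometheus running on the host; if Prometheus runs inside the same compose network, the target would be the service name instead.

```yaml
# docker-compose.yml (fragment): expose the server's metrics endpoint
services:
  temporal:
    environment:
      - PROMETHEUS_ENDPOINT=0.0.0.0:8000
    ports:
      - "8000:8000"
---
# prometheus.yml (fragment): scrape that endpoint
scrape_configs:
  - job_name: temporal              # hypothetical job name
    metrics_path: /metrics
    static_configs:
      - targets: ["localhost:8000"] # assumes Prometheus runs on the host
```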

sonyantony · October 3, 2021, 4:48am

Thank you, Tihomir. It worked!
(I had tried this using port 9090 before I posted the question here. Somehow it did not work; I need to check why.)

whitecrow · August 5, 2022, 2:02am

Hi @samar,
Is there any update on the metrics documentation?

tihomir · August 5, 2022, 2:58am

You can see the SDK metrics docs here. Server metrics docs are, AFAIK, still in the works and should be available in the near future.
