How to detect a cron run timeout and alert?

Gordon_Sun · November 12, 2020, 11:37pm

I followed the HelloCron example here: https://github.com/temporalio/samples-java/blob/master/src/main/java/io/temporal/samples/hello/HelloCron.java
The cron runs every minute. I changed the execution and run timeouts to this:
.setWorkflowExecutionTimeout(Duration.ofMinutes(5))
.setWorkflowRunTimeout(Duration.ofSeconds(10))
I let it run for 2 minutes and then killed the worker and here is what I got

How can I detect that a run timeout happened and raise an alert? And what if I want to alert only when 2 consecutive runs have failed or timed out?

Thank you in advance!

samar · November 13, 2020, 12:24am

We emit a workflow_timeout metric tagged with namespace. Unfortunately, this is not hooked up for cron workflows yet. Issue #397 tracks this improvement. Once it is fixed, you should be able to use this metric to track cron workflows timing out.
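
On the "alert only when 2 consecutive runs fail or time out" part of the question: whichever signal ends up reporting run outcomes (the metric above, or something else that observes each cron run), the streak logic itself is small. Below is a minimal sketch in plain Java. The class name ConsecutiveFailureAlert and its record method are hypothetical, are not part of the Temporal SDK, and assume that some other code reports each run's outcome as a boolean.

```java
// Hypothetical helper, NOT part of the Temporal SDK.
// Assumption: the caller reports the outcome of each cron run, in order.
final class ConsecutiveFailureAlert {
    private final int threshold; // failures in a row needed to alert
    private int streak = 0;      // current count of consecutive failures

    ConsecutiveFailureAlert(int threshold) {
        if (threshold < 1) {
            throw new IllegalArgumentException("threshold must be >= 1");
        }
        this.threshold = threshold;
    }

    /**
     * Records one run outcome (a timeout counts as a failure).
     * Returns true when the failure streak has reached the threshold.
     * It keeps returning true for every further failure until a run succeeds.
     */
    boolean record(boolean runSucceeded) {
        if (runSucceeded) {
            streak = 0; // any success resets the streak
            return false;
        }
        streak++;
        return streak >= threshold;
    }
}
```

With a threshold of 2, a single timed-out run followed by a successful one never alerts; only the second failure in a row does. With a metrics backend, the same idea would usually be expressed as an alert rule over a time window rather than in application code.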

Gordon_Sun · November 13, 2020, 12:28am

Thank you very much, samar.
Is there a doc on how to use metrics to set up monitoring and alerts?
Or should I just learn one of the systems listed here? https://docs.temporal.io/docs/configure-temporal-server/#metrics

samar · November 13, 2020, 5:15pm

Lots of users are running Temporal with Prometheus as the metrics backend. Please refer to our helm-charts for documentation on setting it up. If you search the forum, you will find a lot of information on this topic. A few examples:

Gordon_Sun · November 13, 2020, 6:19pm

We are using Datadog. Is there any support for that?
A Google search shows nothing, and searching for Datadog in the forum returned nothing.
I’ll ask a separate question.
