What is the definition of the metric task_attempt_bucket?

I understand how Prometheus and bucket metrics work. However, I would like to ask about the following metric: task_attempt_bucket{component="history",instance="",le="0.001",namespace="all",operation="TimerActiveTaskDeleteHistoryEvent"} = 55,

what is the bucket of this metric
what is the unit of label “le”
and what is the unit of the metric value 55 in this case

Thanks and Regards,
Yu Yang

what is the bucket of this metric

It's a histogram metric. Each time series is one bucket of the histogram, and the upper bound of that bucket is given by its “le” (less than or equal to) label.

what is the unit of label “le”

le is the “less than or equal to” bucket label. It is the inclusive upper bound of the bucket and can range from 0 to infinity (+Inf).

and what is the unit of the metric value 55 in this case

It is a running (cumulative) count of the task attempt observations that fall into the histogram bucket with an upper bound of 0.001 for this metric.
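To make the cumulative counting concrete, here is a minimal Python sketch of how observations land in “le” buckets. It is not Temporal's or Prometheus's actual implementation; the function name, the bucket bounds, and the sample values are all made up for illustration.

```python
import math

def cumulative_buckets(observations, upper_bounds):
    """Return {le: number of observations <= le}, including the +Inf bucket."""
    bounds = sorted(upper_bounds) + [math.inf]
    return {le: sum(1 for v in observations if v <= le) for le in bounds}

# Hypothetical attempt counts, one observation per processed task.
attempts = [1, 1, 2, 1, 5, 12, 1, 3]

# Hypothetical bucket upper bounds (not Temporal's real ones).
buckets = cumulative_buckets(attempts, [1, 2, 5, 10])

for le, count in buckets.items():
    print(f'le="{le}" -> {count}')
```

Each bucket also includes everything counted in the smaller buckets, so the values never decrease as “le” grows.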

Typically you would use this histogram metric with a rate function over a set period of time, for example:

sum by (namespace, operation) (rate(task_attempt_bucket{le="+Inf"}[15m]))

Hi tihomir, many thanks for the reply. My question is about the semantic meaning of this metric. I would like to understand the unit of the label “le”: in this case, for le="0.001" or le="50", what is the unit of the numbers 0.001 and 50? Is it seconds, a count, or something else?

This histogram metric measures the distribution of task attempts.

The unit of “le” is a count (number of attempts), so for a similar example:

task_attempt_bucket{namespace="default",operation="TimerActiveTaskActivityTimeout",le="1000"} 82

means that there are 82 recorded observations with a task_attempt count <= 1000.

If you want the total count, you can get it from task_attempt_count, which should have the same value as the task_attempt_bucket{..., le="+Inf"} bucket.
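As a rough illustration of that relationship, the sketch below turns cumulative bucket values into per-bucket counts and checks that they add up to the +Inf bucket. The scraped numbers (other than the 82 from the example above) and the helper name are hypothetical.

```python
def per_bucket_counts(cumulative):
    """Convert cumulative {le: count} values into per-bucket counts."""
    # float("+Inf") parses to infinity, so the +Inf bucket sorts last.
    ordered = sorted(cumulative.items(), key=lambda kv: float(kv[0]))
    result = {}
    previous = 0
    for le, count in ordered:
        result[le] = count - previous
        previous = count
    return result

# Hypothetical scrape of task_attempt_bucket for a single operation.
scraped = {"1": 60, "2": 75, "1000": 82, "+Inf": 82}

counts = per_bucket_counts(scraped)
total = scraped["+Inf"]  # expected to match task_attempt_count

print(counts)
print(sum(counts.values()) == total)
```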

I believe it uses the default Prometheus bucket distribution, and for this particular metric, le < 1 would not make sense because attempts start at 1.

Thanks very much for the quick reply.