I understand how Prometheus and bucket metrics work. However, I would like to ask about this metric: task_attempt_bucket{component="history",instance="100.127.55.42:8000",le="0.001",namespace="all",operation="TimerActiveTaskDeleteHistoryEvent"} = 55
What is the bucket of this metric?
What is the unit of the label “le”?
And what is the unit of the metric value 55 in this case?
Thanks and Regards,
Yu Yang
what is the bucket of this metric
It's a histogram metric. Each task_attempt_bucket time series is one bucket, and the upper bound of that bucket is given by its “le” (less than or equal to) label.
what is the unit of label “le”
le is the “less than or equal to” bucket label. It is the inclusive upper bound of the bucket and can range from 0 to infinity (+Inf).
and what is the unit of the metric value 55 in this case
It is a running (cumulative) count of the task attempts that fall into this metric's histogram bucket with an upper bound of 0.001.
Typically you would use this histogram metric with the rate function over a set period of time, for example:
sum by (namespace, operation) (rate(task_attempt_bucket{le="+Inf"}[15m]))
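To make the cumulative part concrete, here is a minimal Python sketch of how observations end up in "le" buckets: each bucket counts every observation less than or equal to its upper bound, so the counts never decrease as "le" grows. The bounds and the observed values below are made up for illustration and are not the actual buckets of this metric.

```python
import math

# Hypothetical bucket upper bounds, chosen only for illustration.
BOUNDS = [1, 2, 5, 10, math.inf]


def cumulative_buckets(observations, bounds=BOUNDS):
    """Return {upper_bound: number of observations <= upper_bound}."""
    return {b: sum(1 for v in observations if v <= b) for b in bounds}


# Made-up observed values (not real data from the cluster).
observed = [1, 1, 2, 3, 7]
buckets = cumulative_buckets(observed)

for bound, count in buckets.items():
    print(f'le="{bound}" -> {count}')
```

Note that the last bucket (infinity) always equals the total number of observations, which is why the query above uses le="+Inf".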
Hi tihomir, many thanks for the reply. My question is about the semantic meaning of this metric. I would like to understand the unit of the label “le”: for le="0.001" or le="50", what is the unit of the numbers 0.001 and 50? Is it seconds, a count, etc.?
This histogram metric measures the distribution of task_attempt values.
The unit of “le” is a count (number of attempts), so for a similar example:
task_attempt_bucket{namespace="default",operation="TimerActiveTaskActivityTimeout",le="1000"} 82
means that there are 82 records with task_attempt count <= 1000
If you want the total count, you can get it from task_attempt_count, which should have the same value as the task_attempt_bucket{..., le="+Inf"} bucket.
I believe it uses the default Prometheus bucket distribution, and for this particular metric a bucket with le < 1 would not make sense because attempts start at 1.
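If it helps, here is a small Python sketch of how the cumulative bucket values can be turned into per-range counts, and how the le="+Inf" bucket relates to the total. The sample values are invented for illustration (only the 82 at le="1000" mirrors the example above), and per_bucket_counts is a hypothetical helper, not something provided by Prometheus or Temporal.

```python
import math


def per_bucket_counts(cumulative):
    """Convert cumulative {le: count} samples into counts per (prev_le, le] range."""
    result = {}
    prev_bound, prev_count = -math.inf, 0
    for bound in sorted(cumulative):
        # Subtract the previous cumulative count to get this range's own count.
        result[(prev_bound, bound)] = cumulative[bound] - prev_count
        prev_bound, prev_count = bound, cumulative[bound]
    return result


# Invented cumulative samples: le -> number of records with task_attempt <= le.
sample = {1: 60, 2: 75, 1000: 82, math.inf: 82}

ranges = per_bucket_counts(sample)
total = sample[math.inf]  # expected to match task_attempt_count

for (low, high), count in ranges.items():
    print(f"{low} < task_attempt <= {high}: {count}")
```

Under these assumed numbers, 60 records completed on the first attempt, 15 needed exactly two attempts, and the per-range counts add back up to the le="+Inf" total.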
Thanks very much for the quick reply.