I am trying to create a metric to detect non-determinism errors, but there are several candidate metrics (workflow_task_execution_failed, workflow_failed, and workflow_task_queue_poll_failed). I am not sure which one I should use, and I can’t find any details about these metrics anywhere.
To test this, I created a workflow and then changed its history to put the workflow in a non-deterministic state. When I checked the metrics, “workflow_task_queue_poll_failed” had increased, but I couldn’t find “workflow_task_execution_failed” in the metrics at all. At the same time, “workflow_task_queue_poll_failed” does not appear in the documentation.
Are there some details about these metrics somewhere indicating the purpose of each one?
The docs have a page here, if it helps, but as you said, it might not be up to date with all SDK metrics and their properties. We will work on updating it.
Thank you for your response, but I can’t see this metric at all in the emitted metrics. Any idea why?
Do I need to be on a specific version of Temporal to start seeing it?
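For anyone debugging the same thing, one way to confirm which of these metrics a worker actually exposes is to scan its metrics output for the three names. The following is only a minimal sketch, assuming the metrics are scraped in Prometheus text format. The function name `summarize_metrics` and the sample text are hypothetical and invented for illustration. Names are matched as substrings because the exposed name may carry a prefix or suffix depending on the SDK and reporter configuration.

```python
import re
from collections import defaultdict

# Metric names discussed in this thread.
METRICS_OF_INTEREST = (
    "workflow_task_execution_failed",
    "workflow_failed",
    "workflow_task_queue_poll_failed",
)

# Matches a Prometheus sample line: name, optional {labels}, then the value.
SAMPLE_LINE = re.compile(r"^([A-Za-z_:][A-Za-z0-9_:]*)(\{.*\})?\s+(\S+)")


def summarize_metrics(prometheus_text, names=METRICS_OF_INTEREST):
    """Sum sample values per metric of interest across all label sets.

    Returns a dict mapping each name to its total, or None if no sample
    for that name appears in the text.
    """
    totals = defaultdict(float)
    for line in prometheus_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = SAMPLE_LINE.match(line)
        if not match:
            continue
        sample_name, value = match.group(1), match.group(3)
        for name in names:
            if name in sample_name:  # substring match tolerates prefixes/suffixes
                try:
                    totals[name] += float(value)
                except ValueError:
                    pass  # ignore samples with unparsable values
    return {name: totals.get(name) for name in names}


# Hypothetical scrape output, made up for this example (not real worker output).
SAMPLE_TEXT = """\
# TYPE temporal_workflow_task_queue_poll_failed_total counter
temporal_workflow_task_queue_poll_failed_total{task_queue="demo"} 3.0
temporal_workflow_task_queue_poll_failed_total{task_queue="other"} 1.0
temporal_workflow_failed_total{task_queue="demo"} 2.0
"""

if __name__ == "__main__":
    for name, total in summarize_metrics(SAMPLE_TEXT).items():
        print(name, "absent" if total is None else total)
```

A result of `None` only means no sample with that name was present in the scraped text. It does not by itself say whether the SDK version supports the metric or whether it simply has not been emitted yet.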