I have spent a good part of the last few months learning about Temporal, and now that I’m planning to deploy it, I have some questions about retrying an activity. I read several topics without finding an answer, so here I am!
Let’s say I have a simple workflow that starts an Activity A.
This Activity A just calls an external HTTP API.
From what I understand from the different guidelines I read, the retry policy for Activity A should have no maximum number of retries, so if the external API is down, we just keep retrying until it is back up. This approach looks good to me.
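To check my understanding of what “no max retry” means in practice, here is a minimal Python sketch of the retry delays I expect. It is not Temporal code: the function name `retry_delay` is made up, and the numbers assume the default policy as I understand it (1 second initial interval, backoff coefficient 2.0, maximum interval of 100 times the initial interval, unlimited attempts).

```python
def retry_delay(retry_number: int,
                initial: float = 1.0,
                coefficient: float = 2.0,
                maximum: float = 100.0) -> float:
    """Seconds to wait before the given retry (1 = first retry).

    Hypothetical helper: models exponential backoff capped at `maximum`.
    The default values are assumptions about the default retry policy.
    """
    if retry_number < 1:
        raise ValueError("retry_number starts at 1")
    delay = initial
    for _ in range(retry_number - 1):
        delay *= coefficient
        if delay >= maximum:
            # Once the cap is reached, every later retry waits `maximum`.
            return maximum
    return min(delay, maximum)


if __name__ == "__main__":
    # With no maximum number of attempts, the delay grows and then stays capped.
    for n in (1, 2, 3, 8, 50):
        print(n, retry_delay(n))
```

If this is right, an activity hitting a dead API would settle into one attempt every 100 seconds, forever, which is exactly the situation I would like to be alerted about.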
But how can I be alerted if there are too many retries for the activity? Is this something I should handle on my own?
I want to easily know when the activity is failing and retrying indefinitely, so I can tell whether the external API is down, whether I made a mistake in the URL I’m calling, etc.
I’m looking for a way to list, for example, “all workflows where an activity has been retried more than 5 times”.
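If I had to handle this on my own, I imagine something like the sketch below. It is plain Python, not Temporal code, and every name in it (`RetryMonitor`, `run_with_unlimited_retries`, `make_flaky`, the threshold of 5) is hypothetical. In a real activity I assume the attempt number would come from the SDK (as far as I can tell, the Python SDK exposes it as `activity.info().attempt`) instead of a local counter.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class RetryMonitor:
    """Hypothetical helper: remembers workflows retried more than `threshold` times."""
    threshold: int = 5
    flagged: List[str] = field(default_factory=list)

    def record_attempt(self, workflow_id: str, attempt: int) -> bool:
        retries = attempt - 1  # attempt 1 is the first try, not a retry
        if retries > self.threshold:
            if workflow_id not in self.flagged:
                self.flagged.append(workflow_id)  # a real version would alert here
            return True
        return False


def run_with_unlimited_retries(workflow_id: str,
                               call: Callable[[], str],
                               monitor: RetryMonitor) -> str:
    """Retry `call` until it succeeds, reporting each attempt to the monitor."""
    attempt = 0
    while True:
        attempt += 1
        monitor.record_attempt(workflow_id, attempt)
        try:
            return call()
        except ConnectionError:
            continue  # backoff sleep omitted to keep the sketch short


def make_flaky(failures: int) -> Callable[[], str]:
    """Build a fake API call that fails `failures` times, then succeeds."""
    remaining = [failures]

    def call() -> str:
        if remaining[0] > 0:
            remaining[0] -= 1
            raise ConnectionError("external API is down")
        return "ok"

    return call


if __name__ == "__main__":
    monitor = RetryMonitor()
    run_with_unlimited_retries("wf-healthy", make_flaky(2), monitor)
    run_with_unlimited_retries("wf-struggling", make_flaky(6), monitor)
    print(monitor.flagged)
```

The activity still retries without a limit here; the monitor only records which workflows crossed the threshold, which is the list I am after.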
Setting the maximum number of retries for Activity A to 5 would cover this use case, because I could then easily list the failed workflows, but from what I read it does not seem to be the best approach.
The goal is, of course, to notice when something strange happens during an activity execution so that I can look for the root cause.