Workflow task retry seems dangerous and hard to understand; what are the best practices for applications?

maxim · July 20, 2022, 4:58pm

I have always thought the first priority is to stop the broken thing and resume only after a human has repaired it.

This is exactly the intent. The workflow doesn’t fail, but it is blocked until the bug is resolved. I agree that the original implementation of retrying the workflow task without backoff is not perfect.

Release v1.17.0 added backoff logic to workflow task retries. See pull request #2765.
We will add the ability to automatically suspend a workflow execution after a certain number of workflow task retries. A batch command to unsuspend workflows will also be supported.
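
To make the idea of backing off between workflow task retries concrete, here is a minimal Java sketch of capped exponential backoff. It is not the server's implementation: the class name, the 1 second initial interval, the coefficient of 2, and the 10 minute cap are all assumptions chosen for illustration.

```java
import java.time.Duration;

// Illustrative sketch only; none of these names or values come from the Temporal server.
final class WorkflowTaskBackoffSketch {
    // Hypothetical parameters, chosen for illustration.
    static final Duration INITIAL = Duration.ofSeconds(1);
    static final double COEFFICIENT = 2.0;
    static final Duration MAX = Duration.ofMinutes(10);

    /** Delay before retry number {@code attempt} (1-based): grows geometrically, capped at MAX. */
    static Duration delayBeforeAttempt(int attempt) {
        if (attempt < 1) {
            throw new IllegalArgumentException("attempt must be >= 1");
        }
        double millis = INITIAL.toMillis() * Math.pow(COEFFICIENT, attempt - 1);
        // Compare as doubles first so very large attempts cannot overflow the long cast.
        return millis >= MAX.toMillis() ? MAX : Duration.ofMillis((long) millis);
    }

    public static void main(String[] args) {
        for (int attempt = 1; attempt <= 12; attempt++) {
            System.out.println("retry " + attempt + " after " + delayBeforeAttempt(attempt));
        }
    }
}
```

With a cap in place, a workflow blocked by a bug keeps retrying at a bounded, low rate instead of in a tight loop, which is the practical difference between retrying with and without backoff.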

Note that you can configure your workflows to fail on any specific exception type. If you specify Throwable as that exception type, the workflow will fail on any unhandled exception.
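
In the Java SDK this setting is exposed as `setFailWorkflowExceptionTypes` on the `WorkflowImplementationOptions` builder. The self-contained sketch below does not call the SDK; it only models the decision described above, and it deliberately ignores any SDK-specific default rules. The class name `FailureTypePolicySketch` and the `Outcome` enum are hypothetical names introduced for illustration.

```java
import java.util.List;

// Simplified model of "fail the workflow on configured exception types".
// Hypothetical names; this is not the SDK's implementation.
final class FailureTypePolicySketch {
    enum Outcome { FAIL_WORKFLOW, RETRY_WORKFLOW_TASK }

    private final List<Class<? extends Throwable>> failTypes;

    @SafeVarargs
    FailureTypePolicySketch(Class<? extends Throwable>... failTypes) {
        this.failTypes = List.of(failTypes);
    }

    /** An unhandled exception fails the workflow only if it is an instance of a configured type. */
    Outcome classify(Throwable unhandled) {
        for (Class<? extends Throwable> type : failTypes) {
            if (type.isInstance(unhandled)) {
                return Outcome.FAIL_WORKFLOW;
            }
        }
        // Otherwise the workflow stays open and the workflow task is retried.
        return Outcome.RETRY_WORKFLOW_TASK;
    }

    public static void main(String[] args) {
        FailureTypePolicySketch failOnAny = new FailureTypePolicySketch(Throwable.class);
        FailureTypePolicySketch failOnSome = new FailureTypePolicySketch(IllegalStateException.class);
        Throwable bug = new NullPointerException("bug in workflow code");
        System.out.println("Throwable configured: " + failOnAny.classify(bug));
        System.out.println("IllegalStateException configured: " + failOnSome.classify(bug));
    }
}
```

Because every exception is an instance of Throwable, configuring Throwable makes every unhandled exception fail the workflow, whereas a narrower type leaves other exceptions on the retry path, where the workflow stays blocked until the bug is fixed.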

And the fact that the cloud doesn't support batch termination makes things worse for our system.

Batch will be supported in the cloud later this year.
