Failing a workflow on a critical handled error vs. completing the workflow with a status such as 'completed with errors'

I was under the impression that a failed workflow is something rare and unexpected, because a durable design should handle all cases, and therefore ‘complete’. Should a non-retryable handled error fail or complete the workflow, or is it a matter of preference, or use case?

Hi,

A workflow is expected to fail if there is an unhandled error (e.g. an activity with a limited number of retry attempts can't connect to the DB and errors out after all retries are exhausted), but it won't fail in the presence of intermittent errors (such as infrastructure problems or bugs in workflow code; again, this depends on the activity and workflow configuration). See the Temporal Failures reference in the Temporal Platform Documentation.

Should a non-retryable handled error fail or complete the workflow, or is it a matter of preference, or use case?

I guess the choice depends on your monitoring needs and business requirements. Some developers prefer to handle the activity error and return a (successfully) completed workflow that carries the error details; others let the workflow fail.
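To make the two options concrete, here is a rough sketch in plain Python. It deliberately does not use the Temporal SDK: `NonRetryableError`, `WorkflowResult`, `charge_card`, and both workflow functions are hypothetical names made up for illustration, and the "workflow" is just an ordinary function. It only shows the shape of the decision, i.e. catch the error and return a status, or let it propagate.

```python
from dataclasses import dataclass, field
from typing import List


class NonRetryableError(Exception):
    """Hypothetical stand-in for a business error that retrying cannot fix."""


@dataclass
class WorkflowResult:
    """Hypothetical result type carrying a status and any error details."""
    status: str
    errors: List[str] = field(default_factory=list)


def charge_card(order_id: str) -> None:
    # Hypothetical activity: always fails, to simulate a non-retryable error.
    raise NonRetryableError(f"card declined for order {order_id}")


def workflow_completes_with_errors(order_id: str) -> WorkflowResult:
    # Option 1: handle the error and finish normally, reporting it in the result.
    try:
        charge_card(order_id)
    except NonRetryableError as err:
        return WorkflowResult(status="completed_with_errors", errors=[str(err)])
    return WorkflowResult(status="completed")


def workflow_fails(order_id: str) -> WorkflowResult:
    # Option 2: do not catch the error; it propagates and the run ends as failed.
    charge_card(order_id)
    return WorkflowResult(status="completed")
```

With option 1, callers have to inspect the returned status to notice the problem; with option 2, the failure shows up wherever failed runs are already monitored. Which one fits better is the monitoring and business question mentioned above.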
