Application error handling

I was under the impression that “NewNonRetryableApplicationError” means a non-retryable error has occurred, so the workflow should fail completely. Instead, I see a new workflow started as “ContinuedAsNew”.

Given a workflow with activity A, B, C, D, E, F - any activity that returns a nonretryable error should cause the workflow to process the error and then hard fail.

The use case is financial transactions:

makeFinancialTransactionWithThirdPartyActivity := workflow.ExecuteActivity(ctx, a.FinancialTransactionActivity)
err = makeFinancialTransactionWithThirdPartyActivity.Get(ctx, &transaction)
if err != nil {
    cancelErr := workflow.ExecuteActivity(ctx, a.RevertFinancialTransactionInternalActivity).Get(ctx, nil)
    if cancelErr != nil {
        return nil, cancelErr
    }

    return nil, err
}

When FinancialTransactionActivity returns a hard error, we want to revert the transaction on our end with RevertFinancialTransactionInternalActivity.

Currently, when “NewNonRetryableApplicationError” is returned, the workflow is recreated.

What is the best way of handling real hard errors?

TL;DR: Errors returned from activities are wrapped. To ensure that the workflow fails without retrying, return a new non-retryable application error that wraps the one from ExecuteActivity:

makeFinancialTransactionWithThirdPartyActivity := workflow.ExecuteActivity(ctx, a.FinancialTransactionActivity)
err = makeFinancialTransactionWithThirdPartyActivity.Get(ctx, &transaction)
if err != nil {
    // Revert the internal transaction before failing the workflow.
    cancelErr := workflow.ExecuteActivity(ctx, a.RevertFinancialTransactionInternalActivity).Get(ctx, nil)
    if cancelErr != nil {
        return nil, cancelErr
    }
    // Wrap the activity error so the workflow itself fails as non-retryable.
    return nil, temporal.NewNonRetryableApplicationError("Failing workflow", "activityFailure", err)
}

Temporal Failure Chaining

Temporal chains failures as they propagate across process boundaries. This ensures that enough information is attached to a failure to track it to a specific activity or child workflow. Here is a modified HelloWorld example:

func Workflow(ctx workflow.Context, name string) (string, error) {
	ao := workflow.ActivityOptions{
		ScheduleToStartTimeout: time.Minute,
		StartToCloseTimeout:    time.Minute,
	}
	ctx = workflow.WithActivityOptions(ctx, ao)

	logger := workflow.GetLogger(ctx)
	logger.Info("HelloWorld workflow started", "name", name)

	var result string
	err := workflow.ExecuteActivity(ctx, Activity, name).Get(ctx, &result)
	if err != nil {
		logger.Error("Activity failed.", "Error", err)
		return "", err
	}

	logger.Info("HelloWorld workflow completed.", "result", result)

	return result, nil
}

func Activity(ctx context.Context, name string) (string, error) {
	return "", temporal.NewNonRetryableApplicationError("simulated failure", "application", nil)
}

Here is the output from logger.Error in the workflow code:

Error activity task error (scheduledEventID: 5, startedEventID: 6, identity: 66207@maxpro.local@): simulated failure

Note that it contains not only the original “simulated failure” message, but also information about the activity task that returned it.
Here is what the error returned to the workflow looks like in the debugger:
[Screenshot: Screen Shot 2020-10-07 at 1.54.17 PM]
It clearly shows that the original application error is wrapped in an ActivityError, which contains the activity-specific information.
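
To make the wrapping concrete without a debugger, here is a minimal standard-library sketch. The types activityError and applicationError and the function describe are hypothetical stand-ins written for this illustration; they are not the SDK's types, and they only model the "application error inside an activity error" shape described above.

```go
package main

import (
	"errors"
	"fmt"
)

// applicationError is a hypothetical stand-in for the error an activity returns.
type applicationError struct {
	msg          string
	nonRetryable bool
}

func (e *applicationError) Error() string { return e.msg }

// activityError is a hypothetical stand-in for the wrapper the workflow receives.
type activityError struct {
	activityType string
	cause        error
}

func (e *activityError) Error() string {
	return fmt.Sprintf("activity task error (type: %s): %v", e.activityType, e.cause)
}

// Unwrap exposes the cause so errors.As can walk the chain.
func (e *activityError) Unwrap() error { return e.cause }

// describe pulls the activity-specific and application-specific parts
// out of a chained error.
func describe(err error) (activity string, message string, nonRetryable bool) {
	var actErr *activityError
	if errors.As(err, &actErr) {
		activity = actErr.activityType
	}
	var appErr *applicationError
	if errors.As(err, &appErr) {
		message = appErr.msg
		nonRetryable = appErr.nonRetryable
	}
	return activity, message, nonRetryable
}

func main() {
	err := &activityError{
		activityType: "Activity",
		cause:        &applicationError{msg: "simulated failure", nonRetryable: true},
	}
	activity, message, nonRetryable := describe(err)
	fmt.Println(err)
	fmt.Println(activity, message, nonRetryable)
}
```

The outer error carries the activity details and the inner one carries the original message and the non-retryable flag, which is why both are visible in the log line above.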

Without chaining, code like

	var result string
	err := workflow.ExecuteActivity(ctx, Activity1, name).Get(ctx, &result)
	if err != nil {
		return "", err
	}
	err = workflow.ExecuteActivity(ctx, Activity2, name).Get(ctx, &result)
	if err != nil {
		return "", err
	}
	err = workflow.ExecuteActivity(ctx, Activity3, name).Get(ctx, &result)
	if err != nil {
		return "", err
	}

would fail the workflow without any indication of which particular activity caused the failure.

Chaining works across SDKs in different languages. It is possible to chain an exception from a Java activity, Go child workflow, and Java parent seamlessly.

Non-retryable error not failing workflow problem

Because the workflow returns an ActivityError instead of the original ApplicationError, the retry policy treats it as retryable. The solution is to return a new non-retryable ApplicationError that wraps the ActivityError.

makeFinancialTransactionWithThirdPartyActivity := workflow.ExecuteActivity(ctx, a.FinancialTransactionActivity)
err = makeFinancialTransactionWithThirdPartyActivity.Get(ctx, &transaction)
if err != nil {
    // Revert the internal transaction before failing the workflow.
    cancelErr := workflow.ExecuteActivity(ctx, a.RevertFinancialTransactionInternalActivity).Get(ctx, nil)
    if cancelErr != nil {
        return nil, cancelErr
    }
    // Wrap the activity error so the workflow itself fails as non-retryable.
    return nil, temporal.NewNonRetryableApplicationError("Failing workflow", "activityFailure", err)
}
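
The reasoning can be modelled with the standard library alone. This is only a sketch: it assumes, as described above, that the retry decision looks at the outermost error the workflow returns. The names applicationError, activityError, retriedByPolicy and failHard are hypothetical and are not SDK APIs.

```go
package main

import (
	"errors"
	"fmt"
)

// applicationError is a hypothetical stand-in for an application error.
type applicationError struct {
	msg          string
	nonRetryable bool
	cause        error
}

func (e *applicationError) Error() string { return e.msg }
func (e *applicationError) Unwrap() error { return e.cause }

// activityError is a hypothetical stand-in for the wrapper added around
// an activity failure.
type activityError struct{ cause error }

func (e *activityError) Error() string { return "activity task error: " + e.cause.Error() }
func (e *activityError) Unwrap() error { return e.cause }

// retriedByPolicy models the assumption that only the outermost error is
// inspected: anything that is not itself a non-retryable application
// error counts as retryable.
func retriedByPolicy(err error) bool {
	appErr, ok := err.(*applicationError)
	return !(ok && appErr.nonRetryable)
}

// failHard wraps any error in a new non-retryable application error,
// keeping the original chain as the cause.
func failHard(err error) error {
	return &applicationError{msg: "Failing workflow", nonRetryable: true, cause: err}
}

func main() {
	fromActivity := &activityError{
		cause: &applicationError{msg: "simulated failure", nonRetryable: true},
	}
	fmt.Println(retriedByPolicy(fromActivity)) // true: the outer type hides the flag

	wrapped := failHard(fromActivity)
	fmt.Println(retriedByPolicy(wrapped)) // false: the outer error is non-retryable

	var actErr *activityError
	fmt.Println(errors.As(wrapped, &actErr)) // true: activity details are still in the chain
}
```

Wrapping rather than replacing the error keeps the activity information available to whoever inspects the failed workflow.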

Thanks for the detailed explanation, as always.

I’ve refactored the code to:

makeFinancialTransactionWithThirdPartyActivity := workflow.ExecuteActivity(ctx, a.FinancialTransactionActivity)
err = makeFinancialTransactionWithThirdPartyActivity.Get(ctx, &transaction)
if err != nil {
    cancelErr := workflow.ExecuteActivity(ctx, a.RevertFinancialTransactionInternalActivity).Get(ctx, nil)
    if cancelErr != nil {
        var applicationErr *temporal.ApplicationError
        if errors.As(cancelErr, &applicationErr) {
            if applicationErr.NonRetryable() {
                return nil, temporal.NewNonRetryableApplicationError("Workflow failed", "RevertFinancialTransactionInternalActivity activityFailure", err)
            }
        }

        return nil, cancelErr
    }
    var applicationErr *temporal.ApplicationError
    if errors.As(err, &applicationErr) {
        if applicationErr.NonRetryable() {
            return nil, temporal.NewNonRetryableApplicationError("Workflow failed", "FinancialTransactionActivity activityFailure", err)
        }
    }

    return nil, err
}
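
The errors.As check appears twice above, so it could be pulled into a small helper. The sketch below uses only the standard library; appError is a hypothetical stand-in that exposes a NonRetryable() method like the one called in the code above, and isNonRetryable is a name made up for this example.

```go
package main

import (
	"errors"
	"fmt"
)

// nonRetryabler matches any error in the chain that exposes NonRetryable().
type nonRetryabler interface {
	NonRetryable() bool
}

// appError is a hypothetical stand-in for an application error.
type appError struct {
	msg          string
	nonRetryable bool
}

func (e *appError) Error() string      { return e.msg }
func (e *appError) NonRetryable() bool { return e.nonRetryable }

// isNonRetryable reports whether any error in the chain marks itself
// as non-retryable.
func isNonRetryable(err error) bool {
	var nr nonRetryabler
	return errors.As(err, &nr) && nr.NonRetryable()
}

func main() {
	hard := fmt.Errorf("activity task error: %w", &appError{msg: "simulated failure", nonRetryable: true})
	soft := fmt.Errorf("activity task error: %w", &appError{msg: "simulated failure", nonRetryable: false})

	fmt.Println(isNonRetryable(hard))                // true
	fmt.Println(isNonRetryable(soft))                // false
	fmt.Println(isNonRetryable(errors.New("plain"))) // false
}
```

With a helper like this, each branch in the workflow reduces to a single condition followed by the wrapping return.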