Throwing Exception vs Failure in workflow

Hi all,

I have a question about the Failure hierarchy and what the best practice is for handling workflow errors. Are workflows supposed to throw only Failures, or is it acceptable to throw an exception (e.g. IllegalArgumentException) from workflow code?

I ask because in the unit tests, throwing a Failure works fine, but throwing an exception puts the unit test into a loop of retries with the message below. This example is from WorkflowTestingTest.testFailure(), altered to throw IllegalArgumentException instead of ApplicationFailure. The message is printed repeatedly, every few seconds.

Thanks!
Scott

	09:15:43.431 [Workflow Executor taskQueue="test-workflow", namespace="default": 3] ERROR i.t.internal.worker.PollerOptions - uncaught exception
java.lang.RuntimeException: Failure processing workflow task. WorkflowId=c5ac197d-598b-4e8e-98f3-9aa6019380cd, RunId=d8585e55-106d-43fe-b5d4-fc436992f71f
	at io.temporal.internal.worker.WorkflowWorker$TaskHandlerImpl.wrapFailure(WorkflowWorker.java:337)
	at io.temporal.internal.worker.WorkflowWorker$TaskHandlerImpl.wrapFailure(WorkflowWorker.java:275)
	at io.temporal.internal.worker.PollTaskExecutor.lambda$process$0(PollTaskExecutor.java:79)
	at io.temporal.internal.worker.PollTaskExecutor$$Lambda$76/0000000000000000.run(Unknown Source)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:823)
Caused by: io.temporal.internal.replay.InternalWorkflowTaskException: Failure handling event 17 of 'EVENT_TYPE_WORKFLOW_TASK_STARTED' type. IsReplaying=false, PreviousStartedEventId=17, workflowTaskStartedEventId=17, Currently Processing StartedEventId=17
	at io.temporal.internal.statemachines.WorkflowStateMachines.handleEvent(WorkflowStateMachines.java:193)
	at io.temporal.internal.replay.ReplayWorkflowRunTaskHandler.handleEvent(ReplayWorkflowRunTaskHandler.java:140)
	at io.temporal.internal.replay.ReplayWorkflowRunTaskHandler.handleWorkflowTaskImpl(ReplayWorkflowRunTaskHandler.java:180)
	at io.temporal.internal.replay.ReplayWorkflowRunTaskHandler.handleWorkflowTask(ReplayWorkflowRunTaskHandler.java:150)
	at io.temporal.internal.replay.ReplayWorkflowTaskHandler.handleWorkflowTaskWithEmbeddedQuery(ReplayWorkflowTaskHandler.java:199)
	at io.temporal.internal.replay.ReplayWorkflowTaskHandler.handleWorkflowTask(ReplayWorkflowTaskHandler.java:114)
	at io.temporal.internal.worker.WorkflowWorker$TaskHandlerImpl.handle(WorkflowWorker.java:309)
	at io.temporal.internal.worker.WorkflowWorker$TaskHandlerImpl.handle(WorkflowWorker.java:275)
	at io.temporal.internal.worker.PollTaskExecutor.lambda$process$0(PollTaskExecutor.java:73)
	... 4 common frames omitted
Caused by: java.lang.RuntimeException: WorkflowTask: failure executing SCHEDULED->WORKFLOW_TASK_STARTED, transition history is [CREATED->WORKFLOW_TASK_SCHEDULED]
	at io.temporal.internal.statemachines.StateMachine.executeTransition(StateMachine.java:140)
	at io.temporal.internal.statemachines.StateMachine.handleHistoryEvent(StateMachine.java:91)
	at io.temporal.internal.statemachines.EntityStateMachineBase.handleEvent(EntityStateMachineBase.java:63)
	at io.temporal.internal.statemachines.WorkflowStateMachines.handleEventImpl(WorkflowStateMachines.java:210)
	at io.temporal.internal.statemachines.WorkflowStateMachines.handleEvent(WorkflowStateMachines.java:178)
	... 12 common frames omitted
Caused by: java.lang.IllegalArgumentException: test
	at io.temporal.internal.testing.WorkflowTestingTest$FailingWorkflowImpl.workflow1(WorkflowTestingTest.java:135)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at io.temporal.internal.sync.POJOWorkflowImplementationFactory$POJOWorkflowImplementation$RootWorkflowInboundCallsInterceptor.execute(POJOWorkflowImplementationFactory.java:289)
	at io.temporal.internal.sync.POJOWorkflowImplementationFactory$POJOWorkflowImplementation.execute(POJOWorkflowImplementationFactory.java:253)
	at io.temporal.internal.sync.WorkflowExecuteRunnable.run(WorkflowExecuteRunnable.java:52)
	at io.temporal.internal.sync.SyncWorkflow.lambda$start$0(SyncWorkflow.java:121)
	at io.temporal.internal.sync.SyncWorkflow$$Lambda$93/0000000000000000.run(Unknown Source)
	at io.temporal.internal.sync.CancellationScopeImpl.run(CancellationScopeImpl.java:104)
	at io.temporal.internal.sync.WorkflowThreadImpl$RunnableWrapper.run(WorkflowThreadImpl.java:111)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	... 3 common frames omitted

By default, an unexpected exception that doesn't extend TemporalFailure does not fail the workflow. Instead, the workflow task fails and is retried periodically, so the workflow stays blocked until the code is fixed. This is done to avoid failing workflows because of unexpected bugs such as an NPE. In most cases, users do not want a few million workflows to fail and require manual intervention because of a silly NPE bug introduced by a new deployment.
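To make the rule concrete, here is a minimal plain-Java sketch of the decision described above. It is a simplified model, not the SDK's actual code: TemporalFailureStandIn, Outcome and FailurePolicySketch are hypothetical names, and the stand-in class only plays the role of TemporalFailure so the example compiles without the SDK.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for the SDK's TemporalFailure base class.
class TemporalFailureStandIn extends RuntimeException {
    TemporalFailureStandIn(String message) {
        super(message);
    }
}

// The two outcomes described in the answer above.
enum Outcome { FAIL_WORKFLOW, RETRY_WORKFLOW_TASK }

class FailurePolicySketch {
    // Exception types the user has explicitly opted in to failing the workflow.
    private final List<Class<? extends Throwable>> failTypes;

    FailurePolicySketch(List<Class<? extends Throwable>> failTypes) {
        this.failTypes = failTypes;
    }

    Outcome classify(Throwable thrown) {
        // Failures from the Temporal hierarchy always fail the workflow.
        if (thrown instanceof TemporalFailureStandIn) {
            return Outcome.FAIL_WORKFLOW;
        }
        // Any other exception fails the workflow only if its type was opted in.
        for (Class<? extends Throwable> type : failTypes) {
            if (type.isInstance(thrown)) {
                return Outcome.FAIL_WORKFLOW;
            }
        }
        // Otherwise it is treated as a bug: the task is retried until a fix is deployed.
        return Outcome.RETRY_WORKFLOW_TASK;
    }

    public static void main(String[] args) {
        FailurePolicySketch defaults = new FailurePolicySketch(Collections.emptyList());
        // prints RETRY_WORKFLOW_TASK
        System.out.println(defaults.classify(new IllegalArgumentException("test")));
        // prints FAIL_WORKFLOW
        System.out.println(defaults.classify(new TemporalFailureStandIn("test")));

        FailurePolicySketch optedIn =
            new FailurePolicySketch(Arrays.asList(IllegalArgumentException.class));
        // prints FAIL_WORKFLOW
        System.out.println(optedIn.classify(new IllegalArgumentException("test")));
    }
}
```

This matches what Scott saw: with the default (empty) opt-in list, IllegalArgumentException lands in the retry branch, which is why the message repeats every few seconds.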

If you do want the workflow to fail on a specific exception, add it to the exception types configured through WorkflowImplementationOptions.Builder.setFailWorkflowExceptionTypes. For example, to fail the workflow on any exception (which we very rarely recommend in production), set it to Throwable. The WorkflowImplementationOptions are passed to the Worker.registerWorkflowImplementationTypes call.
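As a rough sketch of that configuration, assuming a recent Java SDK where the builder method for this option is setFailWorkflowExceptionTypes: the FailingWorkflow interface, its implementation and the FailOnIllegalArgument wrapper class are hypothetical names modeled on the test in the question, and the Worker is assumed to come from your own WorkerFactory or TestWorkflowEnvironment.

```java
import io.temporal.worker.Worker;
import io.temporal.worker.WorkflowImplementationOptions;
import io.temporal.workflow.WorkflowInterface;
import io.temporal.workflow.WorkflowMethod;

// Hypothetical wrapper class; requires the Temporal Java SDK on the classpath.
public class FailOnIllegalArgument {

    // Hypothetical workflow, modeled on the one in the question.
    @WorkflowInterface
    public interface FailingWorkflow {
        @WorkflowMethod
        String workflow1(String input);
    }

    public static class FailingWorkflowImpl implements FailingWorkflow {
        @Override
        public String workflow1(String input) {
            // Not a TemporalFailure, so by default this would only fail the workflow task.
            throw new IllegalArgumentException("test");
        }
    }

    // Registers the workflow so that IllegalArgumentException fails the workflow itself.
    public static void register(Worker worker) {
        WorkflowImplementationOptions options =
            WorkflowImplementationOptions.newBuilder()
                // Use Throwable.class here to fail on any exception (rarely recommended).
                .setFailWorkflowExceptionTypes(IllegalArgumentException.class)
                .build();
        worker.registerWorkflowImplementationTypes(options, FailingWorkflowImpl.class);
    }
}
```

In a unit test, the worker returned by TestWorkflowEnvironment.newWorker(taskQueue) can be passed to register(...) before the environment is started. Listing only the exception types you expect keeps the protection against accidental bugs such as an NPE.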