Hi Team,
Before execution starts on the Temporal server side, our Java SDK may encounter exceptions. We need a comprehensive list of these exceptions so that we can decide, taking our business logic into account, which of them are eligible for SDK-side retries. Any additional information would be highly appreciated.
So far, we have observed three main exceptions:
- WorkflowExecutionAlreadyStarted: should not be retried (based on our business logic).
- WorkflowNotFoundException: should not be retried (also based on our business logic).
- io.grpc.StatusRuntimeException: This exception is more generic and covers various scenarios with different messages. We have observed the following categories so far:
(1) DEADLINE_EXCEEDED: deadline exceeded after 9.999961238 seconds.
(2) NOT_FOUND: Namespace “xxx” is not found.
(3) INVALID_ARGUMENT: Namespace “xxx” has no mapping defined for the search attribute “xxx”.
(4) RESOURCE_EXHAUSTED: namespace rate limit exceeded. → this situation is eligible for retry attempts.
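
To make the first question below concrete, here is a minimal sketch of the kind of status-code-based classification we have in mind. `RetryClassifier` and `Decision` are hypothetical names of our own, not part of the Temporal Java SDK. The input is the gRPC status code name, which we would expect to obtain from `StatusRuntimeException#getStatus().getCode().name()`. Only the RESOURCE_EXHAUSTED entry reflects something we have already decided; the other entries are assumptions pending an answer.

```java
import java.util.Map;

// Hypothetical helper, not part of the Temporal Java SDK.
final class RetryClassifier {

    enum Decision { RETRY, DO_NOT_RETRY, UNDECIDED }

    // Keys are gRPC status code names as observed in our logs.
    private static final Map<String, Decision> BY_CODE = Map.of(
        // Namespace rate limit exceeded: we consider this eligible for retry.
        "RESOURCE_EXHAUSTED", Decision.RETRY,
        // Assumption: a missing namespace will not fix itself between attempts.
        "NOT_FOUND", Decision.DO_NOT_RETRY,
        // Assumption: an unmapped search attribute is a configuration error.
        "INVALID_ARGUMENT", Decision.DO_NOT_RETRY,
        // Open question: we have not decided how to treat timeouts.
        "DEADLINE_EXCEEDED", Decision.UNDECIDED
    );

    // Codes we have not seen yet fall back to UNDECIDED rather than RETRY,
    // so that an unknown failure is never retried by accident.
    static Decision classify(String grpcCodeName) {
        return BY_CODE.getOrDefault(grpcCodeName, Decision.UNDECIDED);
    }
}
```

With this sketch, `RetryClassifier.classify("RESOURCE_EXHAUSTED")` yields `RETRY`, while a code that is not in the table, such as `UNAVAILABLE`, yields `UNDECIDED`.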
Questions:
- How does Temporal map internal exceptions to gRPC exceptions? Is it fair to use the status code (NOT_FOUND / RESOURCE_EXHAUSTED) of the gRPC exception to determine whether a call should be retried on the SDK side?
- Is there a comprehensive list of exceptions that I can take a closer look at? Thanks!