Hey there! I’m trying to better understand why my workflow is hitting a NonDeterministicWorkflowError and how I can update the definition to avoid it.
My workflow is defined using the coinbase temporal-ruby SDK.
The workflow code looks like this:
class RecoverWorkflow
  def execute(id, sharding_num)
    shard_num = sharding_num || 1
    status_futures = (0..shard_num - 1).map { |index| RecoverActivity.execute(id, index) }
    statuses = status_futures.map(&:get)
    all_success =
      statuses
        .map
        .with_index do |s, i|
          success = s == true
          $logger.info("recovery job #{id} failed on shard index #{i}: #{s}") unless success
          success
        end
        .all?
    RecoverTimestampActivity.execute(id, all_success).get
    return all_success
  end
end
Error:
"type": "Temporal::NonDeterministicWorkflowError",
"nonRetryable": false,
"details": {
  "payloads": [
    "Unexpected command. The replaying code is issuing: workflow (17), but the history of previous executions recorded: activity (17). Likely, either you have made a version-unsafe change to your workflow or have non-deterministic behavior in your workflow. See https://docs.temporal.io/docs/java/versioning/#introduction-to-versioning."
  ]
Workflow Execution History:
From the history, my understanding is that the first RecoverActivity execution errored out due to our StartToClose timeout of 10 minutes and succeeded on the second attempt.
Is the non-determinism here caused by the fact that the input to RecoverTimestampActivity can change depending on whether RecoverActivity succeeded, or by something else?
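For readers unfamiliar with this error, the check that the message describes can be pictured roughly as follows. This is a toy model in plain Ruby, not the temporal-ruby implementation: `replay_check`, `NonDeterminismError`, and the `[type, event_id]` pairs are hypothetical names made up for illustration, and the event ids other than 17 are invented. It only shows the idea that, on replay, each command the workflow code issues is compared with the command recorded at the same position in history.

```ruby
# Toy model of a replay check. Not temporal-ruby code; all names are hypothetical.
class NonDeterminismError < StandardError; end

# recorded: commands stored in history, as [type, event_id] pairs
# issued:   commands the workflow code produces when it is re-run
def replay_check(recorded, issued)
  issued.each_with_index do |(type, id), i|
    rec_type, rec_id = recorded[i]
    # Past the end of recorded history: this is a new command, nothing to compare.
    next if rec_type.nil?
    # Same command at the same position: the replay is consistent so far.
    next if rec_type == type && rec_id == id

    raise NonDeterminismError,
          "replaying code is issuing: #{type} (#{id}), " \
          "but history recorded: #{rec_type} (#{rec_id})"
  end
  true
end

recorded = [[:activity, 5], [:activity, 17]]
same     = [[:activity, 5], [:activity, 17]]
changed  = [[:activity, 5], [:workflow, 17]]

replay_check(recorded, same) # passes, returns true

begin
  replay_check(recorded, changed)
rescue NonDeterminismError => e
  puts e.message
end
```

In this model the mismatch is about the kind and position of the command, so the sketch by itself does not say which line of the workflow above produced the differing command.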