Hello all,
We ran into a case where the workflow run_id shown in the UI did not match the one in current_executions, which caused every command we issued from the UI to fail.
Summary of the problem
The image above depicts a workflow execution with a run_id of 4fb406f8-0594-465c-b88c-ba721d7d6335, but when we click reset, the command is issued against the run_id fe5a8691-08a4-41b5-b669-2fa28661aeb4 and fails.
The run_id 4fb406f8-0594-465c-b88c-ba721d7d6335 matches the row in the executions table, whereas the run_id fe5a8691-08a4-41b5-b669-2fa28661aeb4 matches the row in the current_executions table.
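For anyone who wants to check for the same condition, here is a minimal sketch of the consistency check described above. It uses an in-memory SQLite database with a deliberately simplified, hypothetical two-column version of the executions and current_executions tables; the real schema has more columns and may key rows differently, and the workflow IDs and function names are made up for illustration.

```python
import sqlite3


def build_demo_db():
    """Create a simplified, hypothetical stand-in for the two tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE executions (
            workflow_id TEXT NOT NULL,
            run_id      TEXT NOT NULL,
            PRIMARY KEY (workflow_id, run_id)
        );
        CREATE TABLE current_executions (
            workflow_id TEXT PRIMARY KEY,
            run_id      TEXT NOT NULL
        );
        """
    )
    # Mismatched workflow: the two tables disagree on the run_id.
    conn.execute(
        "INSERT INTO executions VALUES (?, ?)",
        ("demo-workflow", "4fb406f8-0594-465c-b88c-ba721d7d6335"),
    )
    conn.execute(
        "INSERT INTO current_executions VALUES (?, ?)",
        ("demo-workflow", "fe5a8691-08a4-41b5-b669-2fa28661aeb4"),
    )
    # Healthy workflow (made-up IDs): both tables agree.
    conn.execute("INSERT INTO executions VALUES (?, ?)", ("healthy-workflow", "run-1"))
    conn.execute("INSERT INTO current_executions VALUES (?, ?)", ("healthy-workflow", "run-1"))
    return conn


def find_dangling_current_rows(conn):
    """Return (workflow_id, run_id) pairs from current_executions whose
    run_id has no matching row in executions."""
    query = """
        SELECT ce.workflow_id, ce.run_id
        FROM current_executions AS ce
        LEFT JOIN executions AS e
               ON e.workflow_id = ce.workflow_id
              AND e.run_id = ce.run_id
        WHERE e.run_id IS NULL
        ORDER BY ce.workflow_id
    """
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    for workflow_id, run_id in find_dangling_current_rows(build_demo_db()):
        print(workflow_id, run_id)
```

The LEFT JOIN keeps every current_executions row and leaves the executions side NULL when no matching run exists, so the WHERE clause selects exactly the rows that point at a missing execution.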
I’ve also attempted to execute terminate via tctl. The results were the same:
Error: Terminate workflow failed.
Error Details: Workflow executionsRow not found. RunId: fe5a8691-08a4-41b5-b669-2fa28661aeb4
Because the run_id 4fb406f8-0594-465c-b88c-ba721d7d6335 is not tracked in current_executions, the execution is effectively orphaned and is not making progress. What could cause this to happen?
Expected behavior
- The workflow continues to make progress
- Commands to cancel, terminate, or reset the workflow execution succeed
Thank you!