Redriving failed workflows from the point of failure

Does Temporal provide support for redriving a failed workflow from the point where it failed with the same state it had? We are using Go, if that matters.

My hypothetical scenario is this: we have a workflow with 10 activities, and it gets through 5 activities just fine. But it fails the 6th activity (including however many retries we allow). We later go and debug whatever caused the issue in activity 6. Let’s say it turns out we had an outage downstream with whatever API activity 6 was calling. Hours, or maybe days, after the workflow failure, we fix the issue and would like to create a new workflow starting from activity 6 with the state data that it had after activity 5, before activity 6 – because the first 5 activities took 2+ hours to run. I think the workflow execution history is stored, so we can get our hands on the data. We might want to do it manually, or we might want to be able to write a script that gets all the failed workflows and redrives them all.

We could implement this functionality ourselves. We have the necessary abstractions in place to build a separate, but coupled, “redrive” workflow that steps directly into whatever activity we want, and that we could run either manually or programmatically (if a batch of workflows failed for the same reason). But I’d only want to implement this if I’m not reinventing the wheel.

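Yes, you can do this with the reset operation, which restarts the workflow from an earlier point in its recorded history, so the results of the activities that already completed are reused. However, we recommend not letting activities fail in the first place: don’t limit the retry duration (or the number of attempts), so that activity 6 simply keeps retrying until the downstream issue is fixed. That way you don’t need to do any manual operations after the fix.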