Dealing with multiple runaway Workflows

I have a Workflow called “processData”. While testing this Workflow, I unintentionally started a large number of Workflow Executions (hundreds to thousands) with a very high retry limit and no timeout. Because the inputs are bad and the Workflow can’t process them, each run fails, and given the retry and timeout configuration the Workflow Executions will keep running for a very long time.
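For illustration, the options were shaped roughly like the following sketch (the task queue name and the exact numbers are placeholders, not my real values):

```java
import io.temporal.client.WorkflowOptions;
import io.temporal.common.RetryOptions;

// A retry policy with an effectively unbounded number of attempts and no
// WorkflowExecutionTimeout / WorkflowRunTimeout: failed runs just keep being retried.
WorkflowOptions runawayOptions =
    WorkflowOptions.newBuilder()
        .setTaskQueue("process-data-task-queue") // placeholder task queue
        .setRetryOptions(
            RetryOptions.newBuilder()
                .setMaximumAttempts(1_000_000) // "very large retry number"
                .build())
        // No execution or run timeout is set, so there is no upper bound on how long this keeps going.
        .build();
```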

This problem leads me to two questions:

  1. Is there a way to simultaneously terminate all the Workflow Executions, perhaps with tctl?
  2. Can anyone offer guidance on retry and timeout configuration? Would any use cases require an infinite timeout?

Hello @llcooll

  1. Use tctl batch start to terminate your running Workflows (see “tctl batch start” in the Temporal documentation), e.g.:

tctl batch start --query "ExecutionStatus=\"Running\"" --reason "reason" --batch_type terminate

I would recommend testing the query first:

tctl workflow l --query "your query"

If you have advanced visibility enabled, you can run more complex queries using operators:

tctl workflow l --query "ExecutionStatus=\"Running\" AND WorkflowType=\"GreetingWorkflow\""
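If you would rather do this from code than from tctl, a rough sketch using the Java SDK’s raw service stubs could look like the following (the namespace, query, and connection settings are assumptions, and pagination of the list call is not handled here):

```java
import io.temporal.api.common.v1.WorkflowExecution;
import io.temporal.api.workflow.v1.WorkflowExecutionInfo;
import io.temporal.api.workflowservice.v1.ListWorkflowExecutionsRequest;
import io.temporal.api.workflowservice.v1.ListWorkflowExecutionsResponse;
import io.temporal.api.workflowservice.v1.TerminateWorkflowExecutionRequest;
import io.temporal.serviceclient.WorkflowServiceStubs;

public class TerminateRunaways {
  public static void main(String[] args) {
    // Connects to 127.0.0.1:7233 by default (assumption: a locally reachable frontend).
    WorkflowServiceStubs service = WorkflowServiceStubs.newLocalServiceStubs();

    // The same query you would verify with `tctl workflow list --query ...` first.
    ListWorkflowExecutionsRequest listRequest =
        ListWorkflowExecutionsRequest.newBuilder()
            .setNamespace("default")
            .setQuery("ExecutionStatus=\"Running\" AND WorkflowType=\"processData\"")
            .build();
    ListWorkflowExecutionsResponse listResponse =
        service.blockingStub().listWorkflowExecutions(listRequest);

    // Terminate each matching execution with a reason, mirroring the batch terminate job.
    for (WorkflowExecutionInfo info : listResponse.getExecutionsList()) {
      WorkflowExecution execution = info.getExecution();
      TerminateWorkflowExecutionRequest terminateRequest =
          TerminateWorkflowExecutionRequest.newBuilder()
              .setNamespace("default")
              .setWorkflowExecution(execution)
              .setReason("terminating runaway test executions")
              .build();
      service.blockingStub().terminateWorkflowExecution(terminateRequest);
    }
  }
}
```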

  2. You can set WorkflowRunTimeout in your WorkflowOptions to limit the duration of a single Workflow run; once that time is exceeded (Workflow Task retries included), the Workflow Execution fails with a timeout.
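For example, with the Java SDK a bounded configuration could look roughly like this (the task queue name, durations, and attempt count below are placeholders):

```java
import java.time.Duration;
import io.temporal.client.WorkflowOptions;
import io.temporal.common.RetryOptions;

WorkflowOptions boundedOptions =
    WorkflowOptions.newBuilder()
        .setTaskQueue("process-data-task-queue") // placeholder task queue
        // Caps the duration of a single Workflow run, including Workflow Task retries.
        .setWorkflowRunTimeout(Duration.ofMinutes(10))
        // Caps the total duration across retried runs, if a retry policy is set.
        .setWorkflowExecutionTimeout(Duration.ofHours(1))
        // Bound the number of retry attempts instead of leaving it effectively unlimited.
        .setRetryOptions(
            RetryOptions.newBuilder()
                .setMaximumAttempts(3)
                .build())
        .build();
```

With bounds like these, a misconfigured batch of Executions will fail or time out on its own instead of retrying indefinitely.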