Failed to activate workflow - Script execution timed out after 5000ms

I am running into this error inside a workflow.

2024-02-20T12:38:05.356Z [ERROR] Failed to activate workflow {
  namespace: 'stakex-1.aicyu',
  taskQueue: 'moonbeam:staking-queue',
  workflowId: 'StakeBlockWorkflow-Bt5tyQKtAjHFGIHyMpggK',
  runId: '226a14cd-2bfc-45de-83eb-e64a90511d6b',
  workflowType: 'stakeBlockWorkflow',
  error: Error: Script execution timed out after 5000ms
      at Script.runInContext (node:vm:134:12)
      at Object.runInContext (node:vm:282:6)
      at Proxy.<anonymous> (/home/ubuntu/stakexyz/stakexyz-staking/node_modules/@temporalio/worker/lib/workflow/reusable-vm.js:94:50)
      at ReusableVMWorkflow.activate (/home/ubuntu/stakexyz/stakexyz-staking/node_modules/@temporalio/worker/lib/workflow/vm-shared.js:313:33)
      at handleRequest (/home/ubuntu/stakexyz/stakexyz-staking/node_modules/@temporalio/worker/lib/workflow/workflow-worker-thread.js:50:47)
      at MessagePort.<anonymous> (/home/ubuntu/stakexyz/stakexyz-staking/node_modules/@temporalio/worker/lib/workflow/workflow-worker-thread.js:92:38)
      at [nodejs.internal.kHybridDispatch] (node:internal/event_target:786:20)
      at exports.emitMessage (node:internal/per_context/messageport:23:28),
  workflowExists: true

After the error happens, I keep getting the warning below until I restart the workflow, at which point it is able to resume.

2024-02-20T12:45:16.685015Z WARN temporal_sdk_core::worker::workflow: Task not found when completing error=status: NotFound, message: "Workflow task not found.", details: [8, 5, 18, 24, 87, 111, 114, 107, 102, 108, 111, 119, 32, 116, 97, 115, 107, 32, 110, 111, 116, 32, 102, 111, 117, 110, 100, 46, 26, 66, 10, 64, 116, 121, 112, 101, 46, 103, 111, 111, 103, 108, 101, 97, 112, 105, 115, 46, 99, 111, 109, 47, 116, 101, 109, 112, 111, 114, 97, 108, 46, 97, 112, 105, 46, 101, 114, 114, 111, 114, 100, 101, 116, 97, 105, 108, 115, 46, 118, 49, 46, 78, 111, 116, 70, 111, 117, 110, 100, 70, 97, 105, 108, 117, 114, 101], metadata: MetadataMap { headers: {"content-type": "application/grpc", "server": "temporal", "date": "Tue, 20 Feb 2024 12:45:16 GMT"} } run_id="226a14cd-2bfc-45de-83eb-e64a90511d6b"

The original error does come after some “heavy” computation, so this might have to do with Temporal checking if the code has deadlocked (it hasn’t).

Can you confirm if this is indeed the reason? Can I increase the 5s limit?

This is indeed the deadlock detector being triggered.
There is no way to increase it. You should move the computation-intensive code to an activity, as workflow tasks should complete quickly (the default and recommended time per task is 10 seconds).

Thanks for the clarification. It is not possible to move all the data to an activity within the time constraints that we have (this is a blockchain indexing app). Even if we did, we would hit the maximum workflow history size too quickly, because the event history would explode.

One hacky solution is to call sleep(1) every N iterations inside the for-loop that does the time-consuming computation in the workflow.

How about this:

    const c = setInterval( async () => {
      await sleep(1)
    }, 4000);

    // time-consuming for-loop
    for () {
      // yield so that setInterval can run
      setTimeout(() => {}, 0);
    }

    clearInterval(c);

…on second thought, this does not seem deterministic because it depends on CPU speed.

I highly recommend restructuring your workflow in a way that would support this.
Consider using a local activity if you’re concerned about latency but note that the input and result would need to be recorded in workflow history.

You might also be better off using a long running, heartbeating activity instead of a workflow.
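As a rough illustration of the heartbeating-activity idea, here is a minimal sketch. The names `indexBlocks`, `resumeFrom`, and `heartbeatEvery` are hypothetical and not from your code. The `heartbeat` callback is injected so the snippet runs on its own; in a real activity you would presumably pass `Context.current().heartbeat` from `@temporalio/activity` and read the resume point from `Context.current().info.heartbeatDetails`.

```typescript
// Hypothetical sketch of a long-running activity that reports progress.
// `heartbeat` stands in for Context.current().heartbeat from '@temporalio/activity'.
type Heartbeat = (nextIndex: number) => void;

export async function indexBlocks(
  blocks: readonly number[],
  resumeFrom: number, // index to start at; 0 on the first attempt
  heartbeat: Heartbeat,
  heartbeatEvery = 100,
): Promise<number> {
  if (!Number.isInteger(heartbeatEvery) || heartbeatEvery < 1) {
    throw new RangeError('heartbeatEvery must be a positive integer');
  }
  let processed = 0;
  for (let i = resumeFrom; i < blocks.length; i++) {
    processed += 1; // stand-in for the real per-block indexing work
    // Report the next index to process so a retried attempt can resume there.
    if ((i + 1) % heartbeatEvery === 0) {
      heartbeat(i + 1);
    }
  }
  return processed;
}
```

With this shape, a retry after a worker crash can pick up from the last heartbeated index instead of starting over, and the heavy loop never runs inside workflow code.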

I wouldn’t worry about deterministic execution in the TypeScript SDK sandbox; you can’t really break it without modifying workflow code.

The example you posted won’t work. To actually yield, you’ll want to put an await point in the loop.
You also don’t need the setInterval call; using await sleep(0) within the loop will get you what you want.
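A minimal sketch of yielding every N iterations rather than on every pass. The names `sumWithYields`, `yieldEvery`, and `pause` are hypothetical. `pause` is injected so the snippet runs outside a workflow; inside workflow code you would presumably pass something like `() => sleep(1)` using `sleep` from `@temporalio/workflow`.

```typescript
// Hypothetical helper: `pause` stands in for a workflow timer such as
// () => sleep(1) from '@temporalio/workflow'.
type Pause = () => Promise<void>;

export async function sumWithYields(
  values: readonly number[],
  yieldEvery: number,
  pause: Pause,
): Promise<{ total: number; pauses: number }> {
  if (!Number.isInteger(yieldEvery) || yieldEvery < 1) {
    throw new RangeError('yieldEvery must be a positive integer');
  }
  let total = 0;
  let pauses = 0;
  for (let i = 0; i < values.length; i++) {
    total += values[i]; // stand-in for the expensive per-item computation
    // Await after every `yieldEvery` items, but not after the final item.
    if ((i + 1) % yieldEvery === 0 && i + 1 < values.length) {
      await pause();
      pauses++;
    }
  }
  return { total, pauses };
}
```

Counting iterations keeps the yield points independent of CPU speed. Each timer is recorded in workflow history, so N should be as large as you can make it while keeping every stretch between awaits comfortably under the 5000ms limit.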

Yep, I was calling sleep() in the loop, but I was wondering if I could do it on a timer to avoid calling it more often than necessary.

Is there any way to “catch” the “Failed to activate workflow” error and exit the worker process?

Note that this would lead to very long recovery (replay) times for the workflow, as all the CPU-intensive logic is re-executed.
