When does Temporal write the ActivityTaskStarted event into workflow history?

The ActivityTaskStarted event is written into workflow history only once your activity completes, or fails after exhausting all of its retries.

This is due to an optimization Temporal makes to avoid polluting the workflow execution history with activity retries.

Let’s say we had an activity that keeps failing and is being retried. Without this optimization our workflow execution history could look something like:

ActivityTaskScheduled
ActivityTaskStarted (attempt 1)
ActivityTaskFailed
ActivityTaskScheduled
ActivityTaskStarted (attempt 2)
ActivityTaskFailed
ActivityTaskScheduled
ActivityTaskStarted (attempt 3)
ActivityTaskFailed
...

Given that our execution history has a 50K event limit (if it is reached, the server terminates our execution), an activity that retries many times could cause this limit to be reached.
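To put rough numbers on this, here is a minimal sketch of the arithmetic. It is a toy model, not Temporal code: the class and method names are made up for illustration, it counts only the activity events shown above, and it ignores every other event a real history would contain.

```java
// Hypothetical model for illustration only; not part of any Temporal SDK.
class HistoryGrowth {
    // History limit quoted above (50K events).
    static final int HISTORY_EVENT_LIMIT = 50_000;

    // Without the optimization, every attempt would add three events:
    // ActivityTaskScheduled, ActivityTaskStarted, ActivityTaskFailed.
    static long eventsWithoutOptimization(int attempts) {
        return 3L * attempts;
    }

    // With the optimization, only ActivityTaskScheduled is in history while the
    // activity is pending; ActivityTaskStarted and the closing event are added
    // once the activity completes or runs out of retries.
    static long eventsWithOptimization(boolean activityClosed) {
        return activityClosed ? 3L : 1L;
    }

    // Smallest number of failed attempts at which the unoptimized history
    // would reach the limit from these activity events alone.
    static int attemptsToReachLimit() {
        return (int) Math.ceil(HISTORY_EVENT_LIMIT / 3.0);
    }

    public static void main(String[] args) {
        for (int attempts : new int[] {1, 10, 1_000, 20_000}) {
            System.out.printf("attempts=%d unoptimized=%d optimized(pending)=%d%n",
                    attempts,
                    eventsWithoutOptimization(attempts),
                    eventsWithOptimization(false));
        }
        System.out.println("attempts to reach limit: " + attemptsToReachLimit());
    }
}
```

In this model the unoptimized history grows linearly with the number of attempts, while the optimized one stays constant no matter how many times the activity is retried.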

Writing the ActivityTaskStarted event only when your activity completes or fails after all of its retries can sometimes be confusing. Looking at the workflow history alone, it can seem as if your activity never started, for example:

1. WorkflowExecutionStarted (server created and persisted the workflow execution)
2. WorkflowTaskScheduled (the first workflow task has been placed onto the task queue)
3. WorkflowTaskStarted (a workflow worker picked up the task and started executing your workflow code)
4. WorkflowTaskCompleted (the worker executed your code until it reached the code that invokes an activity)
5. ActivityTaskScheduled (the worker sent the command to schedule the activity task to the server, and the server wrote the corresponding event into history)

At this point many users are not sure what’s going on: is their activity executing or not?
Activities that are executing but have not completed yet (or are retrying) are called “pending activities”.
You can see pending activity information via tctl, for example:

tctl wf desc -w <wfid> | jq .pendingActivities

Here you will be able to see each pending activity's id, state, attempt, scheduledTime (the scheduled time for the next activity retry), the identity of the worker that's executing the activity, as well as lastFailure information that includes the message (and stack trace) of the last activity failure.

In the Web UI you can also click on a workflow execution (run id) and then click on “Pending Activities” to see this information.

Via SDK APIs you could write code that, for example, gives you all workflow executions that have pending activities where the attempt count is > X; here is an example of this using the Java SDK. This can sometimes be useful for alerting purposes.
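The filtering step itself is simple, and can be sketched in plain Java. Note that this sketch does not call the Temporal SDK: `PendingActivity` is a hypothetical stand-in for the pending activity information you would get back from describing each workflow execution, and the class, field, and method names are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; none of these types come from the Temporal SDK.
class PendingActivityAlert {

    // Stand-in for one pending activity of one workflow execution.
    static final class PendingActivity {
        final String workflowId;
        final String activityId;
        final int attempt;

        PendingActivity(String workflowId, String activityId, int attempt) {
            this.workflowId = workflowId;
            this.activityId = activityId;
            this.attempt = attempt;
        }
    }

    // Returns the ids of workflows that have at least one pending activity
    // whose attempt count is greater than maxAttempt, in first-seen order.
    static List<String> workflowsOverThreshold(List<PendingActivity> pending, int maxAttempt) {
        List<String> result = new ArrayList<>();
        for (PendingActivity p : pending) {
            if (p.attempt > maxAttempt && !result.contains(p.workflowId)) {
                result.add(p.workflowId);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<PendingActivity> pending = Arrays.asList(
                new PendingActivity("order-1", "charge-card", 12),
                new PendingActivity("order-2", "send-email", 2),
                new PendingActivity("order-1", "reserve-stock", 30));
        // Flag workflows with any pending activity past 10 attempts.
        System.out.println(workflowsOverThreshold(pending, 10));
    }
}
```

In a real setup the list would be built from the pending activity details of each running workflow execution, and the resulting workflow ids would be fed into whatever alerting you use.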

Hope this helps 🙂