I’m working on a workflow that identifies “stale” entries – entries that have remained in a certain state for more than 4 hours – and then processes them according to our business needs.
Right now, the logic that determines staleness is inside an activity, which queries the DB for records where created_at <= now() - interval '4 hours'. We are considering two approaches:
- Let the activity compute “now” internally using datetime.utcnow()
- Have the workflow determine “now” via workflow.now(), then pass it to the activity as an input, so that time-based decisions are tied to the workflow’s controlled view of time
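To make the two options concrete, here is a minimal plain-Python sketch of the staleness check as it might look inside the activity body. The names STALE_AFTER, stale_cutoff and is_stale are hypothetical and exist only for illustration; the real activity would carry the Temporal activity decorator and run the DB query instead of comparing in memory. Calling it without a timestamp corresponds to option 1, and passing one in (for example the value of workflow.now()) corresponds to option 2.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Assumption: the 4-hour threshold from the question, kept as a constant.
STALE_AFTER = timedelta(hours=4)


def stale_cutoff(now: Optional[datetime] = None) -> datetime:
    """Return the created_at threshold at or before which an entry is stale.

    now=None  -> option 1: read the wall clock inside the activity.
    now=<ts>  -> option 2: use the timestamp handed in by the workflow.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    return now - STALE_AFTER


def is_stale(created_at: datetime, now: Optional[datetime] = None) -> bool:
    """Mirror of the SQL predicate: created_at <= now - 4 hours."""
    return created_at <= stale_cutoff(now)
```

With this shape, the only difference between the two options is who supplies the value of now; the comparison itself stays the same.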
There is an argument for each:
- Option 1 reflects real-world time and ensures that each retry of the activity uses the current clock. Otherwise, we risk introducing a bug by having the activity work with a timestamp that was calculated in the workflow and is then reused across activity retries
- Option 2 gives the workflow full control over the meaning of “now”, which makes workflow retries consistent and allows for deterministic, testable behavior (especially useful with start_time_skipping tests)
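The trade-off between these two arguments can be illustrated with a small simulation of an activity retry. This is only a sketch with made-up timestamps and a hypothetical count_stale helper, not Temporal code: the first attempt runs at t0, the retry runs ten minutes later at t1, and one entry crosses the 4-hour boundary in between.

```python
from datetime import datetime, timedelta, timezone

STALE_AFTER = timedelta(hours=4)


def count_stale(created_ats, now):
    """Count entries matching created_at <= now - 4 hours."""
    cutoff = now - STALE_AFTER
    return sum(1 for created_at in created_ats if created_at <= cutoff)


t0 = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)  # first attempt
t1 = t0 + timedelta(minutes=10)                        # retry, 10 minutes later

entries = [
    t0 - timedelta(hours=5),              # stale on both attempts
    t0 - timedelta(hours=3, minutes=55),  # turns stale 5 minutes after t0
]

# Option 1: each attempt reads its own clock, so the retry sees more entries.
option1 = [count_stale(entries, t0), count_stale(entries, t1)]

# Option 2: the workflow's timestamp (t0) is reused on the retry,
# so both attempts select the same set of entries.
option2 = [count_stale(entries, t0), count_stale(entries, t0)]

print(option1)  # [1, 2]
print(option2)  # [1, 1]
```

Under these assumptions, option 1 picks up the entry that became stale between attempts, while option 2 gives every attempt the same answer at the cost of a slightly older cutoff.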
The clarifications I wish to get are:
- Does the workflow become non-deterministic if a timestamp is calculated inside an activity? Or does it stay deterministic because, in the end, it still executes the same series of activities, regardless of what the activities do internally?
- Is it reasonable or recommended practice to pass workflow.now() into an activity when time-sensitive decisions like staleness are involved, keeping in mind activity and workflow retries?