Expiring Workflow -- How it behaves when changing activities

Let's say I have a workflow that runs every 30 days for a year and consists of three steps:

  1. x = execute_activity(a)
  2. if x: execute_activity(b)
  3. execute_activity(c)

What if I want to remove step (2)?

1- Will the workflows that are already waiting break?

2- If for some reason a workflow is running while step (2) is being removed, how will it behave?

TIA :pray:

Hi,

You should have a look at the Versioning documentation, which addresses exactly this problem. There are several possibilities depending on which version of Temporal you are using. The classical approach is workflow.GetVersion, a so-called version gate. To remove step two you would do something like:

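v := workflow.GetVersion(ctx, "remove-b", workflow.DefaultVersion, 1)
if v == workflow.DefaultVersion && x {
        // Executions started before the change still run step 2.
        if err := workflow.ExecuteActivity(ctx, b).Get(ctx, nil); err != nil {
                return err
        }
}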

Once all workflows running on the old version have completed, you can remove the version gate.

A newer approach is worker versioning, where you run multiple versions of the worker, each responsible for a specific version of the workflow. Whether this is an option in your case depends on which version/flavour of Temporal you are using.


We actually experimented with this in our staging and then production environments, and the behavior was completely different: the workflow skipped step 2 entirely, as if it didn’t exist.

So what we did was:

  • Kept activity b active for a few minutes.

  • Updated the workflow by removing step 2.

  • After a short delay (to ensure no active just-in-time executions were using it), we removed the activity.

My assumption is that the workflow fetches its JSON definition at runtime — but only after the start_delay time passes. That would explain why it behaved this way.

temporal version (python sdk): temporalio==1.8.0

Hi,

You can find sample code for Python here, describing what @hferentschik mentioned above.

Once all workflows running on the old version have completed, you can remove the version gate.

Just to add, you can use the search attribute TemporalChangeVersion to search for and filter Workflow Executions based on which version or patch they have. You should keep the versioned code in place if you plan to query the workflows after they complete.

Sorry, can you share more about the behavior described in your last comment?

If I understand correctly, at the beginning of every workflow, Temporal fetches the workflow body JSON, and if it was changed midway, it detects non-deterministic behaviour. So in my case, since the workflow is going to be executed after 30 days, Temporal doesn’t fetch the workflow body until the 30 days (or whatever the start_delay is) have passed.

If the workflow execution is not in the worker cache, the worker needs to fetch the workflow history to recover the workflow state (workflow replay) before doing anything else. This can happen in two situations:

  • the event history changes: a new event is added (like a signal, or a timer firing) and the server creates a workflow task to dispatch the event to the worker
  • workflow.query: even though queries are not written to the event history, they are dispatched to workers, and a worker needs to have the latest workflow state before running the query.

An NDE (non-determinism error) only happens during workflow replay (when the worker reruns the workflow code to recover the workflow state).
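To make the replay point concrete, here is a toy model in plain Python. It is not how Temporal is implemented: the history is reduced to a list of activity names, and replay, workflow_v1, workflow_v2 and NonDeterminismError are all hypothetical names made up for this sketch. It shows why an execution with no recorded activity events (for example, one assumed to be still waiting on its start delay) can run the new code without complaint, while one that already recorded step b cannot.

```python
from typing import Callable, List


class NonDeterminismError(Exception):
    """Raised by this toy model when replayed code disagrees with history."""


def replay(workflow_fn: Callable, history: List[str]) -> List[str]:
    """Re-run workflow_fn and check each scheduled activity against history.

    Returns the activities the code schedules beyond the recorded history.
    """
    issued: List[str] = []

    def execute_activity(name: str) -> bool:
        position = len(issued)
        if position < len(history) and history[position] != name:
            raise NonDeterminismError(
                f"history has {history[position]!r}, code scheduled {name!r}"
            )
        issued.append(name)
        return True  # toy result: activity a always yields a truthy x

    workflow_fn(execute_activity)
    if len(issued) < len(history):
        raise NonDeterminismError("code finished before consuming the history")
    return issued[len(history):]


def workflow_v1(execute_activity: Callable) -> None:
    x = execute_activity("a")
    if x:
        execute_activity("b")
    execute_activity("c")


def workflow_v2(execute_activity: Callable) -> None:
    # Step 2 removed without a version gate.
    execute_activity("a")
    execute_activity("c")


# Nothing recorded yet: the new code simply runs and step b never appears.
print(replay(workflow_v2, []))  # ['a', 'c']

# Step b was already recorded: replaying the new code conflicts with history.
try:
    replay(workflow_v2, ["a", "b"])
except NonDeterminismError as err:
    print("NDE:", err)
```

In this model, the first case matches the "skipped step 2 entirely" observation earlier in the thread, and the second is the case a version gate protects against.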


My assumption is that the workflow fetches its JSON definition at runtime — but only after the start_delay time passes. That would explain why it behaved this way.

Temporal doesn’t have a concept of “Workflow JSON”.
