Clarification on Workflow.getVersion and When It’s Safe to Remove

Hi,
I’m looking to clarify the behavior of Workflow.getVersion() and understand when it’s safe to fully remove it from workflow code.

Let’s say we introduce changes behind a version flag using Workflow.getVersion(), and later confirm that no workflows are running the old version. We then update the minSupported argument accordingly.

However, since getVersion() writes a marker to the event history, it seems that removing the call altogether could break determinism during workflow replay—because some workflows may still expect that marker to be present.

So my questions are:

  • When is it actually safe to remove getVersion()?
  • Is there a recommended process or API for deprecating/removing it cleanly without affecting in-flight workflows?

Thanks in advance!

Since SDK release 1.30, the SDK automatically upserts the TemporalChangeVersion search attribute.
To find out when a getVersion call can be removed, you can check visibility, for example:

TemporalChangeVersion="<change_id>-<version>" AND ExecutionStatus="Running"

Note that if your use case requires querying completed executions, you also have to wait until completed executions matching this criteria have been removed by namespace retention before you can safely remove the getVersion call from your code.
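If you check this for several change IDs, building the query string in one place avoids typos. The following is only a sketch: the class and method names are made up for illustration, it covers just the "<change_id>-<version>" format shown above, and it does not escape quotes inside the change ID.

```java
// Hypothetical helper (not part of the Temporal SDK): builds the visibility
// query shown above for a given change ID and version.
final class ChangeVersionQuerySketch {

    // Returns a query matching running executions that recorded the given
    // version for the given change ID. Assumes changeId contains no quotes.
    static String runningOnVersion(String changeId, int version) {
        return String.format(
                "TemporalChangeVersion=\"%s-%d\" AND ExecutionStatus=\"Running\"",
                changeId, version);
    }

    public static void main(String[] args) {
        // Prints the query for version 1 of the "some-change" change ID.
        System.out.println(runningOnVersion("some-change", 1));
    }
}
```

The resulting string can be pasted into the Web UI search box or passed to whatever you use to list or count workflows. When the count reaches zero (and, if needed, retention has cleared closed executions), that version is no longer in use.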

Thanks for the response! One follow-up: If the service is already deployed and continues to receive traffic, we’ll never fully drain the workflows. There will always be new executions starting with the version marker node.

So unless we adopt a deployment strategy (maybe using worker versioning) that isolates the old code (with the patch version API) from the new code (without it), we won’t be able to safely remove the Workflow.getVersion call, even if we clean up the corresponding branches.

Is that understanding correct? Do we always need to use patch versioning alongside worker versioning to safely roll out workflow changes?

Hi

If you use worker versioning, you don't necessarily need patching if you go with pinned versions.

"If the service is already deployed and continues to receive traffic, we'll never fully drain the workflows. There will always be new executions starting with the version marker node."

The query TemporalChangeVersion="<change_id>-<version>" lets you identify when the code paths for old versions no longer need to be maintained.

Also, removing getVersion itself shouldn't cause an NDE if the rest of the code is deterministic, at least in the latest SDK version that I have tested.

Thanks for the clarification!

I tested this out, and it works as expected.

We can safely remove the Workflow.getVersion() line from the code without affecting any in-flight workflows, as long as the minSupported argument of the call is set to Workflow.DEFAULT_VERSION.

Here’s the sequence that led to the issue I was facing:

Step 1: Initial rollout with versioning logic

void someWorkflowMethod() {
    int version = Workflow.getVersion("some-change", Workflow.DEFAULT_VERSION, 1);
    if (version == 1) {
        // new logic
    } else {
        // old logic
    }
}

Step 2: After all workflows using the old logic were drained

void someWorkflowMethod() {
    Workflow.getVersion("some-change", 1, 1);
    // new logic
}

Step 3: Removing the call caused a non-determinism error (NDE)

This step failed for in-flight workflows that had started before this change and were then replayed against this code:

void someWorkflowMethod() {
    // new logic only
}

The mistake I made was in Step 2: I changed minSupported to 1 instead of removing the versioning line entirely.

If I had just removed the getVersion() line in Step 2, it would have worked fine. That's what caused the NDE in my case.

This behavior seems to work only from Temporal version 1.28.0 onward.

Before that, even if minSupported was set to Workflow.DEFAULT_VERSION, I still ran into an NDE after removing the Workflow.getVersion() call.
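For anyone wondering why raising minSupported (as in Step 2) is only safe after the old executions are gone, here is a toy model of the range check as I understand it from the getVersion documentation. It is not SDK code: the class, method, and parameters are made up, and it ignores how markers are actually stored in history.

```java
// Toy model only (hypothetical names, not the Temporal SDK implementation).
final class GetVersionRangeSketch {

    // Same value as Workflow.DEFAULT_VERSION in the Java SDK.
    static final int DEFAULT_VERSION = -1;

    // recordedVersion: the version found in a history marker, or null if the
    // history has no marker for this change ID.
    static int resolve(Integer recordedVersion, boolean replaying,
                       int minSupported, int maxSupported) {
        int version;
        if (recordedVersion != null) {
            version = recordedVersion;      // replay: reuse the recorded value
        } else if (replaying) {
            version = DEFAULT_VERSION;      // code ran before getVersion existed
        } else {
            version = maxSupported;         // first execution: take the newest
        }
        if (version < minSupported || version > maxSupported) {
            throw new IllegalStateException(
                    "version " + version + " is outside ["
                            + minSupported + ", " + maxSupported + "]");
        }
        return version;
    }

    public static void main(String[] args) {
        // New execution under Step 1: records and returns 1.
        System.out.println(resolve(null, false, DEFAULT_VERSION, 1));
        // Old execution replayed under Step 1: still supported, returns -1.
        System.out.println(resolve(null, true, DEFAULT_VERSION, 1));
        // Old execution replayed under Step 2 (minSupported = 1): rejected.
        try {
            resolve(null, true, 1, 1);
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

In this model, an execution that never recorded a marker resolves to DEFAULT_VERSION on replay, so it fails as soon as minSupported is raised above that. That is why the visibility check matters before changing the range.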