Hi,
I’m looking to clarify the behavior of Workflow.getVersion() and understand when it’s safe to fully remove it from workflow code.
Let’s say we introduce changes behind a version flag using Workflow.getVersion(), and later confirm that no workflows are running the old version. We then update the minSupported argument accordingly.
However, since getVersion() writes a marker to the event history, it seems that removing the call altogether could break determinism during workflow replay—because some workflows may still expect that marker to be present.
So my questions are:
When is it actually safe to remove getVersion()?
Is there a recommended process or API for deprecating/removing it cleanly without affecting in-flight workflows?
Since release 1.30, the SDK automatically upserts the TemporalChangeVersion search attribute.
To know when it is safe to remove the call, you can check visibility, for example:
TemporalChangeVersion="<change_id>-<version>" AND ExecutionStatus="Running"
Note that if your use case requires querying completed executions, you must also wait until the completed executions matching this criterion have been removed by namespace retention before you can safely remove the getVersion call from your code.
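As a small illustration of the query above, here is a sketch of a helper that builds the filter string for a given change ID and version. The class and method names (ChangeVersionQuery, forVersion, runningWithVersion) are hypothetical and not part of the Temporal SDK; only the query format is taken from the example above.

```java
// Hypothetical helper (not part of the Temporal SDK): builds the visibility
// query shown above for a given change ID and version.
final class ChangeVersionQuery {
    private ChangeVersionQuery() {}

    /** Matches executions that recorded the given version of a change. */
    static String forVersion(String changeId, int version) {
        return "TemporalChangeVersion=\"" + changeId + "-" + version + "\"";
    }

    /** Same filter, restricted to executions that are still running. */
    static String runningWithVersion(String changeId, int version) {
        return forVersion(changeId, version) + " AND ExecutionStatus=\"Running\"";
    }

    public static void main(String[] args) {
        // Print the query so it can be pasted into a visibility search.
        System.out.println(runningWithVersion("some-change", 1));
    }
}
```

The resulting string can be passed to any visibility search, for example `temporal workflow count --query '<query>'`. Use forVersion alone when completed executions also matter.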
Thanks for the response! One follow-up: If the service is already deployed and continues to receive traffic, we’ll never fully drain the workflows. There will always be new executions starting with the version marker node.
So unless we adopt a deployment strategy (maybe using worker versioning) that isolates the old code (with the patch version API) from the new code (without it), we won’t be able to safely remove the Workflow.getVersion call, even if we clean up the corresponding branches.
Is that understanding correct? Do we always need to use patch versioning alongside worker versioning to safely roll out workflow changes?
If you use worker versioning, you don’t necessarily need patching, provided you go with PinnedVersion.
> If the service is already deployed and continues to receive traffic, we’ll never fully drain the workflows. There will always be new executions starting with the version marker node.
The query TemporalChangeVersion="<change_id>-<version>" lets you identify when the code paths for old versions no longer need to be maintained.
Also, removing getVersion itself shouldn’t cause an NDE if the rest of the code is deterministic, at least in the latest SDK version that I have tested.
We can safely remove the Workflow.getVersion() line from the code without affecting any in-flight workflows, as long as the minSupported argument in the call is set to Workflow.DEFAULT_VERSION.
Here’s the sequence that led to the issue I was facing:
Step 1: Initial rollout with versioning logic
void someWorkflowMethod() {
    int version = Workflow.getVersion("some-change", Workflow.DEFAULT_VERSION, 1);
    if (version == 1) {
        // new logic
    } else {
        // old logic
    }
}
Step 2: After all workflows using the old logic were drained
void someWorkflowMethod() {
    Workflow.getVersion("some-change", 1, 1);
    // new logic
}
Step 3: This caused a Non-Deterministic Error (NDE)
This step failed for in-flight workflows that started before the change but executed this code:
void someWorkflowMethod() {
    // new logic only
}
The mistake I made was in Step 2: I changed minSupported to 1 instead of removing the versioning line entirely.
If I had just removed the getVersion() line in Step 2, it would have worked fine. That’s what caused the NDE in my case.
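For anyone following along, the documented getVersion contract that these steps rely on can be sketched as a toy model. This is not the Temporal SDK: ToyVersionHistory and its explicit replaying flag are hypothetical, and the "history" is just a map. It only models the documented rules (the first execution records maxSupported, replay returns the recorded value or DEFAULT_VERSION when no marker exists, and a version outside the supported range is an error). It does not model what happens when the call is removed entirely.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of the documented getVersion contract. NOT the Temporal SDK:
// the "history" is a plain map from change ID to the recorded version.
final class ToyVersionHistory {
    static final int DEFAULT_VERSION = -1; // same value as Workflow.DEFAULT_VERSION

    private final Map<String, Integer> markers = new HashMap<>();

    /** replaying = true means the code is re-executed against an existing history. */
    int getVersion(String changeId, int minSupported, int maxSupported, boolean replaying) {
        Integer recorded = markers.get(changeId);
        int version;
        if (recorded != null) {
            version = recorded;          // marker present: always reuse it
        } else if (replaying) {
            version = DEFAULT_VERSION;   // history predates the versioned code
        } else {
            version = maxSupported;      // first execution: record a marker
            markers.put(changeId, version);
        }
        if (version < minSupported || version > maxSupported) {
            throw new IllegalStateException(
                "version " + version + " of " + changeId + " is not supported");
        }
        return version;
    }

    public static void main(String[] args) {
        // Workflow started on the Step 1 code: marker with version 1 is recorded.
        ToyVersionHistory newRun = new ToyVersionHistory();
        System.out.println(newRun.getVersion("some-change", DEFAULT_VERSION, 1, false));
        // Replaying it on the Step 2 code (minSupported = 1) still works.
        System.out.println(newRun.getVersion("some-change", 1, 1, true));

        // Workflow started before Step 1: no marker, so replay sees DEFAULT_VERSION.
        ToyVersionHistory oldRun = new ToyVersionHistory();
        System.out.println(oldRun.getVersion("some-change", DEFAULT_VERSION, 1, true));
        try {
            oldRun.getVersion("some-change", 1, 1, true); // Step 2 code, old history
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

In this model, raising minSupported is only safe once no history without a version 1 marker can be replayed, which is what "drained" means in Step 2.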