Potential impact of workflow update

func MyWorkflow(...) ... {
    // Make sure that already running workflows will use DefaultVersion
    GetVersion("add_paypal", 1) // result is ignored here
    ...
    // at the end of the workflow
    v := GetVersion(ctx, "add_paypal", DefaultVersion, 1)
    if v == DefaultVersion {
        err = workflow.ExecuteActivity(ctx, CreditCardPay).Get(ctx, nil)
    } else {
        err = workflow.ExecuteActivity(ctx, PaypalPay).Get(ctx, nil)
    }
}

For the first call, GetVersion("add_paypal", 1), what will be the returned value? Although it is not important and is ignored here.

You are the first one to ask for this. So it is exceptional :).

Different workflow instances should correspond to different workflow versions, right?

This approach doesn't work for the majority of long-running workflows. For example, a bug at the end of a workflow cannot be fixed for already running workflows using your approach.

If the workflow already passed this statement before the code was changed, DefaultVersion is returned. If it is a new workflow and this code is executed for the first time, version 1 is returned.

Will this be added in workflow v0 initially, or is it added only with the second GetVersion call, together with the workflow definition change and the if/else check based on version?

No, it is added only when changing the workflow. No need to add initially.

Why does this only have 2 params? Shouldn't it have a min and max supported version? Should it be GetVersion(ctx, "add_paypal", DefaultVersion, 1) as well?

I summarized as follows, am I understanding correctly?

// workflow V0
100: func PaymentWorkflow() {
        ...
150:    err = workflow.ExecuteActivity(ctx, PaymentCreditCard).Get(ctx, nil)
        ...
200: }
// workflow V1
100: func PaymentWorkflow() {
        // Make sure that already running workflows will use DefaultVersion
101:    GetVersion(ctx, "paypal_change", DefaultVersion, 1) // call 1
        ...
151:    v := GetVersion(ctx, "paypal_change", DefaultVersion, 1) // call 2
152:    if v == DefaultVersion {
153:       err = workflow.ExecuteActivity(ctx, PaymentCreditCard).Get(ctx, nil)
154:    } else {
155:       err = workflow.ExecuteActivity(ctx, PaymentPaypal).Get(ctx, nil)
156:    }
        ...
206: }

There are four scenarios for code path selection for workflow version V1:

  1. New workflow instances:
  • GetVersion call 2 will return version 1
  • Executing PaymentPaypal
  2. Running workflow instances which have not reached line 150 in V0 previously:
  • GetVersion call 2 will return version 0
  • Executing PaymentCreditCard
  3. Replaying a running workflow instance which has passed line 150 in V0 previously:
  • GetVersion call 2 will return version 0
  • Executing PaymentCreditCard
  4. Replaying a finished workflow instance:
  • GetVersion call 2 will return version 0
  • Executing PaymentCreditCard

So:

  1. GetVersion call 1 and call 2 will always return the same value for the same workflow because they have the same changeID.
  2. It does not matter whether the running workflow has reached line 150 or not; as long as it was already running before the PayPal activity was added, the first and second GetVersion calls will always return 0.

appreciated!

Yes. My sample was wrong. It always has 3 parameters.

  1. Correct.
  2. As long as it passed line 101 before the change, 0 will be returned.

Thanks, one last question: what is the point of having the first GetVersion call, given that the first and second calls will always return the same value? Is it because we are probably going to have more changes in the future, and not all GetVersion calls will have the same changeId param?

For my specific example, having only the second call would be enough, right?

The reason for sharing a changeId among multiple calls is coordinated changes: when workflow code is changed in multiple places and all of these changes should be enabled together.
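For illustration, a minimal sketch of two coordinated branch points sharing one changeId, assuming the Temporal Go SDK (the activity names here are made up):

package sample

import (
    "context"
    "time"

    "go.temporal.io/sdk/workflow"
)

// Hypothetical activities, only here to make the sketch self-contained.
func ReserveOld(ctx context.Context) error { return nil }
func ReserveNew(ctx context.Context) error { return nil }
func ChargeOld(ctx context.Context) error  { return nil }
func ChargeNew(ctx context.Context) error  { return nil }

func OrderWorkflow(ctx workflow.Context) error {
    ctx = workflow.WithActivityOptions(ctx, workflow.ActivityOptions{
        StartToCloseTimeout: time.Minute,
    })

    // First change point: both change points share the changeId "switch_provider".
    v := workflow.GetVersion(ctx, "switch_provider", workflow.DefaultVersion, 1)
    var err error
    if v == workflow.DefaultVersion {
        err = workflow.ExecuteActivity(ctx, ReserveOld).Get(ctx, nil)
    } else {
        err = workflow.ExecuteActivity(ctx, ReserveNew).Get(ctx, nil)
    }
    if err != nil {
        return err
    }

    // Second change point: the same changeId returns the same version for this
    // execution, so the charge step always matches the reserve step above.
    v = workflow.GetVersion(ctx, "switch_provider", workflow.DefaultVersion, 1)
    if v == workflow.DefaultVersion {
        return workflow.ExecuteActivity(ctx, ChargeOld).Get(ctx, nil)
    }
    return workflow.ExecuteActivity(ctx, ChargeNew).Get(ctx, nil)
}

Because both calls share "switch_provider", a given execution either takes both new branches or both old ones; the two changes are enabled together per workflow.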

In your case the only reason to have the first call is to force second to return the DefaultVersion for workflows that already started. I don’t understand why you have this requirement. In the majority of the situations returning the latest version for the code which is executed for the first time is a better option.

Not sure if my requirement is a corner case or not. There are definitely cases where we don't want existing users to run through the new code. We thought that we would have to wait for all running instances to finish before redeploying the service, but it seems GetVersion can solve the problem.

I’m actually with you on this one, especially for long-running workflows (think several years).

In my ideal scenario, workflow instances are immutable:

  • When a workflow is started, it instantiates whatever workflow definition is available at that moment
  • If the workflow definition changes, the already-started workflows will never know about it (they have their own “snapshot” of the workflow definition they were asked to run)

I think AWS Simple Workflow provides something like this by leveraging task lists/queues.

This allows several things:

  • Determinism: a workflow instance will always know the state transitions it will have to take, depending on the data that it processes. No unexpected surprises.
  • No explicit version handling in the code. The workflow definition can evolve through time without any explicit checks/flags. This is especially important with long-running workflows (e.g. a workflow that is supposed to run for years and years). If we need to keep track of all these different versions explicitly in the code throughout the years, there is no limit to how many version checks we would need to put in place in the workflow definition to ensure everything is compatible.

On rare occasions we might want to run existing workflow instances using a new workflow definition. In those cases we'd restart them explicitly, sort of like an explicit migration.
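Concretely, the explicit migration we have in mind would be something along these lines (only a sketch assuming the Temporal Go SDK; the workflow name, IDs and task queue name are placeholders):

package sample

import (
    "context"

    "go.temporal.io/sdk/client"
)

// Sketch only: stop an instance running the old definition and start a fresh
// one on the task queue served by the workers with the new definition.
func migrateToV2(ctx context.Context, c client.Client, workflowID string) error {
    if err := c.TerminateWorkflow(ctx, workflowID, "", "migrating to v2"); err != nil {
        return err
    }
    _, err := c.ExecuteWorkflow(ctx, client.StartWorkflowOptions{
        ID:        workflowID + "-v2",
        TaskQueue: "payment-v2",
    }, "PaymentWorkflow")
    return err
}

The non-trivial part, of course, is carrying over whatever state the old instance had already built up.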

Such an approach is only possible if you keep all the versions of the code available for a very long time. In Java it could be implemented by using class loaders. In Go I'm not sure if you want to run that many worker versions for a very long time.

The simplest way to implement this using multiple worker pools is by using a different task queue name per version.
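Roughly, assuming the Temporal Go SDK (task queue and type names are illustrative), each versioned worker pool is just a worker bound to its own task queue:

package main

import (
    "context"
    "log"

    "go.temporal.io/sdk/client"
    "go.temporal.io/sdk/worker"
    "go.temporal.io/sdk/workflow"
)

const taskQueueV2 = "payment-v2" // one task queue per workflow code version

// Stand-ins for the workflow and activity discussed above.
func PaymentWorkflow(ctx workflow.Context) error { return nil }
func PaymentPaypal(ctx context.Context) error    { return nil }

func main() {
    c, err := client.Dial(client.Options{})
    if err != nil {
        log.Fatal(err)
    }
    defer c.Close()

    // This worker pool serves only the v2 task queue, so it only executes
    // workflow instances that were deliberately started against the v2 code.
    w := worker.New(c, taskQueueV2, worker.Options{})
    w.RegisterWorkflow(PaymentWorkflow)
    w.RegisterActivity(PaymentPaypal)
    if err := w.Run(worker.InterruptCh()); err != nil {
        log.Fatal(err)
    }
}

New instances are pinned to the new code by starting them with client.StartWorkflowOptions{TaskQueue: taskQueueV2}, while the previous worker pool keeps draining the old task queue until its workflows finish.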

Yes, multiple task queues is the approach we’ll be taking for now, so we’ll have multiple workers running in parallel, and migrating old workflow instances to new versions as we go.

Soon you will end up with dozens of worker pools running for years. Not fun.

Yeah, that’s the drawback, but depending on the use case and setup of a specific team it can be preferable or not.

To us it is, since having another worker just means an extra pod, and we can always do the migration if necessary. We prefer to move the complexity out of the code and into the “infrastructure”, but we're fairly new users, so we'll see how it goes after we put this into production.