Replay for non-deterministic change

I was trying to write some example code to understand how replay detects non-deterministic changes to workflow code. The workflow code looks like this:

import (
	"context"
	"time"

	"go.temporal.io/sdk/activity"
	"go.temporal.io/sdk/workflow"
)

type ActivityOutput struct {
	Output string
}

func Workflow(ctx workflow.Context, name string) error {
	ao := workflow.ActivityOptions{
		StartToCloseTimeout: 10 * time.Second,
	}
	ctx = workflow.WithActivityOptions(ctx, ao)

	logger := workflow.GetLogger(ctx)
	logger.Info("Workflow started", "name", name)

	var result ActivityOutput
	err := workflow.ExecuteActivity(ctx, ActivityA).Get(ctx, &result)
	if err != nil {
		logger.Error("Activity failed.", "Error", err)
		return err
	}

	logger.Info("sleep for a while")
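	// Propagate the error (e.g. cancellation) instead of discarding it.
	if err := workflow.Sleep(ctx, time.Second*10); err != nil {
		return err
	}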
	return nil
}

func ActivityA(ctx context.Context) (ActivityOutput, error) {
	logger := activity.GetLogger(ctx)
	logger.Info("enter ActivityA")
	return ActivityOutput{
		Output: "ActivityA",
	}, nil
}

func ActivityB(ctx context.Context) (ActivityOutput, error) {
	logger := activity.GetLogger(ctx)
	logger.Info("enter ActivityB")
	return ActivityOutput{
		Output: "ActivityB",
	}, nil
}

func ActivityC(ctx context.Context) (ActivityOutput, error) {
	logger := activity.GetLogger(ctx)
	logger.Info("enter ActivityC")
	return ActivityOutput{
		Output: "ActivityC",
	}, nil
}

After the workflow execution reached the sleep, I killed the worker, changed the code to execute ActivityB instead of ActivityA, and started a new worker with the new workflow definition, hoping it would pick up the previous workflow execution and replay its event history against the new definition. I expected a non-deterministic error, since the replay would schedule ActivityB while the event history recorded ActivityA. Instead, the workflow execution ran to completion successfully. I’m wondering why.

Here’s a screenshot of the workflow execution event history.

Hello @Kevin_Meng

I guess there are some edge cases in which the SDK does not compare the activity name on replay.

Can you try the following?

- activityA
- sleep
- activityB

When the workflow reaches the sleep, stop the worker, swap activityA and activityB, and start the worker again:

- activityB
- sleep
- activityA

When the timer fires you should get a non-deterministic error.

Let me know how it goes.
Antonio

I also suspect this is an edge case.

The previous workflow definition was:

- ActivityA
- Sleep

Then I changed ActivityA to ActivityB and added a new ActivityC after the sleep:

- ActivityB
- Sleep
- ActivityC

If I kill the previous worker during the sleep and replay on the new workflow definition, it causes a non-deterministic error, as expected.
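
For anyone following along, here is a toy model of the comparison this thread is about: the steps recorded in the event history are checked, in order, against the steps the new code produces on replay. This is only a sketch and not the SDK's actual logic. The Step type and the firstMismatch function are hypothetical names made up for illustration, and the model is deliberately strict. It would also flag the single ActivityA to ActivityB change from the first post, which the real worker did not flag, so the SDK evidently does not apply the check this strictly in every case.

```go
package main

import "fmt"

// Step is a hypothetical, simplified stand-in for both a recorded history
// event and a command produced by the workflow code during replay.
type Step struct {
	Kind string // "activity" or "timer"
	Name string // activity type name; empty for timers
}

// firstMismatch returns the index of the first recorded step that the
// replayed steps fail to reproduce, or -1 if every recorded step matches.
// Steps produced beyond the end of the recorded history are not treated
// as mismatches in this model: they are new work, not replayed work.
func firstMismatch(recorded, replayed []Step) int {
	for i, want := range recorded {
		if i >= len(replayed) || replayed[i] != want {
			return i
		}
	}
	return -1
}

func main() {
	// History recorded by the old definition, up to the pending timer.
	recorded := []Step{{"activity", "ActivityA"}, {"timer", ""}}

	// New definition from the last post: ActivityB, Sleep, ActivityC.
	changed := []Step{{"activity", "ActivityB"}, {"timer", ""}, {"activity", "ActivityC"}}

	// Old steps kept as they were, with ActivityC appended after the sleep.
	appended := []Step{{"activity", "ActivityA"}, {"timer", ""}, {"activity", "ActivityC"}}

	fmt.Println(firstMismatch(recorded, changed))  // 0
	fmt.Println(firstMismatch(recorded, appended)) // -1
}
```

In this model the first call reports a mismatch at index 0 (ActivityB where ActivityA was recorded), while the second reports none because the recorded prefix is reproduced exactly.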