Replay for non-deterministic change

Kevin_Meng · June 20, 2023, 7:44am

I wrote some example code to understand how replay detects non-deterministic changes to workflow code. The workflow code looks like this:

// Package name is a placeholder; the post did not include one.
package app

import (
	"context"
	"time"

	"go.temporal.io/sdk/activity"
	"go.temporal.io/sdk/workflow"
)

type ActivityOutput struct {
	Output string
}

func Workflow(ctx workflow.Context, name string) error {
	ao := workflow.ActivityOptions{
		StartToCloseTimeout: 10 * time.Second,
	}
	ctx = workflow.WithActivityOptions(ctx, ao)

	logger := workflow.GetLogger(ctx)
	logger.Info("Workflow started", "name", name)

	var result ActivityOutput
	err := workflow.ExecuteActivity(ctx, ActivityA).Get(ctx, &result)
	if err != nil {
		logger.Error("Activity failed.", "Error", err)
		return err
	}

	logger.Info("sleep for a while")
	_ = workflow.Sleep(ctx, time.Second*10)
	return nil
}

func ActivityA(ctx context.Context) (ActivityOutput, error) {
	logger := activity.GetLogger(ctx)
	logger.Info("enter ActivityA")
	return ActivityOutput{
		Output: "ActivityA",
	}, nil
}

func ActivityB(ctx context.Context) (ActivityOutput, error) {
	logger := activity.GetLogger(ctx)
	logger.Info("enter ActivityB")
	return ActivityOutput{
		Output: "ActivityB",
	}, nil
}

func ActivityC(ctx context.Context) (ActivityOutput, error) {
	logger := activity.GetLogger(ctx)
	logger.Info("enter ActivityC")
	return ActivityOutput{
		Output: "ActivityC",
	}, nil
}

After the workflow execution reached the sleep, I killed the worker and changed the code to execute ActivityB instead of ActivityA. I then started a new worker with the new workflow definition, hoping it would pick up the previous workflow execution and replay its event history against the new definition. I expected a non-deterministic error, since the replayed code would request ActivityB while the event history records ActivityA. Instead, the workflow execution ran to completion successfully. I’m wondering why.

Here’s a screen shot of the workflow execution event history.
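The expectation in the post above can be made concrete with a toy model of the comparison it assumes. This is only a sketch using the standard library: `Step` and `checkReplay` are hypothetical names invented for illustration, not types or functions from the Temporal Go SDK, and the SDK's real matching logic is more involved.

```go
package main

import "fmt"

// Step is a simplified record of one workflow action: an activity or a timer.
// In this toy model it stands in for both history events and replay commands.
// (Hypothetical type, not part of the Temporal SDK.)
type Step struct {
	Kind string // "activity" or "timer"
	Name string // activity type name; empty for timers
}

// checkReplay walks the recorded history in order and compares each entry
// with the step the (possibly changed) workflow code produces at the same
// position. Steps past the end of the history are new progress rather than
// replay, so they are not compared.
func checkReplay(history, commands []Step) error {
	for i, recorded := range history {
		if i >= len(commands) {
			return fmt.Errorf("history step %d (%s %q) has no matching command",
				i, recorded.Kind, recorded.Name)
		}
		if commands[i] != recorded {
			return fmt.Errorf("step %d: code requested %s %q, history recorded %s %q",
				i, commands[i].Kind, commands[i].Name, recorded.Kind, recorded.Name)
		}
	}
	return nil
}
```

Given a history of ActivityA followed by a timer, passing commands of ActivityB followed by a timer makes `checkReplay` return an error, which is the outcome the post anticipated. The real worker reported no error in this case, so the SDK evidently does not behave like this simple model on every workflow task.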

antonio.perez · June 21, 2023, 3:34pm

Hello @Kevin_Meng

I guess there are some edge cases in which the SDK does not compare the activity name on replay.

Can you try the following?

- activityA
- sleep
- activityB

When the workflow reaches the sleep, stop the worker, swap activityA and activityB, and start the worker again:

- activityB
- sleep
- activityA

When the timer fires you should get a non-deterministic error.

Let me know how it goes.
Antonio

Kevin_Meng · June 21, 2023, 10:41pm

I also suspect this is an edge case.

The previous workflow definition is:

- ActivityA
- Sleep

Then I changed ActivityA to ActivityB and added a new ActivityC after the sleep, like this:

- ActivityB
- Sleep
- ActivityC

If I kill the previous worker during the sleep and replay on the new workflow definition, it causes a non-deterministic error, as expected.
