Non-determinism issue while replaying mutable side-effect behind workflow versioning

Hi team,

We have encountered a potential workflow replaying issue in the go-sdk. Need your help to validate this and suggest on potential fixes.

Issue Description
We are seeing non-determinism error when using mutable side-effects behind the workflow versioning markers.

PanicError: lookup failed for scheduledEventID to activityID: scheduleEventID: 13, activityID: 13 
process event for example-task-queue [panic]:
	/Users/nitesh/go/pkg/mod/*commandsHelper).handleActivityTaskScheduled(0x14001a315e0, {0x14001507798, 0x2}, 0x40?)
	/Users/nitesh/go/pkg/mod/ +0xf0*workflowExecutionEventHandlerImpl).ProcessEvent(0x1400089ef90, 0x14001a29b00, 0x20?, 0x0)
	/Users/nitesh/go/pkg/mod/ +0x248*workflowExecutionContextImpl).ProcessWorkflowTask(0x140015fd3b0, 0x1400137c600)
	/Users/nitesh/go/pkg/mod/ +0xed0*workflowTaskHandlerImpl).ProcessWorkflowTask(0x14000cd6c30, 0x1400137c600, 0x140018daa20)
	/Users/nitesh/go/pkg/mod/ +0x364*workflowTaskPoller).processWorkflowTask(0x14000ccc900, 0x1400137c600)
	/Users/nitesh/go/pkg/mod/ +0x2d0*workflowTaskPoller).ProcessTask(0x14000ccc900, {0x106c9dac0?, 0x1400137c600})
	/Users/nitesh/go/pkg/mod/ +0x70*baseWorker).processTask(0x14000ca28c0, {0x106c9d640, 0x140019515a0})
	/Users/nitesh/go/pkg/mod/ +0x15c
created by*baseWorker).runTaskDispatcher
	/Users/nitesh/go/pkg/mod/ +0x98

How to Reproduce
Try to replay workflow history or perform reset post-side-effect marker.

I have created a sample workflow to reproduce the same.

// Workflow is a Hello World workflow definition.
func Workflow(ctx workflow.Context, name string) (string, error) {
	uid := ""
	logger := workflow.GetLogger(ctx)
	logger.Info("HelloWorld workflow started", "name", name)

	v := workflow.GetVersion(ctx, "mutable-side-effect-bug", workflow.DefaultVersion, 1)
	if v == 1 {
		var err error
		uid, err = generateUUID(ctx)
		if err != nil {
			logger.Error("failed to generated uuid", "Error", err)
			return "", err

		logger.Info("generated uuid", "uuid-val", uid)

	ao := workflow.ActivityOptions{
		StartToCloseTimeout: 10 * time.Second,
	ctx = workflow.WithActivityOptions(ctx, ao)

	var result string
	err := workflow.ExecuteActivity(ctx, Activity, name).Get(ctx, &result)
	if err != nil {
		logger.Error("Activity failed.", "Error", err)
		return "", err

	logger.Info("HelloWorld workflow completed.", "result", result)

	return uid, nil

func generateUUID(ctx workflow.Context) (string, error) {
	var generatedUUID string

	err := workflow.MutableSideEffect(ctx, "generate-random-uuid", func(ctx workflow.Context) interface{} {
		return uuid.NewString()
	}, func(a, b interface{}) bool {
		return a.(string) == b.(string)
	if err != nil {
		return "", err

	return generatedUUID, nil

func Activity(ctx context.Context, name string) (string, error) {
	logger := activity.GetLogger(ctx)
	logger.Info("Activity", "name", name)
	return "Hello " + name + "!", nil

Potential Hypothesis
As part of this change, we are not recording side effect marker command while replaying based on entry in wc.mutableSideEffectsRecorded map which works in normal scenario. However, there is a catch, it seems while replaying GetVersion doesn’t generate a command explicitly (don’t know the reason). This is leading to mis-match in wc.commandsHelper.nextCommandEventID which is used as key to perform marker record lookup and hence leading to non-determinism.

Again, this is based on the debugging that I was able to do based on my limited knowledge. Please feel free to correct me if I am wrong. Thanks.

1 Like

cc: @Chad_Retz

This looks like a bug and that’s a great simple replication, thanks! Can you open an issue on the Go SDK repo with this same information?

Sure, thanks for confirming.