We’ve been using Temporal for a while now and have developed a way to have global and per-user feature flags that control the execution of Workflows. To ensure that workflows execute consistently to completion, we check the value of the flag in an Activity so that its result is recorded in the workflow event history. We’ve written a small helper function for this check, which looks like the following:
// this function is called at the workflow level to determine
// if a flag is set for a user / globally
func IsEnabled(ctx workflow.Context, name string) (bool, error) {
	// if the workflow started executing before we cared about this flag,
	// assume it is not set.
	v := workflow.GetVersion(ctx, name, workflow.DefaultVersion, 1)
	if v < 1 {
		return false, nil
	}
	var val bool
	err := workflow.ExecuteActivity(ctx, (*Activities).IsFeatureFlagEnabled, name).Get(ctx, &val)
	return val, err
}
The trouble comes when we want to remove the flag from the system. We assume that it has been set to true for all users and globally, but simply removing the check from the workflow code would cause determinism errors: on replay, in-flight workflows have an IsFeatureFlagEnabled activity in their event history that the new code would no longer schedule.
What we’ve been doing is something like the following as an interim step while all in-flight work completes (error handling elided for brevity):
// workflow source, before flag removal
useV2, _ := ff.IsEnabled(ctx, "my-flag-name")
if useV2 {
	newBehavior()
} else {
	oldBehavior()
}

// workflow source, during flag removal
v := workflow.GetVersion(ctx, "removeMyFlag", workflow.DefaultVersion, 1)
if v < 1 {
	// ensure we still do a "flag lookup" for in-flight workflows,
	// but assume they are always returning true, since the flag is
	// rolled out - post-release workflows will skip this check
	_, _ = ff.IsEnabled(ctx, "my-flag-name")
}
newBehavior()

// workflow source, after all in-flight workflows finish
newBehavior()
What I’d like is a programming model similar to workflow.GetVersion itself, where it’s safe for me to simply drop the feature flag check once I am sure no in-flight workflows are using the old path, without the absence of an IsFeatureFlagEnabled activity causing a determinism error.
Are SideEffect or LocalActivityExecution a solution here, or do they both risk the same determinism error since they write to the event history as well? Is there some other solution I should look at?