Workflow errors after adding custom search attribute

Hi, I’m currently testing temporal with php-sdk. I’m using mysql and elasticsearch, because I use these technologies in production. I’m using docker-compose setup for elasticsearch and mysql.
While testing, a weird thing happens to me. I add a custom search attribute via tctl and after this change, my workflows never end, because of a conversion error, which I found in the temporal logs. This is the log with stack trace:

:"Panic is captured","service":"history","sys-stack-trace":"goroutine 119430 [running]:\nruntime/debug.Stack(0x1ed6640, 0x1ed2c20, 0xc002491ec0)\n\t
/usr/local/go/src/runtime/debug/stack.go:24 +0x9f\ngo.temporal.io/server/common/log.CapturePanic(0x2558138, 0xc000860ec0, 0xc0024a1760)\n\t
/temporal/common/log/panic.go:49 +0x95\npanic(0x1ed2c20, 0xc002491ec0)\n\t/usr/local/go/src/runtime/panic.go:965 +0x1b9\n
go.temporal.io/server/common.FromConfigToDefaultRetrySettings(0xc0005e4a20, 0x0, 0xc0005e4a20, 0xe000000000000001, 0xc00132a130)\n\t
/temporal/common/util.go:498 +0x23f\ngo.temporal.io/server/service/history.(*commandAttrValidator).validateActivityRetryPolicy(0xc0008c07c0, 0xc001ca7490, 0x0, 0x0)\n\t
/temporal/service/history/commandChecker.go:640 +0x5a\ngo.temporal.io/server/service/history.(*commandAttrValidator).validateActivityScheduleAttributes(0xc0008c07c0, 0xc0005a6720, 0x24, 0xc0005a6720, 0x24, 0xc001ca7490, 0x0, 0x106054f, 0xc000d4af40)\n\t
/temporal/service/history/commandChecker.go:193 +0x174\ngo.temporal.io/server/service/history.(*workflowTaskHandlerImpl).handleCommandScheduleActivity.func1(0x2514b00, 0xc000d4af40)\n\t/temporal/service/history/workflowTaskHandler.go:202 +0x77\ngo.temporal.io/server/service/history.(*workflowTaskHandlerImpl).validateCommandAttr(0xc0013e9970, 0xc0024a0a58, 0x2, 0x203000, 0x8)\n\t/temporal/service/history/workflowTaskHandler.go:921 +0x2b\ngo.temporal.io/server/service/history.(*workflowTaskHandlerImpl).handleCommandScheduleActivity(0xc0013e9970, 0xc001ca7490, 0xc0013e9970, 0x40e2f8)\n\t
/temporal/service/history/workflowTaskHandler.go:200 +0x2c5\ngo.temporal.io/server/service/history.(*workflowTaskHandlerImpl).handleCommand(0xc0013e9970, 0xc001cf75a8, 0xc0013e9970, 0xc0007c3d01)\n\t
/temporal/service/history/workflowTaskHandler.go:137 +0x7d\ngo.temporal.io/server/service/history.(*workflowTaskHandlerImpl).handleCommands(0xc0013e9970, 0xc00001ee10, 0x1, 0x1, 0xc0014c2fc0, 0xc0008c07c0)\n\t
/temporal/service/history/workflowTaskHandler.go:125 +0x59\ngo.temporal.io/server/service/history.(*workflowTaskHandlerCallbacksImpl).handleWorkflowTaskCompleted(0xc0004ee540, 0x254ea98, 0xc002491740, 0xc001cf7590, 0x0, 0x0, 0x0)\n\t
/temporal/service/history/workflowTaskHandlerCallbacks.go:419 +0x21b7\ngo.temporal.io/server/service/history.(*historyEngineImpl).RespondWorkflowTaskCompleted(0xc0008b8000, 0x254ea98, 0xc002491740, 0xc001cf7590, 0x26, 0x257cb78, 0xc0008b8000)\n\t
/temporal/service/history/historyEngine.go:1355 +0x52\ngo.temporal.io/server/service/history.(*Handler).RespondWorkflowTaskCompleted(0xc0005f4f00, 0x254ea98, 0xc002491740, 0xc001cf7590, 0x0, 0x0, 0x0)\n\t
/temporal/service/history/handler.go:486 +0x9c6\ngo.temporal.io/server/api/historyservice/v1._HistoryService_RespondWorkflowTaskCompleted_Handler.func1(0x254ea98, 0xc002491740, 0x21032c0, 0xc001cf7590, 0x1, 0x1, 0x7f4e0eb03301, 0xc002491800)\n\t
/temporal/api/historyservice/v1/service.pb.go:969 +0x8b\ngo.temporal.io/server/common/rpc/interceptor.(*RateLimitInterceptor).Intercept(0xc00088d320, 0x254ea98, 0xc002491740, 0x21032c0, 0xc001cf7590, 0xc000d4aee0, 0xc001cf7608, 0x3, 0x2515640, 0xc002491800, ...)\n\t
/temporal/common/rpc/interceptor/rate_limit.go:79 +0x179\ngoogle.golang.org/grpc.getChainUnaryHandler.func1(0x254ea98, 0xc002491740, 0x21032c0, 0xc001cf7590, 0x20645a0, 0xc002491710, 0x254ea98, 0xc002491740)\n\t
/go/pkg/mod/google.golang.org/grpc@v1.36.0/server.go:1058 +0xe7\ngo.temporal.io/server/common/rpc/interceptor.(*TelemetryInterceptor).Intercept(0xc000973500, 0x254ea98, 0xc002491740, 0x21032c0, 0xc001cf7590, 0xc000d4aee0, 0xc001cfd280, 0x0, 0x0, 0x0, ...)\n\t
/temporal/common/rpc/interceptor/telemetry.go:95 +0x2fe\ngoogle.golang.org/grpc.getChainUnaryHandler.func1(0x254ea98, 0xc002491410, 0x21032c0, 0xc001cf7590, 0x1, 0xc001cfd240, 0xc001a8dac8, 0xc2dcc8)\n\t
/go/pkg/mod/google.golang.org/grpc@v1.36.0/server.go:1058 +0xe7\ngo.temporal.io/server/common/rpc.ServiceErrorInterceptor(0x254ea98, 0xc002491410, 0x21032c0, 0xc001cf7590, 0xc000d4aee0, 0xc001cfd240, 0xc001cfd240, 0x18, 0x1f30140, 0xc002491401)\n\t
/temporal/common/rpc/grpc.go:100 +0x55\ngoogle.golang.org/grpc.chainUnaryServerInterceptors.func1(0x254ea98, 0xc002491410, 0x21032c0, 0xc001cf7590, 0xc000d4aee0, 0xc001cf7608, 0xc000ec5ba0, 0x50e086, 0x203b720, 0xc002491410)\n\t
/go/pkg/mod/google.golang.org/grpc@v1.36.0/server.go:1044 +0xd0\ngo.temporal.io/server/api/historyservice/v1._HistoryService_RespondWorkflowTaskCompleted_Handler(0x21d2be0, 0xc0005f4f00, 0x254ea98, 0xc002491410, 0xc001a0e600, 0xc00060cb00, 0x254ea98, 0xc002491410, 0xc002498000, 0x3c4)\n\t
/temporal/api/historyservice/v1/service.pb.go:971 +0x150\ngoogle.golang.org/grpc.(*Server).processUnaryRPC(0xc0008fe700, 0x2567718, 0xc001aa2a80, 0xc000642f00, 0xc0008c8390, 0x332df90, 0x0, 0x0, 0x0)\n\t
/go/pkg/mod/google.golang.org/grpc@v1.36.0/server.go:1217 +0x52b\ngoogle.golang.org/grpc.(*Server).handleStream(0xc0008fe700, 0x2567718, 0xc001aa2a80, 0xc000642f00, 0x0)\n\t
/go/pkg/mod/google.golang.org/grpc@v1.36.0/server.go:1540 +0xd0c\ngoogle.golang.org/grpc.(*Server).serveStreams.func1.2(0xc00134d220, 0xc0008fe700, 0x2567718, 0xc001aa2a80, 0xc000642f00)\n\t
/go/pkg/mod/google.golang.org/grpc@v1.36.0/server.go:878 +0xab\ncreated by google.golang.org/grpc.(*Server).serveStreams.func1\n\t
/go/pkg/mod/google.golang.org/grpc@v1.36.0/server.go:876 +0x1fd\n",
"error":"interface conversion: interface {} is int, not float64","logging-call-at":"panic.go:51","stacktrace":"go.temporal.io/server/common/log.(*zapLogger).Error\n\t/temporal/common/log/zap_logger.go:136\ngo.temporal.io/server/common/log.CapturePanic\n\t/temporal/common/log/panic.go:51\nruntime.gopanic\n\t/usr/local/go/src/runtime/panic.go:965\nruntime.panicdottypeE\n\t/usr/local/go/src/runtime/iface.go:261\ngo.temporal.io/server/common.FromConfigToDefaultRetrySettings\n\t/temporal/common/util.go:498\ngo.temporal.io/server/service/history.(*commandAttrValidator).validateActivityRetryPolicy\n\t/temporal/service/history/commandChecker.go:640\ngo.temporal.io/server/service/history.(*commandAttrValidator).validateActivityScheduleAttributes\n\t/temporal/service/history/commandChecker.go:193\ngo.temporal.io/server/service/history.(*workflowTaskHandlerImpl).handleCommandScheduleActivity.func1\n\t/temporal/service/history/workflowTaskHandler.go:202\ngo.temporal.io/server/service/history.(*workflowTaskHandlerImpl).validateCommandAttr\n\t/temporal/service/history/workflowTaskHandler.go:921\ngo.temporal.io/server/service/history.(*workflowTaskHandlerImpl).handleCommandScheduleActivity\n\t/temporal/service/history/workflowTaskHandler.go:200\ngo.temporal.io/server/service/history.(*workflowTaskHandlerImpl).handleCommand\n\t/temporal/service/history/workflowTaskHandler.go:137\ngo.temporal.io/server/service/history.(*workflowTaskHandlerImpl).handleCommands\n\t/temporal/service/history/workflowTaskHandler.go:125\ngo.temporal.io/server/service/history.(*workflowTaskHandlerCallbacksImpl).handleWorkflowTaskCompleted\n\t/temporal/service/history/workflowTaskHandlerCallbacks.go:419\ngo.temporal.io/server/service/history.(*historyEngineImpl).RespondWorkflowTaskCompleted\n\t/temporal/service/history/historyEngine.go:1355\ngo.temporal.io/server/service/history.(*Handler).RespondWorkflowTaskCompleted\n\t/temporal/service/history/handler.go:486\ngo.temporal.io/server/api/historyservice/v1._HistoryService_RespondWorkflowTaskCompleted_Handler.func1\n\t/temporal/api/historyservice/v1/service.pb.go:969\ngo.temporal.io/server/common/rpc/interceptor.(*RateLimitInterceptor).Intercept\n\t/temporal/common/rpc/interceptor/rate_limit.go:79\ngoogle.golang.org/grpc.getChainUnaryHandler.func1\n\t/go/pkg/mod/google.golang.org/grpc@v1.36.0/server.go:1058\ngo.temporal.io/server/common/rpc/interceptor.(*TelemetryInterceptor).Intercept\n\t/temporal/common/rpc/interceptor/telemetry.go:95\ngoogle.golang.org/grpc.getChainUnaryHandler.func1\n\t/go/pkg/mod/google.golang.org/grpc@v1.36.0/server.go:1058\ngo.temporal.io/server/common/rpc.ServiceErrorInterceptor\n\t/temporal/common/rpc/grpc.go:100\ngoogle.golang.org/grpc.chainUnaryServerInterceptors.func1\n\t/go/pkg/mod/google.golang.org/grpc@v1.36.0/server.go:1044\ngo.temporal.io/server/api/historyservice/v1._HistoryService_RespondWorkflowTaskCompleted_Handler\n\t/temporal/api/historyservice/v1/service.pb.go:971\ngoogle.golang.org/grpc.(*Server).processUnaryRPC\n\t/go/pkg/mod/google.golang.org/grpc@v1.36.0/server.go:1217\ngoogle.golang.org/grpc.(*Server).handleStream\n\t/go/pkg/mod/google.golang.org/grpc@v1.36.0/server.go:1540\ngoogle.golang.org/grpc.(*Server).serveStreams.func1.2\n\t/go/pkg/mod/google.golang.org/grpc@v1.36.0/server.go:878"}

Is this an error or am I doing something wrong?

Steps to reproduce (the same thing happens in the sample apps, so I’m posting these steps regarding to the sample app):

  1. I run the sample app for the php-sdk: samples-php/README.md at master · temporalio/samples-php · GitHub
  2. everything works as intended while doing the same as in the readme
  3. I run tctl admin cluster add_search_attr --search_attr_key EventBusEventId --search_attr_type String
root@6eae32e00875:/var/app# tctl admin cluster add_search_attr --search_attr_key EventBusEventId --search_attr_type String
Are you trying to add key [EventBusEventId] with Type [String]? Y/N Y

Success
  1. the command ends with success, when I check the ES index pattern, the new attribute is there
  2. after running the sample app again, I’m getting this error:
root@6eae32e00875:/var/app# php app.php search-attributes
Starting GreetingWorkflow... 
Started: WorkflowID=2155cd60-bc85-45b0-a758-1d5b6e7e7d4f, RunID=3026a3cf-7e6d-46d4-a396-6d55c07c1e93

In WorkflowStub.php line 346:
                                                                                                                                                  
  workflowId='2155cd60-bc85-45b0-a758-1d5b6e7e7d4f', runId='3026a3cf-7e6d-46d4-a396-6d55c07c1e93', workflowType='SearchAttributes', retryState=3  
                                                                                                                                                  

In WorkflowStub.php line 351:
                                                     
  timeoutWorkflowType='TIMEOUT_TYPE_START_TO_CLOSE'  
                                                     

search-attributes

I haven’t even tried to use the new attribute. When browsing the temporal logs, I found the one posted above the reproduction steps.

Thanks for help! :slight_smile:

I think this has nothing to do with search attributes but related to the recent bug fix: Introduce Number util for dynamic config by wxing1292 · Pull Request #1464 · temporalio/temporal · GitHub. As a temporary fix, can you change your default retry policies in dynamic config to be float64, not int. I.e. if you have coefficients defined as 1 change them to 1.0 or 1.1.

thank you very much, that helped! :slight_smile: