My teammate tried to start 100K workflows in a for loop, and all of the workflows became zombies, as in the picture: they neither proceeded nor timed out. Checking the server, we found errors like the one below. Any ideas what's going on, and how can we make sure high load won't leave workflows in a zombie state?
Zombie:
Error:
insertId: "9wjul5iv6bvxr2vjg"
jsonPayload: {
error: "context deadline exceeded"
level: "error"
logging-call-at: "workflowHandler.go:3383"
msg: "Unknown error"
service: "frontend"
stacktrace: "github.com/temporalio/temporal/common/log/loggerimpl.(*loggerImpl).Error
/temporal/common/log/loggerimpl/logger.go:138
github.com/temporalio/temporal/service/frontend.(*WorkflowHandler).error
/temporal/service/frontend/workflowHandler.go:3383
github.com/temporalio/temporal/service/frontend.(*WorkflowHandler).StartWorkflowExecution
/temporal/service/frontend/workflowHandler.go:494
github.com/temporalio/temporal/service/frontend.(*DCRedirectionHandlerImpl).StartWorkflowExecution.func2
/temporal/service/frontend/dcRedirectionHandler.go:1114
github.com/temporalio/temporal/service/frontend.(*NoopRedirectionPolicy).WithNamespaceRedirect
/temporal/service/frontend/dcRedirectionPolicy.go:116
github.com/temporalio/temporal/service/frontend.(*DCRedirectionHandlerImpl).StartWorkflowExecution
/temporal/service/frontend/dcRedirectionHandler.go:1110
github.com/temporalio/temporal/service/frontend.(*AccessControlledWorkflowHandler).StartWorkflowExecution
/temporal/service/frontend/accessControlledHandler.go:702
github.com/temporalio/temporal/service/frontend.(*WorkflowNilCheckHandler).StartWorkflowExecution
/temporal/service/frontend/workflowNilCheckHandler.go:112
go.temporal.io/temporal-proto/workflowservice._WorkflowService_StartWorkflowExecution_Handler.func1
/go/pkg/mod/go.temporal.io/temporal-proto@v0.23.1/workflowservice/service.pb.go:1015
github.com/temporalio/temporal/service/frontend.interceptor
/temporal/service/frontend/service.go:316
go.temporal.io/temporal-proto/workflowservice._WorkflowService_StartWorkflowExecution_Handler
/go/pkg/mod/go.temporal.io/temporal-proto@v0.23.1/workflowservice/service.pb.go:1017
google.golang.org/grpc.(*Server).processUnaryRPC
/go/pkg/mod/google.golang.org/grpc@v1.29.1/server.go:1082
google.golang.org/grpc.(*Server).handleStream
/go/pkg/mod/google.golang.org/grpc@v1.29.1/server.go:1405
google.golang.org/grpc.(*Server).serveStreams.func1.1
/go/pkg/mod/google.golang.org/grpc@v1.29.1/server.go:746"
ts: "2020-07-09T01:14:14.567Z"