Hello!
I’ve been regularly encountering a context deadline exceeded error while calling the ExecuteWorkflow method. I call it with a custom context.WithTimeout where the timeout is set to 1 minute, yet the error consistently occurs 10 seconds after making the call. Additionally, the temporal-history service logs the context deadline exceeded error along with the following stack trace:
go.temporal.io/server/common/log.(*zapLogger).Error
/home/builder/temporal/common/log/zap_logger.go:144
go.temporal.io/server/service/history/workflow.createWorkflowExecution
/home/builder/temporal/service/history/workflow/transaction_impl.go:346
go.temporal.io/server/service/history/workflow.(*ContextImpl).CreateWorkflowExecution
/home/builder/temporal/service/history/workflow/context.go:349
go.temporal.io/server/service/history/api/startworkflow.Invoke
/home/builder/temporal/service/history/api/startworkflow/api.go:94
go.temporal.io/server/service/history.(*historyEngineImpl).StartWorkflowExecution
/home/builder/temporal/service/history/historyEngine.go:424
go.temporal.io/server/service/history.(*Handler).StartWorkflowExecution
/home/builder/temporal/service/history/handler.go:529
go.temporal.io/server/api/historyservice/v1._HistoryService_StartWorkflowExecution_Handler.func1
/home/builder/temporal/api/historyservice/v1/service.pb.go:1046
go.temporal.io/server/common/rpc/interceptor.(*RetryableInterceptor).Intercept.func1
/home/builder/temporal/common/rpc/interceptor/retry.go:63
go.temporal.io/server/common/backoff.ThrottleRetryContext
/home/builder/temporal/common/backoff/retry.go:194
go.temporal.io/server/common/rpc/interceptor.(*RetryableInterceptor).Intercept
/home/builder/temporal/common/rpc/interceptor/retry.go:67
google.golang.org/grpc.chainUnaryInterceptors.func1.1
/go/pkg/mod/google.golang.org/grpc@v1.50.1/server.go:1162
go.temporal.io/server/common/rpc/interceptor.(*RateLimitInterceptor).Intercept
/home/builder/temporal/common/rpc/interceptor/rate_limit.go:86
google.golang.org/grpc.chainUnaryInterceptors.func1.1
/go/pkg/mod/google.golang.org/grpc@v1.50.1/server.go:1165
go.temporal.io/server/common/rpc/interceptor.(*TelemetryInterceptor).Intercept
/home/builder/temporal/common/rpc/interceptor/telemetry.go:142
google.golang.org/grpc.chainUnaryInterceptors.func1.1
/go/pkg/mod/google.golang.org/grpc@v1.50.1/server.go:1165
go.temporal.io/server/common/metrics.NewServerMetricsTrailerPropagatorInterceptor.func1
/home/builder/temporal/common/metrics/grpc.go:113
google.golang.org/grpc.chainUnaryInterceptors.func1.1
/go/pkg/mod/google.golang.org/grpc@v1.50.1/server.go:1165
go.temporal.io/server/common/metrics.NewServerMetricsContextInjectorInterceptor.func1
/home/builder/temporal/common/metrics/grpc.go:66
google.golang.org/grpc.chainUnaryInterceptors.func1.1
/go/pkg/mod/google.golang.org/grpc@v1.50.1/server.go:1165
go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc.UnaryServerInterceptor.func1
/go/pkg/mod/go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc@v0.36.1/interceptor.go:352
google.golang.org/grpc.chainUnaryInterceptors.func1.1
/go/pkg/mod/google.golang.org/grpc@v1.50.1/server.go:1165
go.temporal.io/server/common/rpc.ServiceErrorInterceptor
/home/builder/temporal/common/rpc/grpc.go:137
google.golang.org/grpc.chainUnaryInterceptors.func1.1
/go/pkg/mod/google.golang.org/grpc@v1.50.1/server.go:1165
google.golang.org/grpc.chainUnaryInterceptors.func1
/go/pkg/mod/google.golang.org/grpc@v1.50.1/server.go:1167
go.temporal.io/server/api/historyservice/v1._HistoryService_StartWorkflowExecution_Handler
/home/builder/temporal/api/historyservice/v1/service.pb.go:1048
google.golang.org/grpc.(*Server).processUnaryRPC
/go/pkg/mod/google.golang.org/grpc@v1.50.1/server.go:1340
google.golang.org/grpc.(*Server).handleStream
/go/pkg/mod/google.golang.org/grpc@v1.50.1/server.go:1713
google.golang.org/grpc.(*Server).serveStreams.func1.2
/go/pkg/mod/google.golang.org/grpc@v1.50.1/server.go:965
And the temporal-worker’s stack trace is the following:
go.temporal.io/server/common/log.(*zapLogger).Error
/home/builder/temporal/common/log/zap_logger.go:144
go.temporal.io/server/service/worker/scanner/history.(*Scavenger).handleTask
/home/builder/temporal/service/worker/scanner/history/scavenger.go:284
go.temporal.io/server/service/worker/scanner/history.(*Scavenger).taskWorker
/home/builder/temporal/service/worker/scanner/history/scavenger.go:214
No errors were found in the temporal-frontend service during the time when the error occurred.
The execution of a workflow roughly happens this way:
opts := client.StartWorkflowOptions{
	ID:                  uuid.NewString(),
	WorkflowTaskTimeout: time.Second * 30,
	TaskQueue:           "tx_retry_task_queue",
}

ctxWithTimeout, cancel := context.WithTimeout(ctx, time.Minute)
defer cancel()

_, err := rs.temporal.ExecuteWorkflow(ctxWithTimeout, opts, temporal.RetryTransactionWorkflowV2Name, tx)
if err != nil {
	// error handling
}
I’m running all of this within a Kubernetes environment. Could this issue be related to networking problems? Is it possible that the 10-second timeout is derived from a gRPC maximum-timeout setting? And would it be possible to work around the issue by using a RetryPolicy with some retries when executing the workflow?
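For clarity, here is a rough sketch of the kind of RetryPolicy I have in mind, attached to the same start options as above. The SDK’s go.temporal.io/sdk/temporal package is aliased to sdktemporal because my own package is also named temporal; the workflowName and input parameters just stand in for temporal.RetryTransactionWorkflowV2Name and tx from my snippet, and the intervals and attempt count are made up for illustration:

import (
	"context"
	"time"

	"github.com/google/uuid"
	"go.temporal.io/sdk/client"
	sdktemporal "go.temporal.io/sdk/temporal" // aliased, since my own package is also called "temporal"
)

// startWithRetryPolicy only illustrates the question above: the same start
// options as in my snippet, plus a RetryPolicy. As far as I understand, this
// policy retries the workflow execution itself after it has started; I am not
// sure whether it would help with the start call timing out, which is exactly
// what I would like to confirm.
func startWithRetryPolicy(ctx context.Context, c client.Client, workflowName string, input interface{}) error {
	opts := client.StartWorkflowOptions{
		ID:                  uuid.NewString(),
		TaskQueue:           "tx_retry_task_queue",
		WorkflowTaskTimeout: time.Second * 30,
		RetryPolicy: &sdktemporal.RetryPolicy{
			// Illustrative values only.
			InitialInterval:    time.Second,
			BackoffCoefficient: 2.0,
			MaximumInterval:    time.Minute,
			MaximumAttempts:    5,
		},
	}

	ctxWithTimeout, cancel := context.WithTimeout(ctx, time.Minute)
	defer cancel()

	_, err := c.ExecuteWorkflow(ctxWithTimeout, opts, workflowName, input)
	return err
}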