Recurring Context Deadline Exceeded error when executing a workflow

Hello!

I’ve been regularly encountering a context deadline exceeded error when calling the ExecuteWorkflow method. I pass it a context wrapped with context.WithTimeout and a 1-minute timeout, yet in every case the error occurs roughly 10 seconds after the call is made, well before the 1-minute deadline.

Additionally, the temporal-history service logs the context deadline exceeded error along with the following stack trace:

go.temporal.io/server/common/log.(*zapLogger).Error
  /home/builder/temporal/common/log/zap_logger.go:144
go.temporal.io/server/service/history/workflow.createWorkflowExecution
  /home/builder/temporal/service/history/workflow/transaction_impl.go:346
go.temporal.io/server/service/history/workflow.(*ContextImpl).CreateWorkflowExecution
  /home/builder/temporal/service/history/workflow/context.go:349
go.temporal.io/server/service/history/api/startworkflow.Invoke
  /home/builder/temporal/service/history/api/startworkflow/api.go:94
go.temporal.io/server/service/history.(*historyEngineImpl).StartWorkflowExecution
  /home/builder/temporal/service/history/historyEngine.go:424
go.temporal.io/server/service/history.(*Handler).StartWorkflowExecution
  /home/builder/temporal/service/history/handler.go:529
go.temporal.io/server/api/historyservice/v1._HistoryService_StartWorkflowExecution_Handler.func1
  /home/builder/temporal/api/historyservice/v1/service.pb.go:1046
go.temporal.io/server/common/rpc/interceptor.(*RetryableInterceptor).Intercept.func1
  /home/builder/temporal/common/rpc/interceptor/retry.go:63
go.temporal.io/server/common/backoff.ThrottleRetryContext
  /home/builder/temporal/common/backoff/retry.go:194
go.temporal.io/server/common/rpc/interceptor.(*RetryableInterceptor).Intercept
  /home/builder/temporal/common/rpc/interceptor/retry.go:67
google.golang.org/grpc.chainUnaryInterceptors.func1.1
  /go/pkg/mod/google.golang.org/grpc@v1.50.1/server.go:1162
go.temporal.io/server/common/rpc/interceptor.(*RateLimitInterceptor).Intercept
  /home/builder/temporal/common/rpc/interceptor/rate_limit.go:86
google.golang.org/grpc.chainUnaryInterceptors.func1.1
  /go/pkg/mod/google.golang.org/grpc@v1.50.1/server.go:1165
go.temporal.io/server/common/rpc/interceptor.(*TelemetryInterceptor).Intercept
  /home/builder/temporal/common/rpc/interceptor/telemetry.go:142
google.golang.org/grpc.chainUnaryInterceptors.func1.1
  /go/pkg/mod/google.golang.org/grpc@v1.50.1/server.go:1165
go.temporal.io/server/common/metrics.NewServerMetricsTrailerPropagatorInterceptor.func1
  /home/builder/temporal/common/metrics/grpc.go:113
google.golang.org/grpc.chainUnaryInterceptors.func1.1
  /go/pkg/mod/google.golang.org/grpc@v1.50.1/server.go:1165
go.temporal.io/server/common/metrics.NewServerMetricsContextInjectorInterceptor.func1
  /home/builder/temporal/common/metrics/grpc.go:66
google.golang.org/grpc.chainUnaryInterceptors.func1.1
  /go/pkg/mod/google.golang.org/grpc@v1.50.1/server.go:1165
go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc.UnaryServerInterceptor.func1
  /go/pkg/mod/go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc@v0.36.1/interceptor.go:352
google.golang.org/grpc.chainUnaryInterceptors.func1.1
  /go/pkg/mod/google.golang.org/grpc@v1.50.1/server.go:1165
go.temporal.io/server/common/rpc.ServiceErrorInterceptor
  /home/builder/temporal/common/rpc/grpc.go:137
google.golang.org/grpc.chainUnaryInterceptors.func1.1
  /go/pkg/mod/google.golang.org/grpc@v1.50.1/server.go:1165
google.golang.org/grpc.chainUnaryInterceptors.func1
  /go/pkg/mod/google.golang.org/grpc@v1.50.1/server.go:1167
go.temporal.io/server/api/historyservice/v1._HistoryService_StartWorkflowExecution_Handler
  /home/builder/temporal/api/historyservice/v1/service.pb.go:1048
google.golang.org/grpc.(*Server).processUnaryRPC
  /go/pkg/mod/google.golang.org/grpc@v1.50.1/server.go:1340
google.golang.org/grpc.(*Server).handleStream
  /go/pkg/mod/google.golang.org/grpc@v1.50.1/server.go:1713
google.golang.org/grpc.(*Server).serveStreams.func1.2
  /go/pkg/mod/google.golang.org/grpc@v1.50.1/server.go:965

The temporal-worker’s stack trace is as follows:

go.temporal.io/server/common/log.(*zapLogger).Error
	/home/builder/temporal/common/log/zap_logger.go:144
go.temporal.io/server/service/worker/scanner/history.(*Scavenger).handleTask
	/home/builder/temporal/service/worker/scanner/history/scavenger.go:284
go.temporal.io/server/service/worker/scanner/history.(*Scavenger).taskWorker
	/home/builder/temporal/service/worker/scanner/history/scavenger.go:214

No errors appeared in the temporal-frontend logs around the time the error occurred.

A workflow execution is started roughly like this:

opts := client.StartWorkflowOptions{
	ID:                  uuid.NewString(),
	WorkflowTaskTimeout: time.Second * 30,
	TaskQueue:           "tx_retry_task_queue",
}

// The caller's context gets a 1-minute timeout, yet the call below fails
// with "context deadline exceeded" roughly 10 seconds in.
ctxWithTimeout, cancel := context.WithTimeout(ctx, time.Minute)
defer cancel()

// rs.temporal is a go.temporal.io/sdk/client.Client; RetryTransactionWorkflowV2Name
// is the workflow type name and tx is the workflow argument.
_, err := rs.temporal.ExecuteWorkflow(ctxWithTimeout, opts, temporal.RetryTransactionWorkflowV2Name, tx)
if err != nil {
	// error handling
}

I’m running all of this in a Kubernetes environment. Could this issue be related to networking problems? Is it possible that the 10-second timeout comes from a gRPC maximum timeout setting? And would it be possible to work around the issue by specifying a RetryPolicy with a few retries when executing the workflow, as sketched below?
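To be concrete, by a RetryPolicy I mean attaching one to the start options, roughly like this (just a sketch: the interval and attempt values are arbitrary, and sdktemporal is go.temporal.io/sdk/temporal aliased to avoid clashing with my own temporal package):

opts := client.StartWorkflowOptions{
	ID:                  uuid.NewString(),
	WorkflowTaskTimeout: time.Second * 30,
	TaskQueue:           "tx_retry_task_queue",
	// As far as I understand, this configures retries of the workflow
	// execution itself rather than of the StartWorkflowExecution call.
	RetryPolicy: &sdktemporal.RetryPolicy{
		InitialInterval:    time.Second,
		BackoffCoefficient: 2.0,
		MaximumInterval:    time.Second * 10,
		MaximumAttempts:    3,
	},
}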

Temporal Server Version: v1.19.0
Go Temporal SDK Version: v1.21.1

Will a workflow execution be retried after a context deadline exceeded error, provided that I specify a RetryPolicy? Or do I need custom retry logic for that?

UPD: custom retrying didn’t help.
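By custom retrying I mean a simple client-side loop around ExecuteWorkflow, roughly along these lines (simplified; it reuses opts, rs.temporal, and tx from the snippet above, and the attempt count and backoff are arbitrary):

for attempt := 0; attempt < 3; attempt++ {
	// Fresh timeout per attempt.
	ctxWithTimeout, cancel := context.WithTimeout(ctx, time.Minute)
	_, err := rs.temporal.ExecuteWorkflow(ctxWithTimeout, opts, temporal.RetryTransactionWorkflowV2Name, tx)
	cancel()
	if err == nil {
		break
	}
	// Every attempt still failed with "context deadline exceeded" after ~10 seconds.
	time.Sleep(time.Second * 2)
}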

I am experiencing a very similar issue and would love to hear thoughts on how I could debug it further.

Hi, did any of you find a solution to this? I’ve had the same issue with the Java SDK while running in a k8s environment. I found many posts about load-balancer idle timeouts and keep-alive time in the SDK, but I still could not get rid of this error.
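For comparison with the Go snippets above, the keep-alive tuning I mean corresponds to gRPC keep-alive parameters; on the Go SDK side they can be passed through the client’s dial options, roughly like this (a sketch with placeholder host and durations; it needs google.golang.org/grpc and google.golang.org/grpc/keepalive in addition to go.temporal.io/sdk/client):

c, err := client.Dial(client.Options{
	HostPort: "temporal-frontend:7233", // placeholder address
	ConnectionOptions: client.ConnectionOptions{
		DialOptions: []grpc.DialOption{
			grpc.WithKeepaliveParams(keepalive.ClientParameters{
				Time:                30 * time.Second, // ping the server after this much idle time
				Timeout:             10 * time.Second, // wait this long for the ping ack
				PermitWithoutStream: true,
			}),
		},
	},
})
if err != nil {
	// handle dial error
}
defer c.Close()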