I often see "maximum attempts exceeded to update history" in the frontend logs. Any idea what could be going wrong?

{"level":"error","ts":"2021-02-02T06:32:34.640Z","msg":"Unknown error","service":"frontend","error":"maximum attempts exceeded to update history","logging-call-at":"workflowHandler.go:3454","stacktrace":"go.temporal.io/server/common/log/loggerimpl.(*loggerImpl).Error\n\t/temporal/common/log/loggerimpl/logger.go:138\ngo.temporal.io/server/service/frontend.(*WorkflowHandler).error\n\t/temporal/service/frontend/workflowHandler.go:3454\ngo.temporal.io/server/service/frontend.(*WorkflowHandler).RespondActivityTaskCompleted\n\t/temporal/service/frontend/workflowHandler.go:1452\ngo.temporal.io/server/service/frontend.(*DCRedirectionHandlerImpl).RespondActivityTaskCompleted.func2\n\t/temporal/service/frontend/dcRedirectionHandler.go:816\ngo.temporal.io/server/service/frontend.(*NoopRedirectionPolicy).WithNamespaceIDRedirect\n\t/temporal/service/frontend/dcRedirectionPolicy.go:111\ngo.temporal.io/server/service/frontend.(*DCRedirectionHandlerImpl).RespondActivityTaskCompleted\n\t/temporal/service/frontend/dcRedirectionHandler.go:812\ngo.temporal.io/api/workflowservice/v1._WorkflowService_RespondActivityTaskCompleted_Handler.func1\n\t/go/pkg/mod/go.temporal.io/api@v1.4.0/workflowservice/v1/service.pb.go:1191\ngo.temporal.io/server/common/authorization.(*interceptor).Interceptor\n\t/temporal/common/authorization/interceptor.go:136\ngoogle.golang.org/grpc.getChainUnaryHandler.func1\n\t/go/pkg/mod/google.golang.org/grpc@v1.34.0/server.go:1051\ngo.temporal.io/server/common/rpc.ServiceErrorInterceptor\n\t/temporal/common/rpc/grpc.go:100\ngoogle.golang.org/grpc.chainUnaryServerInterceptors.func1\n\t/go/pkg/mod/google.golang.org/grpc@v1.34.0/server.go:1037\ngo.temporal.io/api/workflowservice/v1._WorkflowService_RespondActivityTaskCompleted_Handler\n\t/go/pkg/mod/go.temporal.io/api@v1.4.0/workflowservice/v1/service.pb.go:1193\ngoogle.golang.org/grpc.(*Server).processUnaryRPC\n\t/go/pkg/mod/google.golang.org/grpc@v1.34.0/server.go:1210\ngoogle.golang.org/grpc.(*Server).handleStream\n\t/go/pkg/mod/google.golang.org/grpc@v1.34.0/server.go:1533\ngoogle.golang.org/grpc.(*Server).serveStreams.func1.2\n\t/go/pkg/mod/google.golang.org/grpc@v1.34.0/server.go:871"}

That error typically means that either the Temporal history service or the underlying DB is busy. Please check DB usage.

@Wenquan_Xing I am also hitting this issue frequently in production, and many activities and workflows are failing.
I checked memory and CPU on the history replicas, and both are very low. I have 2000 shards and 40 history replicas.
The MySQL database CPU usage is under 10%.
The MySQL database has 16 vCPUs.

I'm not sure what else I should check.
Please help. Thank you so much.

Does anyone have any info on this? Our production environment is impacted right now by this error, and there appears to be no documentation on what to do to fix it.

@Wenquan_Xing ?

Hello @Gordon_Bean, I'd be interested to know whether you found the cause of your issue and how you solved it.

Can you check the history service logs?

The history service logs should contain more details.

Were you able to fix this problem, @madhu? I am also facing this issue but have not been able to find the root cause.