Errors when deleting executions visibility

fivos · December 13, 2023, 1:52pm

We have Temporal deployed using version 1.20.3 of the Helm chart. We are using MySQL 8 as our DB, on AWS Aurora. The cluster is set up to use 128 shards.

The Temporal cluster has been up and running for a few months, mostly with no issues. A couple of times we’ve run into an issue where deleting execution visibility fails. When this happens, we get flooded with error logs from the temporal-history service. Other than these errors and older workflow executions failing to be deleted, the cluster seems to be working fine and workflows continue to be executed as expected.

A few more details about these errors.

From what I can gather, we seem to be hitting some issue at the database level. We see a bunch of START TRANSACTION statements followed by ROLLBACK statements.

Any insights into what might be causing this and how we can resolve it? The only way we’ve been able to resolve it so far is to completely reset the cluster. We’ve run into this 2-3 times, and each time it has been unclear why it showed up, since the cluster had been running with no issues for a while and with no change in load.
