Maru Spike Test Corrupts AWS RDS PostgreSQL DB (Context Deadline Exceeded)

Cameron_Ward · July 22, 2022, 9:22am

Hi all, I may have found a bug in Temporal when running heavy workloads.
Here’s a link to the report I filed: https://github.com/temporalio/temporal/issues/3131

Has anyone else had issues where running heavy workloads corrupts the database? After the Maru test fails, I can no longer run any workflows on Temporal until I destroy and re-deploy the database.

Thanks,
Cameron

Rob_Temporal · July 22, 2022, 12:17pm

Just to note for the team: I’m looking into this.

Cameron_Ward · July 26, 2022, 2:10pm

I can confirm that removing the Fargate profile and the CNI plugin, and using On-Demand as the capacity type for the nodes, resolved the issue on AWS EKS. Previously, Fargate seemed to be tainting the pods and placing them all on one node, which was running out of resources. I also had to adjust the security profile of the node group to allow all traffic between the nodes. Thanks @Rob_Temporal for all the help.
