Inconsistent workflow behavior under lack of resources on highload

Sometimes where there’s a lot of RPS like 10 000 workflow starts to repeat already completed steps, which in turn lead to activity failures, which start to retry until retry attempts exceed and workflow fails.

Most importantly this doesn’t happen locally, only on our load testing stand when we are testing our server with 10 000 RPS. Can you give any hints what to look for? Can’t share the code as I’m not allowed to do so:(

workflow starts to repeat already completed steps

Can you elaborate on this, are these activity retries? Could you share your workflow history for a workflow where you think this is happening?
Any service / db error logs you are getting that could share?

It’s all ok, the code was just bad(