@tihomir Here is some additional information that may be helpful.
Based on this article: How To Identify And Tune Worker Bottlenecks - #3 by sriramg
and this one: Performance bottlenecks troubleshooting guide | Temporal Platform Documentation
we raised max_concurrent_workflow_tasks from 100 to 500. The same behavior still occurs with the limit at 500.
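For clarity, the change amounts to roughly the following. This is a simplified sketch rather than the exact worker code: it assumes the Python SDK (`temporalio`), `worker_options` is a hypothetical helper named only for illustration, and the activity limit of 100 is an assumption taken from the value shown in the graph.

```python
# Sketch only: worker_options is a hypothetical helper, not part of the SDK.
def worker_options(max_workflow_tasks: int = 500, max_activities: int = 100) -> dict:
    """Build the concurrency keyword arguments passed to temporalio.worker.Worker."""
    return {
        # Raised from 100 to 500, as described above.
        "max_concurrent_workflow_tasks": max_workflow_tasks,
        # Assumed value, inferred from the activity graph.
        "max_concurrent_activities": max_activities,
    }


opts = worker_options()

# Assumed usage (needs a connected client, so it is left commented out;
# the task queue, workflow, and activity names are placeholders):
# worker = Worker(client, task_queue="my-task-queue",
#                 workflows=[MyWorkflow], activities=[my_activity], **opts)
```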
In the last graph, we can see that as the workflow workers drop to zero, the activity workers also dip slightly (from 100 to 90).
We verified that CPU and memory usage on the affected pod looked healthy at the time of the depletion, so the pod itself did not appear to be resource-constrained.
Do you have any recommendations? Thanks in advance!


