What does memory utilization look like on your history pods? Shards are distributed across your history pods, and you typically don't want more than 1K shards per history host, so I would increase the number of history replicas if possible.
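As a rough way to apply that rule of thumb, here is a minimal sketch that estimates how many history replicas you would need to stay at or under 1K shards per host. The function names are made up for illustration, the cap is just the guideline mentioned above, and it assumes shards end up spread roughly evenly across hosts, which may not hold exactly in practice.

```python
import math


def min_history_replicas(num_shards: int, max_shards_per_host: int = 1000) -> int:
    """Smallest replica count that keeps each host at or below the cap.

    Assumes shards are spread evenly across history hosts (an approximation).
    """
    if num_shards <= 0 or max_shards_per_host <= 0:
        raise ValueError("num_shards and max_shards_per_host must be positive")
    return math.ceil(num_shards / max_shards_per_host)


def shards_per_host(num_shards: int, replicas: int) -> int:
    """Worst-case shards on a single host under an even distribution."""
    if num_shards <= 0 or replicas <= 0:
        raise ValueError("num_shards and replicas must be positive")
    return math.ceil(num_shards / replicas)


if __name__ == "__main__":
    shards = 4096  # example value only; use the shard count your cluster was created with
    replicas = min_history_replicas(shards)
    print(f"{shards} shards -> at least {replicas} history replicas "
          f"(about {shards_per_host(shards, replicas)} shards per host)")
```

Treat the result as a lower bound. If memory on the history pods is still tight at that replica count, adding more replicas lowers the shard count per host further.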
For server metrics to look into, I would start with the info in this forum post. Note that you also need to look at SDK metrics (see the forum post here for more info), because performance tuning involves both your server and the workers that you deploy to run your code.
Hope this gets you started in the right direction.