Hi, we are running temporal.io on AWS RDS PostgreSQL 13.7, and we are hitting bottlenecks on the history_node table that drive our CPU usage up drastically.
If possible, can you give us some insight into solving this problem?
Below are some metrics collected during peak load.
In our case a single row in history_node can sometimes exceed 40 KB, so PostgreSQL stores the data out of line via TOAST, and that hurts in some scenarios.
Once a row grows past roughly 2 kB, PostgreSQL compresses the data and writes it to the TOAST table.
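For anyone wanting to check the same thing, something like the query below should show how much of history_node actually lives in its TOAST relation (just a sketch; it assumes the table is literally named history_node in your Temporal database):

```sql
-- Compare the total size of history_node with the size of its TOAST table
SELECT c.relname                               AS table_name,
       t.relname                               AS toast_table,
       pg_size_pretty(pg_table_size(c.oid))    AS total_size,   -- includes TOAST
       pg_size_pretty(pg_relation_size(t.oid)) AS toast_size
FROM pg_class c
JOIN pg_class t ON t.oid = c.reltoastrelid
WHERE c.relname = 'history_node';
```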
Cons:
PostgreSQL has to write the full page to the WAL (full-page writes after a checkpoint), even for chunks of tuples that are unrelated to the change.
As I understand it, if a page holds 2 chunks of id1 and 2 chunks of id2, and you update only id1, PostgreSQL writes id2's chunks to the WAL as well, and that generates too much IO:WALWrite and WALWriteLock contention.
The data can't be read without being decompressed, not even partially (I'm on PG 13, where that isn't possible).
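To see whether sessions are actually stuck on those WAL waits, a rough snapshot can be taken from pg_stat_activity (same idea as what RDS Performance Insights shows; this is just a sketch, run it while the load is high):

```sql
-- Count sessions currently waiting on WAL-related events
SELECT wait_event_type, wait_event, count(*) AS sessions
FROM pg_stat_activity
WHERE wait_event LIKE 'WAL%'
GROUP BY wait_event_type, wait_event
ORDER BY sessions DESC;
```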
Hey @natan.araujo, did you find any solution to this?
I'm seeing this issue as well. The pattern I notice between the problematic and non-problematic clusters in our setup is that, in the problematic clusters, the pg_statistic and pg_toast tables have a large number of dead tuples; the healthy clusters don't show this.
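In case it helps anyone compare clusters, dead tuples in pg_statistic and the TOAST tables can be checked with something like this (a sketch; adjust the LIMIT or filters as needed):

```sql
-- Dead-tuple counts for pg_statistic and all TOAST tables
SELECT schemaname, relname, n_live_tup, n_dead_tup, last_autovacuum
FROM pg_stat_all_tables
WHERE relname = 'pg_statistic'
   OR schemaname = 'pg_toast'
ORDER BY n_dead_tup DESC
LIMIT 20;
```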