Missing indexes on Postgresql schemas?

I’m testing a helm deployment using a postgresql instance as persistance (schema I used: temporal/schema.sql at master · temporalio/temporal · GitHub).

Doing some analysis on stress testing results, I came across heavy load over the table history_node.
The queries with lots of executions and heavy IO were:

  • INSERT INTO history_node (shard_id, tree_id, branch_id, node_id, prev_txn_id, txn_id, data, data_encoding) VALUES (?) ON CONFLICT (shard_id, tree_id, branch_id, node_id, txn_id) DO UPDATE SET prev_txn_id=$9, data=$10, data_encoding=$11
  • DELETE FROM history_node WHERE shard_id = $1 AND tree_id = $2 AND branch_id = $3 AND node_id >= $4
  • SELECT node_id, prev_txn_id, txn_id, data, data_encoding FROM history_node WHERE shard_id = $1 AND tree_id = $2 AND branch_id = $3 AND ((node_id = $4 AND txn_id > $5) OR node_id > $6) AND node_id < $7 ORDER BY shard_id, tree_id, branch_id, node_id, txn_id LIMIT $8

From my simple logic, creating an index with the corresponding values (shard_id, three_id, branch_id, node_id and txn_id) can help to get rid some of the inconvenience, or am I missing something?

Thanks!

Yes history_node table stores all of the event history meaning any workflow updates and events are going to be stored/updated in this table.

Regarding the index creation, would you mind opening a feature request in the server repo for this please?

Done! issue #3222

Thanks!

BTW, I have applied indexes, now those queries seems to run better than before. As far as I can see, no big impact on inserts.

Just to drop a final statement. All those queries do use the primary key already created. There is no need to create additional indexes. The tool I was using to analyze database performance was buggy.