Does overall history size affect performance?

Hi friends!

  1. How much does the overall history size affect performance? We plan to store 50M+ completed processes so we can search and sort through them. Each of these processes contains no more than 100-200 history events. For context: each process represents a user action (step) in our bot. This history must be kept because various other logic depends on it, for example: send a webhook if there is a previously completed process with a given search parameter and a user ID in the process name. The history is also needed to sort a view: show all users with certain completed processes (steps) plus a certain search parameter. One option is to duplicate this information in an external database, requiring each process to set the appropriate flags there, and so on. But if the overall history size does not significantly slow down the search API, perhaps an external database is redundant?
  2. Cleanup policies for completed processes: can they be configured so that some processes are stored for a year while others are stored for a week?
  3. Has support for ScyllaDB ever materialized?

Thank you so much for your work! You have a great product!

Hi @Alexander

  1. I believe you should be using Search Attributes for these types of requirements rather than querying workflow history.
  2. Temporal allows you to set a retention period per namespace for completed workflows. The default retention period is 2 days; the minimum is 1 day and the maximum is 30 days. When a closed workflow hits the retention period, both its history and visibility data are removed. If you need history and visibility data for closed workflows beyond the 30-day maximum, you would need to look into setting up Archival.
  3. There is no official support, but there are users using it, see for example this post.
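To make the Search Attributes suggestion above concrete, here is a minimal sketch using the Temporal CLI. The attribute names (`UserId`, `SearchParam`) and the namespace name are illustrative assumptions for this use case, not attributes that exist out of the box:

```shell
# Register custom Search Attributes on the namespace
# (UserId and SearchParam are hypothetical names for this example)
temporal operator search-attribute create \
  --namespace my-namespace \
  --name UserId --type Keyword

temporal operator search-attribute create \
  --namespace my-namespace \
  --name SearchParam --type Keyword

# Query visibility data directly instead of scanning workflow histories,
# e.g. "all completed steps for this user with this search parameter"
temporal workflow list \
  --namespace my-namespace \
  --query "UserId = 'user-123' AND SearchParam = 'step-42' AND ExecutionStatus = 'Completed'"
```

Queries like this hit the visibility store rather than history shards, which is why they scale to large numbers of closed workflows without the external database mentioned in the question.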
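Since retention is a per-namespace setting, one way to approximate "a week for some processes, longer for others" is to route workflows into separate namespaces with different retention periods. A sketch with the Temporal CLI (namespace names are illustrative; exact flags may vary by CLI version):

```shell
# Short-lived processes: retained for ~1 week after completion
temporal operator namespace create --retention 168h short-lived-ns

# Long-lived processes: retained for 30 days (the maximum);
# anything beyond that requires Archival
temporal operator namespace create --retention 720h long-lived-ns
```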