I'm new here. In one area of our business, a single table has grown past 5 million rows, and queries that filter on certain conditions time out. Is it possible to shard the database when there are too many task results? Sorry for my bad English.
If you are talking about the numHistoryShards config property, then no, it's not a dynamic value and cannot be changed once set for a cluster.
Finding the right value of this property for your use case and expected load would require load testing and experimentation. Here is a good writeup that could help, and here are some base recommendations.
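To illustrate why the shard count is fixed, here is a minimal sketch of hash-based shard routing. It assumes a workflow is mapped to a shard by hashing its namespace and workflow ID and taking the result modulo the shard count. The function names, the SHA-256 hash, and the key format are illustrative assumptions, not Temporal's actual implementation.

```python
import hashlib


def shard_for(namespace_id: str, workflow_id: str, num_shards: int) -> int:
    """Map a workflow to a shard in 1..num_shards.

    Assumption: SHA-256 and the "namespace_workflow" key format are stand-ins
    chosen for illustration; the real server uses its own hash function.
    """
    key = f"{namespace_id}_{workflow_id}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:8], "big") % num_shards + 1


def moved_fraction(workflow_ids, old_shards, new_shards, namespace_id="default"):
    """Fraction of workflows whose shard changes when the shard count changes."""
    moved = sum(
        1
        for wid in workflow_ids
        if shard_for(namespace_id, wid, old_shards)
        != shard_for(namespace_id, wid, new_shards)
    )
    return moved / len(workflow_ids)


if __name__ == "__main__":
    ids = [f"order-{i}" for i in range(10_000)]  # hypothetical workflow IDs
    # A large share of workflows would land on a different shard after the change.
    print(moved_fraction(ids, 512, 1024))
```

Under this scheme, changing the shard count changes where existing workflows are expected to live, so data already written under the old count would no longer be found where the new mapping points.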
You have to use Elasticsearch for list queries.
Sorry, maybe I didn't express it correctly. I do use Elasticsearch for list queries. But once the number of workflows exceeds one million, queries that filter on certain conditions time out. With that many workflows, lookups easily time out. I think the results could be split into multiple tables by hash to reduce query pressure when there is too much workflow data.
Would you explain what exactly you are doing when a "timeout occurs when filtering data for a certain query condition"?
Just searching for data, but the results come back very slowly.
I plan to wrap Temporal, adapt it to some of our services, and then open it up to users. User operations are very frequent, which will generate a large number of workflows. If the execution results of all those workflows are stored in a single table, the table will grow too large and put pressure on queries. So I want to know whether it is possible to shard the database when there are too many task results.
5 million workflows is not a large number for Temporal.
Do you use the Temporal API/CLI/UI to search for the data? I'm not aware of slow performance when a well-configured Elasticsearch cluster is used for searching.