I understand the reason behind the retention and archiving of completed workflows, but can't the retention period be configurable based on user needs? If Temporal runs on a scalable data store like Cassandra, do users still need to constrain themselves to the 30-day archival max limit? Is this because there could be performance issues with Cassandra even with no cluster size restriction?
As of now, if we want to keep the data for as long as we need, is the only option to keep the workflow open? Can you please suggest the best way to do so?
Certain race conditions with archival might lead to data loss with retention over 30 days. We are planning work to fix the archival implementation, which in turn would allow unlimited (up to DB capacity) retention.
Can you please elaborate on how the race condition kicks in only when the archival period is more than 30 days? Please point me to the issue tracker if one is already available.
As of now, this forces us to use an additional DB to store domain state rather than relying only on the Temporal database, since we would lose the data in 30 days. Should we consider keeping the workflows open to retain the data?
I hope this restriction has now been removed (as I see in the issue) and that there is no longer a race condition when Archival is enabled and retention is > 30 days.
Just to add: yes, the restriction is lifted in OSS server version 1.17.4, meaning you can set namespace retention to an arbitrarily large value (in days). The default namespace retention (2 days) is still set on newly created namespaces, and you can update it via tctl, for example:
tctl --ns <ns_name> n u --rd X
Where X is in days. For archival, once you update the namespace retention, the new value applies to new executions; executions that have already completed keep the retention period that applied when they completed.
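To illustrate the last point, here is a minimal Python sketch of the behaviour as described in this thread. It is a toy model, not Temporal server code: the `Namespace` and `ClosedExecution` classes and the `close_execution` helper are hypothetical names made up for this example. It assumes the retention value is captured at the moment an execution closes, so a later namespace update does not change it.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class Namespace:
    # Hypothetical model of a namespace; 2 days is the default mentioned above.
    retention_days: int = 2


@dataclass
class ClosedExecution:
    # Hypothetical model of a completed execution.
    close_time: datetime
    retention_days: int  # snapshot of the namespace retention at close time

    @property
    def delete_after(self) -> datetime:
        # Time after which the execution is eligible for cleanup in this model.
        return self.close_time + timedelta(days=self.retention_days)


def close_execution(ns: Namespace, close_time: datetime) -> ClosedExecution:
    # Assumption: retention is copied from the namespace when the execution closes.
    return ClosedExecution(close_time=close_time, retention_days=ns.retention_days)


ns = Namespace()
t0 = datetime(2023, 1, 1, tzinfo=timezone.utc)

old_exec = close_execution(ns, t0)                      # closed under 2-day retention
ns.retention_days = 90                                  # models: tctl --ns <ns_name> n u --rd 90
new_exec = close_execution(ns, t0 + timedelta(days=1))  # closed under 90-day retention

print(old_exec.retention_days, new_exec.retention_days)
```

In this model the execution that closed before the update keeps its 2-day retention, while the one that closed afterwards gets 90 days. If you need long retention for a set of workflows, the update has to be in place before they complete.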