Clarifications on Archival and Visibility

Karthick · July 14, 2022, 5:24am

We have set up our Temporal instance using Helm charts. Our setup today has a primary DB, a visibility DB, and Elasticsearch. I have some questions about how data gets into visibility and how it is archived.

My understanding of archival so far is that when a Workflow is completed, it is marked for archival with the configured retention period and is archived by that time.

1. Assuming the current setup of primary DB, visibility DB, and Elasticsearch (with visibility data being written to both the DB and ES): when the archival time is reached, where is the data deleted from? Is it deleted from all 3 stores above? If so, does the data from all 3 stores get archived to the configured S3 / filestore, or does only the visibility data go to S3?
2. Once the data is moved to archival (let's say S3), my understanding is that we will not be able to query that data via tctl / UI, right?
3. Clarification on how data gets into visibility: does each node have a queue where the visibility tasks are written, which are then asynchronously read and pushed to the visibility DB or ES?

tihomir · July 14, 2022, 3:55pm

"when a Workflow is completed, it is marked for archival with the configured retention period and is archived by that time."

Yes, if it completes, fails, times out, etc. (that is, it is no longer running).

Currently, with archival, workflow execution visibility records are archived right after the execution completes (you can see them in the Web UI under Archival right away). Workflow history is archived after the retention period. To make sure all the data has been archived, you should query it after the retention period.
Yes, it should be removed from all the data stores you have configured that data is being written to.
Yes, you would be able to query it via tctl; see

tctl wf listarchived -h

for more info. Note that the visibility queries for archived data are limited; see here and here for the two out-of-the-box providers (visibility query syntax sections).
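To illustrate the timing described above (visibility records archived at close, history archived only after the retention period), here is a minimal Python sketch. The function names and the 3-day retention are hypothetical, chosen only for the example, and the result is a lower bound: actual archival may lag behind it.

```python
from datetime import datetime, timedelta, timezone


def history_archived_after(close_time: datetime, retention: timedelta) -> datetime:
    """Earliest time at which workflow history is expected to be archived.

    Assumption taken from the answer above: history is archived once the
    retention period has elapsed after the workflow closes.
    """
    return close_time + retention


def full_archive_expected(close_time: datetime, retention: timedelta, now: datetime) -> bool:
    """True once both visibility records and history should be in the archive."""
    return now >= history_archived_after(close_time, retention)


# Hypothetical values: a workflow that closed on July 14, 2022 with 3-day retention.
close_time = datetime(2022, 7, 14, 5, 24, tzinfo=timezone.utc)
retention = timedelta(days=3)

print(history_archived_after(close_time, retention).isoformat())
```

Before that point, a listarchived query may already return the visibility record while the history is still only in the primary store.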

Visibility data is eventually consistent; there is typically a latency of 2-3 seconds.

You can use the server "task_latency_queue_bucket" metric to measure visibility task end-to-end latencies. Sample query:

histogram_quantile($percentile, sum(rate(task_latency_queue_bucket{operation=~"VisibilityTask.*"}[1m])) by (operation, le))

If you have Elasticsearch enabled, you can use the "visibility_persistence_latency_bucket" metric to track the latency of pushing data to ES:

histogram_quantile($percentile, sum(rate(visibility_persistence_latency_bucket{visibility_type="advanced_visibility"}[1m])) by (operation, le))
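The $percentile placeholder in these queries is a dashboard variable, so outside a dashboard you need to substitute a concrete quantile yourself. Below is a rough Python sketch that builds both expressions and the URL for Prometheus' instant-query HTTP endpoint (/api/v1/query). The helper names and the localhost address are hypothetical, and wrapping the Elasticsearch metric in histogram_quantile is an assumption modeled on the first query.

```python
from urllib.parse import urlencode


def quantile_query(metric: str, matcher: str, quantile: float, window: str = "1m") -> str:
    """Build a histogram_quantile PromQL expression over a *_bucket metric."""
    if not 0.0 <= quantile <= 1.0:
        raise ValueError("quantile must be between 0 and 1")
    return (
        f"histogram_quantile({quantile}, "
        f"sum(rate({metric}{{{matcher}}}[{window}])) by (operation, le))"
    )


def instant_query_url(base_url: str, promql: str) -> str:
    """Return the URL for Prometheus' instant-query endpoint."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


# Hypothetical Prometheus address; replace with your own.
PROMETHEUS = "http://localhost:9090"

# p95 of end-to-end visibility task latency.
visibility_task_p95 = quantile_query(
    "task_latency_queue_bucket", 'operation=~"VisibilityTask.*"', 0.95
)
# p95 of writes to Elasticsearch (advanced visibility).
es_write_p95 = quantile_query(
    "visibility_persistence_latency_bucket", 'visibility_type="advanced_visibility"', 0.95
)

print(visibility_task_p95)
print(instant_query_url(PROMETHEUS, es_write_p95))
```

The printed URL can be fetched with any HTTP client; urlencode takes care of escaping the braces and quotes in the expression.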

Archival is an experimental feature and it does have some quirks and edge cases, just FYI. I would recommend searching other forum posts here to see what kinds of issues users have run into.
