Cassandra history_node table keeps growing

hi,

The Cassandra history_node table keeps growing. We have set the namespace retention to 7 days.
We observe this on both server versions 1.9.2 and 1.20.1.

We did check this related thread.

Since we have configured Cassandra as the persistence store, we should not have encountered this issue, right? Or are we missing something?

Appreciate any help.

Output of tctl commands

/etc/temporal $ tctl adm cl d | jq .persistenceStore
"cassandra"
/etc/temporal $ tctl --ns samples-namespace n desc
Name: samples-namespace
Id: 5a369311-26c0-4533-9b1d-8f54d23298b8
Description:
OwnerEmail:
NamespaceData: map[string]string(nil)
State: Registered
Retention: 168h0m0s
ActiveClusterName: active
Clusters: active
HistoryArchivalState: Disabled
IsGlobalNamespace: false
FailoverVersion: 0
FailoverHistory:
VisibilityArchivalState: Disabled
Bad binaries to reset:
+-----------------+----------+------------+--------+
| BINARY CHECKSUM | OPERATOR | START TIME | REASON |
+-----------------+----------+------------+--------+
+-----------------+----------+------------+--------+

@tihomir @maxim any suggestions here?

Did it stop growing after 7 days? Is Cassandra’s compaction configured correctly?

Did it stop growing after 7 days?

No. It has been growing for months now. However, workflows older than 7 days are not visible in the UI.

Is Cassandra’s compaction configured correctly?

We used the schema creation script from the GitHub repo to create the tables.

CREATE TABLE temporal.history_node (
    tree_id uuid,
    branch_id uuid,
    node_id bigint,
    txn_id bigint,
    data blob,
    data_encoding text,
    prev_txn_id bigint,
    PRIMARY KEY (tree_id, branch_id, node_id, txn_id)
) WITH CLUSTERING ORDER BY (branch_id ASC, node_id ASC, txn_id DESC)
    AND additional_write_policy = '99p'
    AND bloom_filter_fp_chance = 0.1
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
    AND cdc = false
    AND comment = ''
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.LeveledCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
    AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND crc_check_chance = 1.0
    AND default_time_to_live = 0
    AND extensions = {}
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair = 'BLOCKING'
    AND speculative_retry = '99p';

@maxim Is the tree_id the workflow id? How can we ensure that completed workflows that have crossed the configured retention period are no longer in the history_node table? Ideally, completed workflows should be removed from history_node once retention expires, shouldn't they?

Additional info:


Could you check the compaction configuration? See Compaction | Apache Cassandra Documentation for more information.
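For example, the settings that are actually in effect can be verified with a query along these lines (a sketch only, assuming cqlsh access and the temporal keyspace shown in the schema above):

SELECT compaction, gc_grace_seconds, default_time_to_live
FROM system_schema.tables
WHERE keyspace_name = 'temporal' AND table_name = 'history_node';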

Also, you can try forcing compaction manually with nodetool compact.
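For example (assuming nodetool access on a Cassandra node; a major compaction of a table this size can run for a long time and temporarily needs extra disk space):

$ nodetool compact temporal history_node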

From the issue description, it sounds like autocompaction is not enabled or the gc_grace_seconds period is too large.
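If gc_grace_seconds turns out to be the culprit, it can be lowered with something like the statement below. This is only a sketch: the 1-day value is an example, and setting gc_grace_seconds shorter than your repair interval risks resurrecting deleted data, so it should be reviewed by your DBAs first.

ALTER TABLE temporal.history_node WITH gc_grace_seconds = 86400; -- example: 1 day instead of the default 10 days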

Btw, compaction requires ~2x the disk size, so in your case it might require ~6 TB of free disk space; otherwise compaction might fail.

Compaction is configured and running correctly. There is 3x disk space, so no reason for compaction to fail.

How often is the delete statement executed against the history_node table? temporal/history_store.go at 52b03657479941f60592163c0f2284a742d0fc84 · temporalio/temporal · GitHub

Is this part of the scavenger workflow? Is there a known schedule for scavenger workflows? Can it be configured?

What’s your dynamic config value for "worker.historyScannerEnabled"?

We haven’t set this; it is at the default value.
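For reference, a sketch of how this flag could be set explicitly in the server's dynamic config file (assuming the standard dynamic config YAML format):

worker.historyScannerEnabled:
  - value: true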

hi @maxim ,

We found records in the history_node table that do not have corresponding records in the executions view.
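A minimal sketch of the kind of lookup involved (the tree_id below is a hypothetical placeholder; tree_id is the partition key, so the query does not need ALLOW FILTERING, and writetime shows when the rows were written):

SELECT tree_id, branch_id, node_id, writetime(data) AS written_at
FROM temporal.history_node
WHERE tree_id = 00000000-0000-0000-0000-000000000000
LIMIT 10;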


Queries:

  1. Since we cannot find the execution history for this run_id, shouldn't this record be deleted from the history_node table by the scavenger workflow?
  2. None of the other tables has exceeded 50 GB in size, but history_node is ~7 TB (a way to check per-table sizes is sketched right after this list).
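For context, one way to compare per-table on-disk sizes (assuming nodetool access on a Cassandra node):

$ nodetool tablestats temporal | grep -E 'Table:|Space used \(live\)'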

@maxim please help and suggest a course of action.

Did you use the reset command?

No, we did not.

@maxim is this an issue in Cassandra persistence of Temporal?

I"m not aware of such an issue, as we run many clusters that use Cassandra for persistence.

Could this be the reason?

Leveled compaction strategy does a better job of trying to keep data for a partition in a limited range of sstables, but if you wrote data some time ago and it has aged into higher tiers, and then you come along later and do the delete, it can take a good deal of time for the delete to make its way up into the higher leveled tiers. This is the inherent problem with issuing deletes after the fact and expecting to free up disk space. Issuing the initial write and any subsequent update with a TTL reduces the issue, as the tombstone after the TTL has elapsed is the original record as well, so you avoid that nasty issue of having to wait for sstable 1 to be able to compact with sstable 100.

Our DBAs are suggesting we alter the compaction subproperties, especially lowering tombstone_threshold from 0.2 to 0.05 (Compaction subproperties).
@maxim please let us know your thoughts on this.
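For what it's worth, a hedged sketch of both the check and the proposed change. The sstable path is a placeholder (actual file names depend on the Cassandra version and data directory layout), and 0.05 is simply the value suggested above:

$ sstablemetadata /var/lib/cassandra/data/temporal/history_node-*/*-Data.db | grep -i 'droppable tombstones'

ALTER TABLE temporal.history_node
WITH compaction = {
  'class': 'LeveledCompactionStrategy',
  'tombstone_threshold': '0.05'
};
-- note: this replaces the entire compaction options map, so re-specify any
-- non-default subproperties you rely on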

Unfortunately, I’m not an expert on Cassandra’s compaction strategies.