After upgrading to 1.17.1 we started seeing a lot of Cassandra timeouts that were not there on 1.16.0

Andrey_Dubnik · July 18, 2022, 3:21pm

Hi,

We have upgraded to 1.17.1 and started seeing a lot of timeouts like the one below. After we bounce the History service it works for a few minutes and then goes back to timing out.

Is there a configuration parameter we need to (or can) tune? As it stands, 1.17.1 does not seem to be functioning for us.

{"level":"warn","ts":"2022-07-18T15:16:24.595Z","msg":"Processor unable to retrieve tasks","shard-id":1206,"address":"10.1.6.24:7234","component":"transfer-queue-processor","cluster-name":"dev-westeurope-01-secondary","error":"operation GetTransferTasks encountered gocql: no response received from cassandra within timeout period","logging-call-at":"queueProcessor.go:238"}
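To see how widespread warnings like the one above are, it can help to tally them per component and shard. The snippet below is only a rough sketch: it assumes the server logs are JSON lines shaped like the entry above, and the helper name `count_timeouts` is made up for illustration.

```python
import json
from collections import Counter

# Substring taken from the "error" field of the log entry in this thread.
TIMEOUT_MARKER = "no response received from cassandra within timeout period"


def count_timeouts(lines):
    """Count Cassandra timeout warnings per (component, shard-id).

    Hypothetical helper: assumes one JSON object per log line.
    """
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not valid JSON
        if not isinstance(entry, dict):
            continue
        if TIMEOUT_MARKER in str(entry.get("error", "")):
            counts[(entry.get("component"), entry.get("shard-id"))] += 1
    return counts


# The log line from this thread, used as sample input.
sample = (
    '{"level":"warn","ts":"2022-07-18T15:16:24.595Z",'
    '"msg":"Processor unable to retrieve tasks","shard-id":1206,'
    '"address":"10.1.6.24:7234","component":"transfer-queue-processor",'
    '"cluster-name":"dev-westeurope-01-secondary",'
    '"error":"operation GetTransferTasks encountered gocql: no response '
    'received from cassandra within timeout period",'
    '"logging-call-at":"queueProcessor.go:238"}'
)

if __name__ == "__main__":
    for (component, shard), n in count_timeouts([sample]).items():
        print(component, shard, n)
```

If the counts are spread across many shards rather than concentrated on a few, that points at the service as a whole rather than at specific shards.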

Andrey_Dubnik · July 18, 2022, 3:57pm

We also have XDC configured, and a similar pattern is present on both clusters.

Andrey_Dubnik · July 18, 2022, 4:26pm

Our CPU limit was set to 500m (this is our dev cluster, so resources are minimised), which was enough for 1.16, but it seems 1.17 consumes more CPU and this is not enough for History to sustain the idle workload…

We have increased the CPU limit to 2 and are observing relatively stable behaviour.
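For anyone hitting the same thing, one way to check whether a CPU limit is actually throttling a container is to look at the cgroup `cpu.stat` file, which exposes `nr_periods` and `nr_throttled` counters. The snippet below is a minimal sketch, not something we ran as part of this investigation: the function name `throttle_ratio` and the sample numbers are made up, and the location of `cpu.stat` inside the container depends on the cgroup version and runtime.

```python
def throttle_ratio(cpu_stat_text):
    """Return the fraction of CFS periods in which the cgroup was throttled.

    Hypothetical helper: expects the text content of a cgroup cpu.stat file,
    i.e. lines of the form "<key> <integer>".
    """
    stats = {}
    for line in cpu_stat_text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            stats[parts[0]] = int(parts[1])
    periods = stats.get("nr_periods", 0)
    if periods == 0:
        return 0.0  # no enforcement periods recorded (e.g. no limit set)
    return stats.get("nr_throttled", 0) / periods


# Made-up sample content, only to show the expected input shape.
sample_cpu_stat = "nr_periods 1000\nnr_throttled 250\nthrottled_time 123456789\n"

if __name__ == "__main__":
    print(f"throttled in {throttle_ratio(sample_cpu_stat):.0%} of periods")
```

A ratio that stays high while the service is idle would be consistent with the limit being too low for that version.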
