Frontend error: "shard status unknown"

Dhiraj_Bhakta · January 17, 2023, 9:31am

I'm getting a lot of these errors on the frontend service while stress testing via maru.

{"level":"info","ts":"2023-01-17T09:28:02.626Z","msg":"history client encountered error","service":"frontend","error":"shard status unknown","service-error-type":"serviceerror.Unavailable","logging-call-at":"metric_client.go:90"}

There are also similar errors saying that a timeout occurred during StartWorkflowTask, etc.
Does this indicate history nodes need to be scaled up? I can see workflows being executed despite these errors…

tihomir · January 17, 2023, 1:23pm

I would check the stability of your history hosts during the load test. The server has a restarts counter metric that you could look at, as well as service_errors_resource_exhausted, which you can filter by operations (RpsLimit, ConcurrentLimit, SystemOverloaded). Monitoring your history service CPU utilization would be good as well, to know whether you are giving your history hosts enough resources.
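As a rough illustration of that kind of filtering, here is a minimal Python sketch that sums service_errors_resource_exhausted per cause from a metrics dump in Prometheus text format. The label names (service_name, resource_exhausted_cause) and the sample values are assumptions made up for the example, so check what your server actually emits before relying on them.

```python
import re
from collections import defaultdict

# Matches one sample line: name{labels} value
SAMPLE_LINE = re.compile(
    r'^(?P<name>[A-Za-z_:][A-Za-z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)'
)
LABEL = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def sum_by_label(text, metric, label):
    """Sum the values of `metric`, grouped by the value of `label`."""
    totals = defaultdict(float)
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        m = SAMPLE_LINE.match(line)
        if not m or m.group("name") != metric:
            continue
        labels = dict(LABEL.findall(m.group("labels") or ""))
        totals[labels.get(label, "")] += float(m.group("value"))
    return dict(totals)


# Hypothetical scrape output; label names and numbers are illustrative only.
SCRAPE = """\
# TYPE service_errors_resource_exhausted counter
service_errors_resource_exhausted{service_name="history",resource_exhausted_cause="RpsLimit"} 12
service_errors_resource_exhausted{service_name="history",resource_exhausted_cause="SystemOverloaded"} 3
service_errors_resource_exhausted{service_name="frontend",resource_exhausted_cause="RpsLimit"} 5
"""

by_cause = sum_by_label(
    SCRAPE, "service_errors_resource_exhausted", "resource_exhausted_cause"
)
for cause, total in sorted(by_cause.items()):
    print(cause, total)
```

If one cause dominates during the load test, that points at which limit or resource to look at first.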

The error is typically due to shard(s) getting unloaded (one reason could be that a history host goes down) and then having to be rebalanced across the other available history hosts until a new shard owner is determined.

I can see workflows being executed despite these errors…

Yeah, in most cases this should not affect your workflow executions (they would be able to make progress once a new shard owner is determined), but you could see increased persistence latencies during that time.
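To see whether these errors cluster around particular moments (as they would if a history host restarted and its shards had to move), one option is to tally the frontend log by minute. Below is a minimal Python sketch, assuming the logs are JSON lines shaped like the one in the question; the function name tally_errors is made up for the example.

```python
import json
from collections import Counter


def tally_errors(lines, service="frontend"):
    """Count error log entries per (minute, error message) for one service."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not JSON
        if not isinstance(entry, dict):
            continue
        if entry.get("service") != service or "error" not in entry:
            continue
        minute = str(entry.get("ts", ""))[:16]  # e.g. "2023-01-17T09:28"
        counts[(minute, entry["error"])] += 1
    return counts


# The log line from the question, used as sample input.
sample = [
    '{"level":"info","ts":"2023-01-17T09:28:02.626Z",'
    '"msg":"history client encountered error","service":"frontend",'
    '"error":"shard status unknown",'
    '"service-error-type":"serviceerror.Unavailable",'
    '"logging-call-at":"metric_client.go:90"}',
]

for (minute, error), n in sorted(tally_errors(sample).items()):
    print(minute, error, n)
```

Short bursts that line up with history host restarts would fit the shard rebalancing explanation, while a steady stream for the whole test would suggest the history hosts are under-resourced.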
