Streaming Error Logs in Multi-Cluster Temporal Server Setup

We are currently running a self-hosted multi-cluster Temporal server setup with replication enabled between the clusters. While workflows are being replicated successfully, we are encountering numerous error logs in the following categories:

  1. service/history/replication/stream_receiver.go:232 - ReplicationStreamError: StreamReceiver exit recv loop
  2. service/history/replication/stream_receiver.go:202 - ReplicationStreamError: StreamReceiver exit send loop
  3. service/history/replication/stream_sender.go:198 - ReplicationStreamError: StreamSender failed to receive
  4. service/history/replication/bi_direction_stream.go:197 - BiDirectionStream encountered unexpected error, closing: *serviceerror.Unavailable closing transport due to: connection error: desc = "error reading from server: EOF", received prior goaway: code: NO_ERROR, debug data: "max_age"

We are using Temporal server version 1.25.0.

Could anyone provide insights into the possible reasons behind these errors? What impact might they have on our system, and how can we resolve them?

Thank you for your support and guidance.

@maxim @tihomir - Could you please help us understand why these error logs are being generated and what impact they have?

@Rajendra_Prasad_G, I think you might be interested in the thread "Log errors when enabling history.enableReplicationStream".

My understanding is that replication works (I did verify that) and that the “errors” you are seeing are basically expected. They should just not be emitted as errors. Apparently logging in this area has been improved in 1.26, but I have not confirmed this myself.

On our side, we have set history.enableReplicationStream: false and are using the older version of replication until we can upgrade to 1.26.
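
For anyone who wants to try the same workaround, here is a minimal sketch of what the setting might look like. It assumes the file-based dynamic config format (each key maps to a list of value/constraints entries); where that file lives and how it gets loaded depends on your deployment, so treat the layout as an assumption and check it against your own setup.

```yaml
# Sketch only: assumes Temporal's file-based dynamic config layout.
# Disables stream-based replication so the cluster uses the older
# replication path, as described above.
history.enableReplicationStream:
  - value: false
    constraints: {}
```

If you run more than one cluster, you would presumably want the same value on each side of the replication link.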

–Hardy

Thank you @hferentschik. We have noticed that these logs are suppressed in later versions, so we are planning to upgrade to the latest version soon to avoid them.