XDC Replication in Practice

I have confirmed that the initialFailoverVersion value assigned to each cluster has to be identical across every cluster’s configuration. Once I got that setting consistent on both clusters, failover started working as expected.

So, here’s my working config (the only change from the previous post is that initialFailoverVersion is now consistent across clusters):

Temporal Config In East 1

enableGlobalNamespace: true
failoverVersionIncrement: 10
currentClusterName: temporaltest-east1
masterClusterName: temporaltest-east1
clusterInformation:
  temporaltest-east1:
    enabled: true
    initialFailoverVersion: 1
    rpcAddress: temporaltest-frontend:443
    rpcName: temporaltest-frontend-east1
  temporaltest-east2:
    enabled: true
    initialFailoverVersion: 2
    rpcAddress: dns:///temporaltest-frontend-east2.mydomain.com:443
    rpcName: temporaltest-frontend-east2

Temporal Config In East 2

enableGlobalNamespace: true
failoverVersionIncrement: 10
currentClusterName: temporaltest-east2
masterClusterName: temporaltest-east2
clusterInformation:
  temporaltest-east2:
    enabled: true
    initialFailoverVersion: 2
    rpcAddress: temporaltest-frontend:443
    rpcName: temporaltest-frontend-east2
  temporaltest-east1:
    enabled: true
    initialFailoverVersion: 1
    rpcAddress: dns:///temporaltest-frontend-east1.mydomain.com:443
    rpcName: temporaltest-frontend-east1

Hi @RobZienert, can I ask you a few questions about your experience with XDC replication?

You wrote: “All WorkerFactory objects are polling by default, so if we as Temporal operators were to failover Temporal itself, applications would continue to process work without skipping a beat.”

  1. Does this mean that failover from one cluster to another is a manual process, or at least doesn’t come “out of the box” with XDC?

  2. The docs specify that the data between clusters is not strongly consistent. Has this caused any issues for you?

  3. I see that Cadence has an “api-forwarding” feature that lets passive clusters forward signal/start workflow requests to the active cluster. Did you figure out if Temporal supports this? Refs: uber/cadence Discussion #4530 on GitHub (“config argument: dcRedirectionPolicy. I want to know some details about this argument.”) and the Cadence docs on Cross DC replication.

Hi @nathan_shields, happy to answer!

Does this mean that failover from one cluster to another is a manual process, or at least doesn’t come “out of the box” with XDC?

Failover is manually initiated via tctl, for example:

tctl --ns default n up --ac useast1

There are various ways you can approach making the application zero-effort on failovers, but out of the box, the SDKs don’t yet “support” hands-off failovers. Our specific approach makes individual application processes fatter by creating workers that connect to each Temporal Cluster. We have an XdcClusterSupport class that polls the Temporal API to figure out which cluster is active for the namespaces the application is interested in, then automatically enables or disables polling on workers as needed so we’re not generating unnecessary load on inactive clusters.

This is behavior we’ve considered moving into a global Temporal API proxy, so that applications don’t need to concern themselves with it at all: they’d just connect to “the” Temporal API, which would route requests appropriately.
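For illustration, here is a minimal sketch of that pattern using the Java SDK. The class name, 30-second polling interval, and wiring are assumptions for the example, not our actual XdcClusterSupport implementation:

import io.temporal.api.workflowservice.v1.DescribeNamespaceRequest;
import io.temporal.serviceclient.WorkflowServiceStubs;
import io.temporal.worker.WorkerFactory;

import java.util.Map;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Watches which cluster is currently active for a namespace and pauses/resumes
// worker polling accordingly. Names and the interval are illustrative only.
public class ActiveClusterWatcher {
  private final Map<String, WorkerFactory> factoriesByCluster; // one started factory per cluster
  private final WorkflowServiceStubs service; // stubs for any reachable cluster, used only to ask "who is active?"
  private final String namespace;

  public ActiveClusterWatcher(
      Map<String, WorkerFactory> factoriesByCluster,
      WorkflowServiceStubs service,
      String namespace) {
    this.factoriesByCluster = factoriesByCluster;
    this.service = service;
    this.namespace = namespace;
  }

  public void start() {
    Executors.newSingleThreadScheduledExecutor()
        .scheduleAtFixedRate(this::reconcile, 0, 30, TimeUnit.SECONDS);
  }

  private void reconcile() {
    // DescribeNamespace reports the currently active cluster for the namespace.
    String activeCluster =
        service.blockingStub()
            .describeNamespace(
                DescribeNamespaceRequest.newBuilder().setNamespace(namespace).build())
            .getReplicationConfig()
            .getActiveClusterName();

    // Poll only against the active cluster; keep workers on the others idle.
    factoriesByCluster.forEach((cluster, factory) -> {
      if (cluster.equals(activeCluster)) {
        factory.resumePolling();
      } else {
        factory.suspendPolling();
      }
    });
  }
}

Each WorkerFactory here is one the application already built and started against a particular cluster’s frontend; the watcher only decides which of them should be polling at any given time.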

However, from an application owner’s perspective, they don’t need to do anything during a failover – our extension SDK makes failovers entirely hands-off for them.

The docs specify that the data between clusters is not strongly consistent. Has this caused any issues for you?

It has, yes, but IIRC only specific to Child Workflow usage. Replication is performed independently at the shard level, and child workflows are not necessarily in the same shard as their parent. It’s therefore possible for a child’s history to be replicated sooner than the parent’s, so during a failover you can see “child workflow execution already exists” errors because the parent’s history hasn’t been replicated far enough for it to know that it already started the execution.

Your workflow code must account for this edge case. In keeping with our intent to make failovers as hands-off as possible for application developers, we have a utility Dispatcher class that Workflows can use, which handles the various error conditions associated with using Child Workflows.
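To give a rough sense of the kind of guard that class provides (this is not our actual Dispatcher), a workflow-side helper might wrap the child invocation and tolerate the duplicate-start failure. How the “already exists” case surfaces depends on your SDK version, so treat the exception matching below as an assumption:

import io.temporal.failure.ChildWorkflowFailure;
import io.temporal.workflow.Workflow;

import java.util.Optional;
import java.util.function.Supplier;

// Sketch of a guard for child workflow invocations around XDC failovers.
// The duplicate-start detection below is an assumption; verify against the
// failure types your SDK version actually raises.
public final class ChildWorkflowGuard {
  private ChildWorkflowGuard() {}

  public static <T> Optional<T> startTolerantly(Supplier<T> childInvocation) {
    try {
      return Optional.ofNullable(childInvocation.get());
    } catch (ChildWorkflowFailure e) {
      String message = String.valueOf(e.getMessage());
      // Assumed heuristic: the duplicate child produced by replication ordering
      // shows up as an "already started/exists" child workflow failure.
      if (message.contains("already")) {
        Workflow.getLogger(ChildWorkflowGuard.class)
            .warn("Child workflow already exists (likely post-failover); continuing", e);
        return Optional.empty(); // caller decides how to reconcile with the existing child
      }
      throw e;
    }
  }
}

A parent workflow would then call something like ChildWorkflowGuard.startTolerantly(() -> child.run(input)) rather than invoking the child stub directly, and decide what to do when the child turns out to already exist.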

It’s my understanding that the Temporal team is planning to make the behavior around replication a little safer out of the box, so that users won’t need to build up the utilities that we have.

I see that Cadence has an “api-forwarding” feature that lets passive clusters forward signal/start workflow requests to the active cluster. Did you figure out if Temporal supports this?

Temporal does support this, and it should definitely be enabled in an XDC topology. While it behaves as advertised, it is an incomplete solution for all failure modes. Specifically, this functionality is not going to help you if the inactive Temporal cluster your application is connected to is unavailable for whatever reason (e.g. the cluster is physically unreachable). That’s why we built behavior that does not depend solely on it – an application owner should not be impacted by a cluster disappearing entirely for a period of time (like if we need to upgrade our SQL server).

Hope that helps, happy to answer more / clarify as needed.


Question around metrics: is there a metric I can use to track/find out whether all the XDC clusters are in sync?

You can use the metric operation: ReplicatorQueueProcessor, name: replication_tasks_lag (tagged with target_cluster) to track the replication backlog between the local cluster and each remote cluster. This metric only gives you an idea of whether there is a large backlog.

You can also use standby task queue latency to indicate lag between clusters: operation: TransferStandbyQueueProcessor, name: task_latency_queue.


Do you have to enable api-forwarding in configuration, or is it automatically enabled by default after you set up xDC as described here?

Always trust the official documentation! It was not automatically enabled when I originally wrote this. 🙂
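For reference, when it did need to be enabled explicitly, it was a static configuration setting on each cluster. A minimal sketch, assuming the selected-apis-forwarding policy (check the current docs for the exact key and default in your server version):

dcRedirectionPolicy:
  policy: "selected-apis-forwarding"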