Status in the executions_visibility table fails to update

My Temporal cluster has two server nodes. I suspect this problem only appears with multiple server nodes.
I used samples-go/cron to test the two-node Temporal cluster and found that some workflow executions did not update their status in the executions_visibility table after they completed. In the Web UI, they were still displayed as Running.


In fact, they were completed.

In the MySQL database, I found the status was still 1, but completed workflow executions should have status 2.
Is there any solution?


It was actually completed.


In the MySQL database, the status in the executions_visibility table was still 1.
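A direct query against the visibility database shows the affected rows. Below is a minimal Go sketch along those lines, using database/sql with the go-sql-driver/mysql driver; the DSN is a placeholder and the column names are assumed from the standard MySQL visibility schema (status 1 = Running, 2 = Completed):

package main

import (
    "database/sql"
    "fmt"
    "log"

    _ "github.com/go-sql-driver/mysql" // registers the "mysql" driver
)

func main() {
    // Placeholder DSN; point it at the temporal_visibility database.
    db, err := sql.Open("mysql", "user:password@tcp(mysqlAddr:3306)/temporal_visibility?parseTime=true")
    if err != nil {
        log.Fatal(err)
    }
    defer db.Close()

    // Executions still marked as Running (status = 1); completed ones should have status = 2.
    rows, err := db.Query(`SELECT workflow_id, run_id, status, start_time
                             FROM executions_visibility
                            WHERE status = 1
                            ORDER BY start_time DESC
                            LIMIT 50`)
    if err != nil {
        log.Fatal(err)
    }
    defer rows.Close()

    for rows.Next() {
        var wfID, runID string
        var status int
        var startTime sql.NullTime
        if err := rows.Scan(&wfID, &runID, &status, &startTime); err != nil {
            log.Fatal(err)
        }
        fmt.Printf("%s %s status=%d started=%v\n", wfID, runID, status, startTime.Time)
    }
    if err := rows.Err(); err != nil {
        log.Fatal(err)
    }
}

Any workflow listed here that the history (or the Web UI history view) reports as Completed is a row the visibility update never reached.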

This looks like the same problem as in Temporal UI not displaying correct status for a workflow.

What's the Temporal server version you are using? Can you give more info on your multi-cluster setup?

I assume you have configured standard visibility only. If you have server metrics enabled and are scraping them, could you share your visibility latency graph?

histogram_quantile(0.95, sum(rate(task_latency_bucket{operation=~"VisibilityTask.*", service_name="history"}[1m])) by (operation, le))

Also check the execution status for this workflow via tctl using:


tctl wf desc -w <wfid>
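If you would rather check from code than via tctl, the same lookup can be done with the Go SDK client. This is just a sketch; the host, namespace, and workflow ID are placeholders. The status it returns comes from the history service's mutable state, not from the visibility store, so it reflects the real state of the execution:

package main

import (
    "context"
    "fmt"
    "log"

    "go.temporal.io/sdk/client"
)

func main() {
    // Placeholder connection details; adjust to your cluster and namespace.
    c, err := client.Dial(client.Options{
        HostPort:  "nodeIP:7233",
        Namespace: "default",
    })
    if err != nil {
        log.Fatal(err)
    }
    defer c.Close()

    // An empty run ID means "the latest run" of the given workflow ID.
    resp, err := c.DescribeWorkflowExecution(context.Background(), "<wfid>", "")
    if err != nil {
        log.Fatal(err)
    }

    // This status reflects the true execution state even when the
    // visibility row still says Running.
    fmt.Println(resp.GetWorkflowExecutionInfo().GetStatus())
}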

The server version is 1.21.0.
Here is the multi-node setup. I use variables in place of the real IPs; only ${nodeIP} differs between the cluster nodes.

log:
  stdout: false
  level: info
  outputFile: "/tmp/temporal-server.log"

persistence:
  defaultStore: mysql-default
  visibilityStore: mysql-visibility
  numHistoryShards: 2048
  datastores:
    mysql-default:
      sql:
        pluginName: "mysql"
        databaseName: "temporal"
        connectAddr: "${mysqlAddr}
        connectProtocol: "tcp"
        connectAttributes:
          tx_isolation: 'READ-COMMITTED'
        user: "didi_2BYe"
        password: "yHz8aY6Hq"
        maxConns: 20
        maxIdleConns: 20
        maxConnLifetime: "1h"
    mysql-visibility:
      sql:
        pluginName: "mysql"
        databaseName: "temporal_visibility"
        connectAddr: "${mysqlAddr}"
        connectProtocol: "tcp"
        connectAttributes:
          tx_isolation: 'READ-COMMITTED'
        user: "didi_2BYe"
        password: "yHz8aY6Hq"
        maxConns: 2
        maxIdleConns: 2
        maxConnLifetime: "1h"

global:
  membership:
    maxJoinDuration: 30s
    broadcastAddress: "${nodeIP}"
  pprof:
    port: 7936
  metrics:
    prometheus:
#      # specify framework to use new approach for initializing metrics and/or use opentelemetry
#      framework: "opentelemetry"
      framework: "tally"
      timerType: "histogram"
      listenAddress: "127.0.0.1:8000"

services:
  frontend:
    rpc:
      grpcPort: 7233
      membershipPort: 6933
      #bindOnLocalHost: true
      bindOnIP: ${nodeIP}

  matching:
    rpc:
      grpcPort: 7235
      membershipPort: 6935
      #bindOnLocalHost: true
      bindOnIP: ${nodeIP}

  history:
    rpc:
      grpcPort: 7234
      membershipPort: 6934
      #bindOnLocalHost: true
      bindOnIP: ${nodeIP}

  worker:
    rpc:
      grpcPort: 7239
      membershipPort: 6939
      #bindOnLocalHost: true
      bindOnIP: ${nodeIP}

clusterMetadata:
  enableGlobalNamespace: false
  failoverVersionIncrement: 10
  masterClusterName: "active"
  currentClusterName: "active"
  clusterInformation:
    active:
      enabled: true
      initialFailoverVersion: 1
      rpcName: "frontend"
      rpcAddress: "${nodeIP}:7233"

publicClient:
  hostPort: "${nodeIP}:7233"

dcRedirectionPolicy:
  policy: "noop"
  toDC: ""

archival:
  history:
    state: "disabled"
    enableRead: false
  visibility:
    state: "disabled"
    enableRead: false

namespaceDefaults:
  archival:
    history:
      state: "disabled"
      #URI: "file:///tmp/temporal_archival/development"
    visibility:
      state: "disabled"
      #URI: "file:///tmp/temporal_vis_archival/development"

dynamicConfigClient:
  filepath: "config/dynamicconfig/development-sql.yaml"
  pollInterval: "10s"

I haven't collected any metrics yet. I will collect them later and share them with you.

Here are the server metrics you asked for.


The execution status of that run is Completed.

Is there any progress on this topic? I have the same problem.

I have the same issue as well.
Temporal server version: 1.23.0
Visibility Storage DB: AWS Aurora Postgres

The GitHub issue "Visibility data can become inconsistent" (temporalio/temporal #5643) describes one of the possible causes.
