Occasionally a workflow task is not started after it is scheduled

Temporal v1.17.4
go.temporal.io/sdk v1.15.0

I’m sure there are workers.

When I click Queries, it hangs.

I found 8 hanging workflows. All of them hang on the first workflow task.

Hello @qbowater

Can you share the workflow history?

Do you see any errors in the workers?

Thanks
Antonio

I can’t upload a zip file here. Please download the workflow histories from the Google Drive link.
@antonio.perez

The database is Alibaba PolarDB-MySQL, which is fully compatible with MySQL 8.0.

https://www.alibabacloud.com/help/en/polardb-for-mysql/latest/what-is-polardb#:~:text=PolarDB%20is%20a%20secure%20and,that%20decouples%20computing%20from%20storage.

What does

tctl workflow describe --wid <workflowID>

for a still open workflow return?

An open workflow that is not hanging:

bash-5.1# tctl --ns express wf desc --wid express/listen/01DYSJTP8HRN11RAC467BWPPX2
{
  "executionConfig": {
    "taskQueue": {
      "name": "zeus.api",
      "kind": "Normal"
    },
    "workflowExecutionTimeout": "0s",
    "workflowRunTimeout": "0s",
    "defaultWorkflowTaskTimeout": "10s"
  },
  "workflowExecutionInfo": {
    "execution": {
      "workflowId": "express/listen/01DYSJTP8HRN11RAC467BWPPX2",
      "runId": "8b08f76b-3740-40d7-95dd-e54db031c5ee"
    },
    "type": {
      "name": "express/listen"
    },
    "startTime": "2022-11-09T04:39:31.577050591Z",
    "status": "Running",
    "historyLength": "35",
    "parentNamespaceId": "87b5cb4b-4bae-4faa-9cae-2ecf5bb778c2",
    "parentExecution": {
      "workflowId": "express/listen/01DYSJTP8HRN11RAC467BWPPX2/start",
      "runId": "a01a8a46-aa0f-4c80-860b-a82b2eaa5cd7"
    },
    "memo": {

    },
    "autoResetPoints": {
      "points": [
        {
          "binaryChecksum": "fe191a7b6ff4749368b4f2b5a7a94ed0",
          "runId": "02d4b937-3dca-4130-89e5-b5ab51b63b41",
          "firstWorkflowTaskCompletedId": "4",
          "createTime": "2022-11-09T01:39:36.451280375Z",
          "expireTime": "2022-11-12T02:09:34.426095582Z",
          "resettable": true
        }
      ]
    },
    "stateTransitionCount": "18"
  }
}

An open workflow that is hanging:

bash-5.1# tctl --ns express wf desc --wid express/listen/01GBXVJ52BE57FZ2VKF7H9E19H                                                                                                                                                                                                                                 │fie7Utua Chee0chu xooV6Chi puK6ajea maeHohd0 dae2Uivo Uoph1ohw Eekaece7
{                                                                                                                                                                                                                                                                                                                   │oN8Te8hu iepho9IM uz9su4Ae AhDaqu1p ahfat4aD aeC3eixi je5AGeiF yie5Aeth
  "executionConfig": {                                                                                                                                                                                                                                                                                              │boo2ahGh Ahk6Shee kahGe0Ei ebieg9Oh Jeerae9a roowuw7U Wo2wohqu eiShood7
    "taskQueue": {                                                                                                                                                                                                                                                                                                  │EeFunge6 doo2UuRo Oy7Phoht eW6aelae xaec2Ahv phuba7iS Chiimee3 ia9eiPhu
      "name": "zeus.api",                                                                                                                                                                                                                                                                                           │Ohtohco8 eiyeeDo7 Oobu1iex oht7eiF1 Wah6via9 oos2Ar3c eiXi0Gei alo6Eigu
      "kind": "Normal"                                                                                                                                                                                                                                                                                              │oMahru0j Peiyoh8r ihoh6eiX zez1thaZ pie8ahPh Eif9ethi kei0oxoZ foobei1E
    },                                                                                                                                                                                                                                                                                                              │theiw7Ku ohWohph4 eequ2aiX Aiphai3z aRohwae2 eishaV4i heesh7Zu aof0Thah
    "workflowExecutionTimeout": "0s",                                                                                                                                                                                                                                                                               │goo4eKei Aemoi8ae aimaen8H ahweeMe8 eite8Phi Pahgh8ee Xeikipu7 Eimoamu1
    "workflowRunTimeout": "0s",                                                                                                                                                                                                                                                                                     │uo4Teowu quoow0Ei eeFei4ah Mohgoh2i Iez9Gue8 uK6ohqu2 niveiS9g Quugh9mi
    "defaultWorkflowTaskTimeout": "10s"                                                                                                                                                                                                                                                                             │mohGh5Ee Wujie3Oo iav6Eitu aey4PhuR ahzie9Ye pahTo1bi Xai8mah0 xai6Phai
  },                                                                                                                                                                                                                                                                                                                │oom5Oowu iero3Pei aud9Zeex ooC3caew Ieph6eik mu8eiSie Waqu1ohy joh7ooC0
  "workflowExecutionInfo": {                                                                                                                                                                                                                                                                                        │zahvoh3S Eu8thesa beek4aMo ieNgii4j bei6Aviz Iw1oasha KeeThai9 eg3yieGe
    "execution": {                                                                                                                                                                                                                                                                                                  │Goa8piNa Efa3koyu ieFail9g Cieko4oo Mee5nai1 shohqu5E Shee2Fu6 uVahy8Oi
      "workflowId": "express/listen/01GBXVJ52BE57FZ2VKF7H9E19H",                                                                                                                                                                                                                                                    │jahPh4xu ef6gae4O Shi1aico eeRoong5 Bachae6e nien4Qui jil3hooK ieph1feS
      "runId": "1293602d-fd90-453b-bdbd-17e9be9bf84c"                                                                                                                                                                                                                                                               │ohKoo2oo Shegei9e Hein7iez Eenie5au oKa8eeDe re7Xavoa li0aes7M wie3aeDa
    },                                                                                                                                                                                                                                                                                                              │phiePi6r eichie1F yienee2P sheGhe8i Roin5Aey Eig6cha2 Ka6Feeph oBie8Hoh
    "type": {                                                                                                                                                                                                                                                                                                       │thiB2laT Oong4ohj ahQueo2m ieThee7c ej2eeG3r auCae3ai ei8uY0iC OoGie9Qu
      "name": "express/listen"                                                                                                                                                                                                                                                                                      │❯ pwgen 10
    },                                                                                                                                                                                                                                                                                                              │Eig5hai9oo aphiSh7Rao Jahngew6ai eith6EG6ie aerahP1mua iequainoH1 kaaBeem4go
    "startTime": "2022-09-20T23:00:30.537113941Z",                                                                                                                                                                                                                                                                  │yu4zahv6Si moh2Aek2sa Xaena3waen xu9sai8ooF lacooC7Eed eTeiGho9ae ahphooThu3
    "status": "Running",                                                                                                                                                                                                                                                                                            │quooPhai3a joh2Vietie eiRei8shei eepeiCai2j aechi8ieJi juphoa1ooD deiqu2Ohjo
    "historyLength": "6235",                                                                                                                                                                                                                                                                                        │ih8xi1yooK Ohyon4xie1 Aa2ooch4oh thu3Eeregh eeKios6eip puMa0rieDi EL4eifaeb0
    "parentNamespaceId": "87b5cb4b-4bae-4faa-9cae-2ecf5bb778c2",                                                                                                                                                                                                                                                    │eizaive5Oe oti8phooTu chak9ku8Ah fah4mei1He eiV1bush8j oom6oep0Ie ieSoox6eph
    "parentExecution": {                                                                                                                                                                                                                                                                                            │Ugh9PhaMoo Ophoh6do4u eef3Iezei6 Ixaunu1soh aiHiud5neg zaeLieNg5k thaGheiz7a
      "workflowId": "express/listen/01GBXVJ52BE57FZ2VKF7H9E19H/start",                                                                                                                                                                                                                                              │Eichaik0oo Ohsoopu6Ae nei2dohXe6 Heshiequo7 Uuph6Ofea3 eSh8ahK4fa ieKo1ahsho
      "runId": "025b368b-8735-447e-8e72-72cbcabf89ac"                                                                                                                                                                                                                                                               │oquoeMa2ez kaiwu3baiM so1Oochuam OopheiZe2a Cheinei5Ko maxaeSae5n shaeGh7si6
    },                                                                                                                                                                                                                                                                                                              │paeghish3I Sha1geeng8 lu4booGh9w dooLuac7xo Voe9aifuaF EuC0aet1ho kebaaSe8es
    "memo": {                                                                                                                                                                                                                                                                                                       │Pah8Zei7Ba Oseili2Ee9 Osiose0Bai nai5iem5Ta Shajoo2aK4 Aeh0Phei5i ohPh3Oiph1
                                                                                                                                                                                                                                                                                                                    │meiW8oajei phaegh7ieT Eez2ohs3ah neemahsh7C Phe8EipuPo iofa5Upeiz aTuuh2OoXo
    },                                                                                                                                                                                                                                                                                                              │yieTox5ief Xiebol8ail aijuth3eiD bi5eiQu9ee ouquur8Zie Seevoo1eeg Meimei7Hae
    "autoResetPoints": {                                                                                                                                                                                                                                                                                            │Us2weepai4 paiwu3Chuf hi3tei7ooV Ja3zangees uB8OGh5Que AiRu5veith Ieheej6Ka6
      "points": [                                                                                                                                                                                                                                                                                                   │aiPhahtoo4 opaighu6uR raVu0lie6H AhL9Uugh9d Naetu1wahn iepeQuah2j es1Aiw8chu
        {                                                                                                                                                                                                                                                                                                           │phoch5yohB du4EiSichi eiqu2au8Oh chat6hee4O nah7ieteeB Ooy0zeiquu ooDahdah0l
          "binaryChecksum": "1057a0dfabd3f57f8d708c37d55cc59c",                                                                                                                                                                                                                                                     │eef7Eesi5F ThahjahGu1 oi9GeuRiew aet3ahViej Aiw3eelaej iet0Dilaes Roh8uaXu8A
          "runId": "c0714c8a-a9bf-48c6-92cd-ebff8a3b22c8",                                                                                                                                                                                                                                                          │aN5ohchohc ceeCoh9iex bah9aiT8xu oWai8tia3p eer0Aeshir eecae1Vaer iesah0lieC
          "firstWorkflowTaskCompletedId": "4",                                                                                                                                                                                                                                                                      │bah7Feede1 aaNg2queex eim6uoF3Ah ohD0phee4e Hee6jahmae bu8peiQuao Aivae1chox
          "createTime": "2022-09-20T22:01:37.606616194Z",                                                                                                                                                                                                                                                           │maop3Eechu ooga6Ba2qu Maik8quee6 aengu1Aita Hoo8Uu7she ohs8Mo7Aum zoaYa8Meim
          "expireTime": "2022-09-23T22:30:35.570113094Z",                                                                                                                                                                                                                                                           │Our2faelie Iev7gei8oo eequ1ieY3a Ooxeyo9Tai aiWie7fae3 diuVa9eej2 zu0ahGaiph
          "resettable": true                                                                                                                                                                                                                                                                                        │❯ pwgen -s
        }
      ]
    },
    "stateTransitionCount": "6234"
  },
  "pendingWorkflowTask": {
    "state": "Scheduled",
    "scheduledTime": "2022-09-20T23:00:30.537151414Z",
    "originalScheduledTime": "2022-09-20T23:00:30.537150477Z",
    "attempt": 1
  }
}

I did not find any meaningful error logs.

When I check the query result in temporal/ui and refresh the page, I see errors like:

{"level":"error","ts":"2022-11-08T07:27:04.647Z","msg":"unavailable error","operation":"QueryWorkflow","wf-namespace":"express","error":"buffered query cleared, please retry","logging-call-at":"telemetry.go:280","stacktrace":"go.temporal.io/server/common/log.(*zapLogger).Error\n\t/home/builder/temporal/common/log/zap_logger.go:142\ngo.temporal.io/server/common/rpc/interceptor.(*TelemetryInterceptor).handleError\n\t/home/builder/temporal/common/rpc/interceptor/telemetry.go:280\ngo.temporal.io/server/common/rpc/interceptor.(*TelemetryInterceptor).Intercept\n\t/home/builder/temporal/common/rpc/interceptor/telemetry.go:144\ngoogle.golang.org/grpc.chainUnaryInterceptors.func1.1\n\t/go/pkg/mod/google.golang.org/grpc@v1.47.0/server.go:1120\ngo.temporal.io/server/common/rpc/interceptor.(*RetryableInterceptor).Intercept.func1\n\t/home/builder/temporal/common/rpc/interceptor/retry.go:62\ngo.temporal.io/server/common/backoff.ThrottleRetryContext\n\t/home/builder/temporal/common/backoff/retry.go:190\ngo.temporal.io/server/common/rpc/interceptor.(*RetryableInterceptor).Intercept\n\t/home/builder/temporal/common/rpc/interceptor/retry.go:66\ngoogle.golang.org/grpc.chainUnaryInterceptors.func1.1\n\t/go/pkg/mod/google.golang.org/grpc@v1.47.0/server.go:1120\ngo.temporal.io/server/common/metrics.NewServerMetricsTrailerPropagatorInterceptor.func1\n\t/home/builder/temporal/common/metrics/grpc.go:113\ngoogle.golang.org/grpc.chainUnaryInterceptors.func1.1\n\t/go/pkg/mod/google.golang.org/grpc@v1.47.0/server.go:1120\ngo.temporal.io/server/common/metrics.NewServerMetricsContextInjectorInterceptor.func1\n\t/home/builder/temporal/common/metrics/grpc.go:66\ngoogle.golang.org/grpc.chainUnaryInterceptors.func1.1\n\t/go/pkg/mod/google.golang.org/grpc@v1.47.0/server.go:1120\ngo.temporal.io/server/common/rpc.ServiceErrorInterceptor\n\t/home/builder/temporal/common/rpc/grpc.go:132\ngoogle.golang.org/grpc.chainUnaryInterceptors.func1.1\n\t/go/pkg/mod/google.golang.org/grpc@v1.47.0/server.go:1120\ngoogle.golang.org/grpc.chainUnaryInterceptors.func1\n\t/go/pkg/mod/google.golang.org/grpc@v1.47.0/server.go:1122\ngo.temporal.io/server/api/historyservice/v1._HistoryService_QueryWorkflow_Handler\n\t/home/builder/temporal/api/historyservice/v1/service.pb.go:1639\ngoogle.golang.org/grpc.(*Server).processUnaryRPC\n\t/go/pkg/mod/google.golang.org/grpc@v1.47.0/server.go:1283\ngoogle.golang.org/grpc.(*Server).handleStream\n\t/go/pkg/mod/google.golang.org/grpc@v1.47.0/server.go:1620\ngoogle.golang.org/grpc.(*Server).serveStreams.func1.2\n\t/go/pkg/mod/google.golang.org/grpc@v1.47.0/server.go:922"}

What is the output of

tctl taskqueue describe --taskqueue zeus.api

bash-5.1# tctl tq desc --taskqueue zeus.api
         WORKFLOW POLLER IDENTITY         |   LAST ACCESS TIME
  1@zeus-worker-temporal-9544ffc5f-vwmv6@ | 2022-11-09T05:03:23Z
  1@zeus-worker-temporal-9544ffc5f-fdghk@ | 2022-11-09T05:03:23Z

I found a new hanging workflow after upgrading the Temporal cluster to v1.17.6 yesterday.

Did you change the matching engine partition configuration by any chance?

I’m running out of ideas. I guess that DB is not fully MySQL compatible. Or it uses async replication that is not fully consistent.

Did you change the matching engine partition configuration by any chance?

No. I know it can't be changed after the cluster is created.

I’m running out of ideas. I guess that DB is not fully MySQL compatible. Or it uses async replication that is not fully consistent.

Thanks very much. I have changed the MySQL address from the SQL load balancer (read/write splitting) to the direct address of the master node, and will now observe the result.

@maxim
Which consistency level does Temporal require?

  • Eventual consistency
  • Session consistency
  • Global consistency

I’m not an expert on PolarDB consistency levels. Temporal needs to be fully consistent within a single shard.
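
To illustrate the suspicion about async replication behind a read/write-splitting endpoint, here is a toy Go model. It is not Temporal's persistence code and not PolarDB's actual behavior; the cluster type, its methods, and the key name are all hypothetical. It only shows how a read routed to a lagging replica can miss a write that the primary has already acknowledged, which is the kind of inconsistency a direct connection to the master node avoids.

```go
package main

import "fmt"

// write is one acknowledged update waiting to be shipped to the replica.
type write struct {
	key string
	val int
}

// cluster is a toy model: one primary and one asynchronously updated replica.
type cluster struct {
	primary map[string]int
	replica map[string]int
	pending []write // acknowledged by the primary, not yet applied to the replica
}

func newCluster() *cluster {
	return &cluster{primary: map[string]int{}, replica: map[string]int{}}
}

// Write is acknowledged as soon as the primary has the value.
func (c *cluster) Write(key string, val int) {
	c.primary[key] = val
	c.pending = append(c.pending, write{key, val})
}

// Replicate applies all pending writes to the replica (the "async" step).
func (c *cluster) Replicate() {
	for _, w := range c.pending {
		c.replica[w.key] = w.val
	}
	c.pending = nil
}

// ReadPrimary always sees every acknowledged write.
func (c *cluster) ReadPrimary(key string) int { return c.primary[key] }

// ReadReplica may return stale data until Replicate has run.
func (c *cluster) ReadReplica(key string) int { return c.replica[key] }

func main() {
	c := newCluster()
	c.Write("task_id", 5) // hypothetical key, for illustration only

	// A read sent to the replica right after the write misses it.
	fmt.Println(c.ReadPrimary("task_id"), c.ReadReplica("task_id")) // prints "5 0"

	c.Replicate()
	fmt.Println(c.ReadReplica("task_id")) // prints "5"
}
```

In this model, sending every read and write to the primary is what gives a client read-after-write consistency, regardless of how far the replica lags.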