Hello
We have been noticing increased schedule-to-start latency for our workflows during specific intervals. For some workflows it rises to as much as 1 hr; normally they get scheduled within 100-150 ms.
Upon further investigation, we found a spike in persistence requests for GetWorkflowExecution, although we did not see any spike in new workflow creations during this specific hour.
The following warn logs also appear during the same interval. Below is one example, but the warnings are not specific to any namespace:
{
  "level": "warn",
  "ts": "2023-12-04T12:38:03.484Z",
  "msg": "Transfer Task Processor: workflow mutable state not found, skip.",
  "shard-id": 1576,
  "address": "10.115.241.3:7234",
  "component": "transfer-queue-processor",
  "cluster-name": "active",
  "wf-namespace-id": "873381e3-3823-40ad-9992-610e2945fb90",
  "wf-id": "BGV_DCP_ACC3791486273713_sjDCOhTjfEknPMxc",
  "wf-run-id": "1f73edc4-5c48-4176-a35c-9a60fde454f1",
  "queue-task-id": 18162649786,
  "queue-task-visibility-timestamp": "2023-10-09T14:51:37.287Z",
  "queue-task-type": "TransferActivityTask",
  "queue-task": {
    "NamespaceID": "873381e3-3823-40ad-9992-610e2945fb90",
    "WorkflowID": "BGV_DCP_ACC3791486273713_sjDCOhTjfEknPMxc",
    "RunID": "1f73edc4-5c48-4176-a35c-9a60fde454f1",
    "VisibilityTimestamp": "2023-10-09T14:51:37.287475389Z",
    "TaskID": 18162649786,
    "TaskQueue": "digilocker-dcp-digio-queue-prod",
    "ScheduledEventID": 61,
    "Version": 0
  },
  "wf-history-event-id": 61,
  "logging-call-at": "nDCTaskUtil.go:102"
}
These workflows completed successfully a long time ago, as the visibility timestamp suggests. We have currently disabled archival for old workflows.
Could you please help? This issue occurs almost every day during the same hours.
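To put a number on how old these tasks are, here is a minimal Python sketch that compares the log time ("ts") with "queue-task-visibility-timestamp" for the entry above. The helper names parse_ts and task_age are made up for illustration and are not part of Temporal; the sketch only assumes the two millisecond-precision fields shown in the log.

```python
import json
from datetime import datetime, timedelta


def parse_ts(value: str) -> datetime:
    # Hypothetical helper: map the trailing "Z" to an explicit UTC offset so
    # datetime.fromisoformat accepts it on older Python versions as well.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def task_age(entry: dict) -> timedelta:
    # Hypothetical helper: time between the task's visibility timestamp
    # and the moment the warning was logged.
    logged_at = parse_ts(entry["ts"])
    visible_at = parse_ts(entry["queue-task-visibility-timestamp"])
    return logged_at - visible_at


# Only the two fields needed here, copied from the log entry above.
entry = json.loads("""
{
  "ts": "2023-12-04T12:38:03.484Z",
  "queue-task-visibility-timestamp": "2023-10-09T14:51:37.287Z"
}
""")

age = task_age(entry)
print(age.days)  # 55
```

For this entry the transfer task was about 55 days old when the warning was logged.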