Parent workflow unable to complete after the child workflow completed

fouad · August 16, 2023, 3:20am

Hi!

I’m facing an issue where the parent workflow never completes
after the child workflow is completed.

The child workflow completed successfully; we can confirm this from the artifacts it produced and from the logs.

The parent, however, is still Running and lists the child under pendingChildren:

{
  "executionConfig": {
    "taskQueue": {
      "name": "parent-cycle",
      "kind": "Normal"
    },
    "workflowExecutionTimeout": "0s",
    "workflowRunTimeout": "0s",
    "defaultWorkflowTaskTimeout": "10s"
  },
  "workflowExecutionInfo": {
    "execution": {
      "workflowId": "parent-cycle-1",
      "runId": "db80c230-3bdf-11ee-90d7-00155d1836bf"
    },
    "type": {
      "name": "ParentWorkflowV1"
    },
    "startTime": "2023-08-07T00:17:29.270762753Z",
    "status": "Running",
    "historyLength": "10",
    "executionTime": "2023-08-07T00:17:29.270762753Z",
    "memo": {

    },
    "searchAttributes": {
      "indexedFields": {
        "BuildIds": "[\"unversioned\"]"
      }
    },
    "autoResetPoints": {

    },
    "stateTransitionCount": "6",
    "historySizeBytes": "5277",
    "mostRecentWorkerVersionStamp": {

    }
  },
  "pendingChildren": [
    {
      "workflowId": "child-cycle-1",
      "runId": "fcf59eea-3bdf-11ee-8c3b-00155d1836bf",
      "workflowTypeName": "ChildWorkflowV1",
      "initiatedId": "6",
      "parentClosePolicy": "Abandon"
    }
  ]
}
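
For anyone who wants to detect this state across many executions, here is a minimal sketch, assuming the describe output is available as JSON in the shape shown above. The helper name `stuck_children` is hypothetical; the field names (`workflowExecutionInfo.status`, `pendingChildren`, `workflowId`, `runId`, `initiatedId`) are taken from that output. It only reports what the parent still considers pending and does not verify whether the child actually exists.

```python
import json


def stuck_children(describe_output: dict) -> list[dict]:
    """Return the pending children of a parent that is still reported as Running.

    Hypothetical helper: it only inspects the describe output and makes
    no call to the Temporal server.
    """
    info = describe_output.get("workflowExecutionInfo", {})
    if info.get("status") != "Running":
        return []
    return [
        {
            "workflowId": child.get("workflowId"),
            "runId": child.get("runId"),
            "initiatedId": child.get("initiatedId"),
        }
        for child in describe_output.get("pendingChildren", [])
    ]


# Trimmed copy of the describe output shown above.
sample = json.loads("""
{
  "workflowExecutionInfo": {
    "execution": {"workflowId": "parent-cycle-1"},
    "status": "Running"
  },
  "pendingChildren": [
    {
      "workflowId": "child-cycle-1",
      "runId": "fcf59eea-3bdf-11ee-8c3b-00155d1836bf",
      "workflowTypeName": "ChildWorkflowV1",
      "initiatedId": "6",
      "parentClosePolicy": "Abandon"
    }
  ]
}
""")

for child in stuck_children(sample):
    print(child["workflowId"], child["runId"], child["initiatedId"])
```

Each workflow ID it returns can then be described on its own to see whether the child execution is still present.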

The child workflow’s history and execution no longer seem to exist:

$ TEMPORAL_ADDRESS=localhost:7777 temporal workflow describe --namespace customer1 --workflow-id child-cycle-1
Error: workflow describe failed: sql: no rows in result set
('export TEMPORAL_CLI_SHOW_STACKS=1' to see stack traces)

The namespace configuration.

$ TEMPORAL_ADDRESS=localhost:7777 temporal operator namespace describe customer1
  NamespaceInfo.Name                    customer1
  NamespaceInfo.Id                      ac179d56-3be0-11ee-a34e-00155d1836bf
  NamespaceInfo.Description
  NamespaceInfo.OwnerEmail
  NamespaceInfo.State                   Registered
  Config.WorkflowExecutionRetentionTtl  24h0m0s   
  ReplicationConfig.ActiveClusterName   active
  ReplicationConfig.Clusters            [&ClusterReplicationConfig{ClusterName:active,}]
  Config.HistoryArchivalState           Disabled
  Config.VisibilityArchivalState        Disabled
  IsGlobalNamespace                     false
  FailoverVersion                       0
  FailoverHistory                       []

I’m not sure whether this information helps, but at some point the cluster was under-provisioned and the queues (task_queue, timer_queue) became quite large, around 20 million rows.
My assumption is that a race condition occurred between the history cleanup of the child workflow and the completion notification sent to the parent workflow.

Setup:
version: 1.21.4
database: postgres 13 (aurora)
os: EKS

What can cause such behavior?

Thanks

tihomir · September 19, 2023, 9:44pm

Hi, sorry for the late response. Are you still running into this issue? My guess is that, after completing, the child workflow was removed by the namespace retention policy, which is set to 24h per the info you shared.
I’m not yet sure why that would prevent your parent workflow from completing. Can you share the event history of this parent execution?
