A few workflows continuously throwing java.lang.RuntimeException: Failure processing workflow task

A few of my workflows are constantly throwing this error. I don't think there is any breaking change in the workflow. Any idea what could be going wrong?

I use Temporal Java SDK 1.0.9 and Temporal Server 1.10.x.

```
2021-09-07 14:16:10.672 ERROR 1 --- [g-cloud": 32992] i.t.internal.worker.PollerOptions        : uncaught exception

java.lang.RuntimeException: Failure processing workflow task. WorkflowId=blah-workflow, RunId=0d072af8-64ad-45a6-808a-02011de2cace, Attempt=88885
        at io.temporal.internal.worker.WorkflowWorker$TaskHandlerImpl.wrapFailure(WorkflowWorker.java:349) ~[temporal-sdk-1.0.9.jar!/:na]
        at io.temporal.internal.worker.WorkflowWorker$TaskHandlerImpl.wrapFailure(WorkflowWorker.java:279) ~[temporal-sdk-1.0.9.jar!/:na]
        at io.temporal.internal.worker.PollTaskExecutor.lambda$process$0(PollTaskExecutor.java:79) ~[temporal-sdk-1.0.9.jar!/:na]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[na:1.8.0_292]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[na:1.8.0_292]
        at java.lang.Thread.run(Thread.java:748) ~[na:1.8.0_292]
Caused by: io.temporal.internal.replay.InternalWorkflowTaskException: Failure handling event 121 of 'EVENT_TYPE_ACTIVITY_TASK_SCHEDULED' type. IsReplaying=true, PreviousStartedEventId=119, workflowTaskStartedEventId=247, Currently Processing StartedEventId=119
        at io.temporal.internal.statemachines.WorkflowStateMachines.handleEvent(WorkflowStateMachines.java:193) ~[temporal-sdk-1.0.9.jar!/:na]
        at io.temporal.internal.replay.ReplayWorkflowRunTaskHandler.handleEvent(ReplayWorkflowRunTaskHandler.java:140) ~[temporal-sdk-1.0.9.jar!/:na]
        at io.temporal.internal.replay.ReplayWorkflowRunTaskHandler.handleWorkflowTaskImpl(ReplayWorkflowRunTaskHandler.java:180) ~[temporal-sdk-1.0.9.jar!/:na]
        at io.temporal.internal.replay.ReplayWorkflowRunTaskHandler.handleWorkflowTask(ReplayWorkflowRunTaskHandler.java:150) ~[temporal-sdk-1.0.9.jar!/:na]
        at io.temporal.internal.replay.ReplayWorkflowTaskHandler.handleWorkflowTaskWithEmbeddedQuery(ReplayWorkflowTaskHandler.java:201) ~[temporal-sdk-1.0.9.jar!/:na]
        at io.temporal.internal.replay.ReplayWorkflowTaskHandler.handleWorkflowTask(ReplayWorkflowTaskHandler.java:114) ~[temporal-sdk-1.0.9.jar!/:na]
        at io.temporal.internal.worker.WorkflowWorker$TaskHandlerImpl.handle(WorkflowWorker.java:319) ~[temporal-sdk-1.0.9.jar!/:na]
        at io.temporal.internal.worker.WorkflowWorker$TaskHandlerImpl.handle(WorkflowWorker.java:279) ~[temporal-sdk-1.0.9.jar!/:na]
        at io.temporal.internal.worker.PollTaskExecutor.lambda$process$0(PollTaskExecutor.java:73) ~[temporal-sdk-1.0.9.jar!/:na]
        ... 3 common frames omitted
Caused by: java.lang.IllegalStateException: No command scheduled that corresponds to event_id: 121
event_time {
  seconds: 1629438240
  nanos: 324290394
}
event_type: EVENT_TYPE_ACTIVITY_TASK_SCHEDULED
task_id: 15740750
activity_task_scheduled_event_attributes {
  activity_id: "043a619e-d9b3-3bff-9eed-0c8130501328"
  activity_type {
    name: "StartSubscriptionCronIfNotExist"
  }
  task_queue {
    name: "INHOUSEQUEUE"
    kind: TASK_QUEUE_KIND_NORMAL
  }
  header {
  }
  schedule_to_close_timeout {
    seconds: 21600
  }
  schedule_to_start_timeout {
    seconds: 21600
  }
  start_to_close_timeout {
    seconds: 21600
  }
  heartbeat_timeout {
  }
  workflow_task_completed_event_id: 120
  retry_policy {
    initial_interval {
      seconds: 60
    }
    backoff_coefficient: 2.0
    maximum_interval {
      seconds: 6000
    }
    maximum_attempts: 5
    non_retryable_error_types: "c.s.s.p.exceptions.ManualInterventionRequiredException"
    non_retryable_error_types: "c.s.s.p.exceptions.BadRequestException"
    non_retryable_error_types: "io.temporal.failure.ActivityFailure"
    non_retryable_error_types: "c.s.s.p.exceptions.RemoteActivityFailedException"
    non_retryable_error_types: "c.s.s.p.exceptions.EntityAlreadyExists"
  }
}

        at io.temporal.internal.statemachines.WorkflowStateMachines.handleCommandEvent(WorkflowStateMachines.java:244) ~[temporal-sdk-1.0.9.jar!/:na]
        at io.temporal.internal.statemachines.WorkflowStateMachines.handleEventImpl(WorkflowStateMachines.java:199) ~[temporal-sdk-1.0.9.jar!/:na]
        at io.temporal.internal.statemachines.WorkflowStateMachines.handleEvent(WorkflowStateMachines.java:178) ~[temporal-sdk-1.0.9.jar!/:na]
        ... 11 common frames omitted
```

Can you double-check that your workflow code is deterministic (see constraints here)?

Is it possible for you to share your entire workflow execution history?
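To illustrate why determinism matters here: during replay (note `IsReplaying=true` in the trace), the worker re-runs the workflow code and expects it to produce the same commands that the recorded history contains. The sketch below is a toy model in plain Java, not the SDK's actual implementation. The class `ReplaySketch`, its `replay` method, the string encoding of commands, and the activity name `ChargeCustomer` are all hypothetical; only `StartSubscriptionCronIfNotExist` comes from the log above. Positions in the toy list are not real event IDs.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

/** Toy model of history replay. Not the Temporal SDK's implementation. */
final class ReplaySketch {

    /**
     * Pairs each recorded command event with the next command produced by
     * re-running the workflow code. Returns null when everything matches,
     * otherwise a description of the first mismatch.
     */
    static String replay(List<String> recordedEvents, List<String> commandsFromCode) {
        Deque<String> pending = new ArrayDeque<>(commandsFromCode);
        int position = 0;
        for (String event : recordedEvents) {
            position++;
            String command = pending.poll(); // null when the code produced fewer commands
            if (command == null) {
                return "No command scheduled that corresponds to event "
                        + position + " (" + event + ")";
            }
            if (!command.equals(event)) {
                return "Command " + command + " does not match event "
                        + position + " (" + event + ")";
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // History written by the code version that started the workflow.
        // "ChargeCustomer" is a hypothetical activity name.
        List<String> history = Arrays.asList(
                "ScheduleActivity:ChargeCustomer",
                "ScheduleActivity:StartSubscriptionCronIfNotExist");

        // Same code replayed: every recorded event finds its command.
        System.out.println(replay(history, history));

        // Code edited after the workflow started (second activity call removed
        // or made conditional): the recorded event has no matching command.
        List<String> edited = Arrays.asList("ScheduleActivity:ChargeCustomer");
        System.out.println(replay(history, edited));
    }
}
```

If the workflow code did change while executions were still open, the Java SDK's `Workflow.getVersion` is the documented way to keep the old code path available for histories recorded before the change.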

Hi @tihomir, can you please advise on this? I am facing the same issue in my implementation: my child workflow completes successfully, but “ReplayWorkflowTaskHandler” is still being called and the workflow is not getting terminated.

Is it the same or a similar issue as in the original thread? Can you see multiple child workflow events, like start/complete, for the same child ID? Are you using Cassandra for primary persistence, and if so, which version?