I have a simple workflow to test failover scenarios. I see undesired behavior in the following simple case (the activity method is configured with a heartbeat timeout of 3 seconds and sends a heartbeat every second):
- The workflow initiator process starts the workflow asynchronously.
- The first worker starts the workflow execution. I see the workflow task executed in the thread “workflow-method” and the activity method started in a thread like “Activity Executor taskQueue=“MAIN_TASK_QUEUE”, namespace=“default”: 1”.
- I stop the first worker to simulate a crash and see that the second worker tries to execute the activity task but fails with the error "NOT_FOUND: invalid activityID or activity already timed out or invoking workflow is completed". It retries and fails for the same reason, so the execution hangs in these retries.
Please explain what I’m doing wrong.
Maybe a code example will be helpful:

    // Create and start workflow
    WorkflowOptions workflowOptions = WorkflowOptions.newBuilder()
        .setTaskQueue(IMainTaskQueue.MAIN_TASK_QUEUE)
        .setWorkflowId(UUID.randomUUID().toString())
        .build();
    ISimpleWorkflow simpleWorkflow = workflowClient.newWorkflowStub(ISimpleWorkflow.class, workflowOptions);
    WorkflowClient.start(simpleWorkflow::doWork, payload);
    ...
    // Activity options
    private ActivityOptions ACTIVITY_OPTIONS = ActivityOptions.newBuilder()
        .setScheduleToCloseTimeout(Duration.ofDays(1))
        .setHeartbeatTimeout(Duration.ofSeconds(3))
        .build();
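
To make the timing I expect concrete, here is a minimal sketch of the heartbeat arithmetic using only the JDK. It is a simplified model and not how the Temporal server implements timeouts. The class `HeartbeatModel` and the method `isTimedOut` are hypothetical names made up for illustration. Only the 3-second timeout and the 1-second interval come from my setup. The assumption is that an activity attempt counts as timed out once the heartbeat timeout has elapsed since its last recorded heartbeat.

```java
import java.time.Duration;
import java.time.Instant;

public class HeartbeatModel {
    // Values taken from the setup described in this post.
    static final Duration HEARTBEAT_TIMEOUT = Duration.ofSeconds(3);
    static final Duration HEARTBEAT_INTERVAL = Duration.ofSeconds(1);

    // Hypothetical helper: true once at least HEARTBEAT_TIMEOUT has elapsed
    // since the last heartbeat (a simplified model, not server code).
    static boolean isTimedOut(Instant lastHeartbeat, Instant now) {
        return Duration.between(lastHeartbeat, now).compareTo(HEARTBEAT_TIMEOUT) >= 0;
    }

    public static void main(String[] args) {
        Instant start = Instant.parse("2024-01-01T00:00:00Z");
        // Assume the first worker sent five heartbeats and was then stopped.
        Instant lastHeartbeat = start.plus(HEARTBEAT_INTERVAL.multipliedBy(5));
        for (int s = 1; s <= 4; s++) {
            Instant now = lastHeartbeat.plusSeconds(s);
            System.out.println("+" + s + "s after last heartbeat: timedOut="
                    + isTimedOut(lastHeartbeat, now));
        }
    }
}
```

Under this model, a worker that heartbeats every second never reaches the timeout, while a stopped worker's attempt is considered timed out 3 seconds after its last heartbeat, which is when I would expect a retry to be picked up by the second worker.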