To resolve this, I’m using Promise.allOf(promiseList) to handle the promises. Inside the activity, I am performing some validations.
However, I’m encountering a StatusRuntimeException with the message "NOT_FOUND: Workflow task not found.". This occurs when multiple activities are triggered.
I am not seeing any logs related to the error or to a gRPC request limit. I am load testing my application, and my sample input is just a simple string (‘Hello’), so there shouldn’t be a limit issue. The only thing I am doing is triggering all activities in a loop and waiting for all promises to resolve at the end of the workflow method.
Can you show your workflow code or give pseudocode? My guess is that the workflow is not fully waiting for all activity promises to complete. Either the .get() at the end of Promise.allOf is missing, for example:
Promise.allOf(promiseList).get(); // the trailing .get() is the part that may be missing
or Promise.allOf unblocks because some activity failed, and this error shows up when the other pending activities try to complete but the workflow execution has already completed at that point.
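To make the second possibility concrete, here is a minimal plain-Java sketch. It is not Temporal SDK code: java.util.concurrent.CompletableFuture stands in for Promise, and failFastAllOf is a hypothetical helper that, like the Promise.allOf behavior described above, delivers a failure as soon as one input fails. It only illustrates that the joined result can unblock while other work is still pending.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

public class FailFastJoinSketch {

    // Hypothetical helper: completes when all futures succeed,
    // or fails as soon as any single future fails.
    static CompletableFuture<Void> failFastAllOf(List<CompletableFuture<String>> futures) {
        CompletableFuture<Void> joined = new CompletableFuture<>();
        // Success path: every future has completed normally.
        CompletableFuture.allOf(futures.toArray(new CompletableFuture[0]))
            .thenRun(() -> joined.complete(null));
        // Failure path: the first failure completes the joined future right away.
        for (CompletableFuture<String> f : futures) {
            f.whenComplete((value, error) -> {
                if (error != null) {
                    joined.completeExceptionally(error);
                }
            });
        }
        return joined;
    }

    // Counts the futures that have not completed yet.
    static long countPending(List<CompletableFuture<String>> futures) {
        return futures.stream().filter(f -> !f.isDone()).count();
    }

    public static void main(String[] args) {
        // Three stand-ins for in-flight activities.
        List<CompletableFuture<String>> futures = new ArrayList<>();
        for (int i = 0; i < 3; i++) {
            futures.add(new CompletableFuture<>());
        }
        CompletableFuture<Void> joined = failFastAllOf(futures);

        // One "activity" fails while the other two are still running.
        futures.get(0).completeExceptionally(new IllegalStateException("activity failed"));

        System.out.println("joined done: " + joined.isDone());
        System.out.println("still pending: " + countPending(futures));
    }
}
```

In this model the join is already done after the first failure while two stand-in activities are still pending. If a workflow method returned or threw at that point, the remaining activities would complete against a closed execution.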
// Activity options: 5-minute start-to-close timeout, retries disabled (a single attempt).
ActivityOptions build = ActivityOptions.newBuilder()
    .setStartToCloseTimeout(Duration.ofMinutes(5))
    .setRetryOptions(
        RetryOptions.newBuilder()
            .setMaximumAttempts(1)
            .build())
    .build();
List<Promise<List<PayloadError>>> promiseList = new ArrayList<>();
List<String> results = new ArrayList<>();
// Schedule 300 activities without blocking on any of them.
for (int i = 1; i <= 300; i++) {
    SampleActivity sampleActivity = Workflow.newActivityStub(SampleActivity.class, build);
    promiseList.add(Async.function(sampleActivity::doThird, "Hello"));
}
// Wait for the promises at the end of the workflow method.
Promise.allOf(promiseList).get();
// Collect the results of the promises that did not fail.
for (Promise<List<PayloadError>> promise : promiseList) {
    if (promise.getFailure() == null) {
        results.add(promise.get().toString());
    }
}
System.out.println(results);
The code implementation provided is a sample representation of my source code, and the same error occurs in both implementations.
I am using .get for resolving.
If an activity fails, there should be an “activity failure” event in the history, correct? However, no such event is present.
After the “task not found” error occurs, my workflow code starts replaying. While replaying might not be the issue, I want to understand why the “NOT_FOUND: Workflow task not found” error is happening.
Don’t disable activity retries. If needed, set retries to a small number (2 or 3 would be OK), but with retries disabled there can be rare cases where your activity never runs on your worker.
Promise.allOf(promiseList).get();
Handle ActivityFailure here, so have:
try {
    Promise.allOf(promiseList).get();
} catch (ActivityFailure e) {
    // Just log if needed; failed promises can be handled via promise.getFailure() != null in the loop below.
}
I want to understand why the “NOT_FOUND: Workflow task not found” error is happening.
Do you see ActivityTaskScheduled events in your execution event history? Do you see WorkflowTaskTimedOut event at the point where you schedule your activities?
You can also look at and share your worker metrics, specifically temporal_request_failure, filtered by the operation RespondWorkflowTaskCompleted.
The error itself means that your worker is trying to respond to the service with a workflow task completion (to send commands to the service, such as scheduling the activities at Promise.allOf), but either this workflow task has already timed out, meaning the worker sent the response after the default workflow task timeout (10s), or the workflow execution had already completed when the worker sent it.
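The two conditions can be written down as a small decision function. This is only a toy model in plain Java, not a Temporal API: respondWorkflowTaskCompleted here is a hypothetical stand-in for the service-side check, the 10-second value is the default timeout mentioned above, and the timestamps are arbitrary.

```java
import java.time.Duration;
import java.time.Instant;

public class WorkflowTaskResponseSketch {

    // Default workflow task timeout as stated in the explanation above.
    static final Duration WORKFLOW_TASK_TIMEOUT = Duration.ofSeconds(10);

    // Hypothetical model of the service-side decision; not a real Temporal call.
    static String respondWorkflowTaskCompleted(
            Instant taskStarted, Instant responseReceived, boolean workflowClosed) {
        boolean timedOut =
            Duration.between(taskStarted, responseReceived).compareTo(WORKFLOW_TASK_TIMEOUT) > 0;
        if (timedOut || workflowClosed) {
            // The task the worker is answering no longer exists from the service's view.
            return "NOT_FOUND: Workflow task not found.";
        }
        return "OK";
    }

    public static void main(String[] args) {
        Instant started = Instant.parse("2024-01-01T00:00:00Z"); // arbitrary timestamp

        // Worker answers after 3 seconds on an open workflow: accepted.
        System.out.println(respondWorkflowTaskCompleted(started, started.plusSeconds(3), false));
        // Worker answers after 12 seconds: the task has already timed out.
        System.out.println(respondWorkflowTaskCompleted(started, started.plusSeconds(12), false));
        // Worker answers in time, but the workflow execution has already closed.
        System.out.println(respondWorkflowTaskCompleted(started, started.plusSeconds(3), true));
    }
}
```

Under load, the first condition is the one to check in the event history: a WorkflowTaskTimedOut event close to the timestamp of the error would point to the worker responding too late.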
It would really help to look at your event history JSON and to have the timestamp of when you see this error.
Yes, you’re right. Using Promise.allOf might cause unblocking even if some activity fails, as described in the example above. However, I am encountering the same error even when I retrieve the promise results individually.
To illustrate, I’ve attached a sample Hello World workflow (originally sourced from the Temporal.io repository) that I modified to reflect my scenario. Additionally, I’ve included the history file, which contains the “WorkflowTaskFailed” event for reference.
Can you describe the rare cases where this issue might occur? I’m asking because I have a scenario where I trigger multiple child workflows via promises, and in the thenApply method, I also trigger child workflows. Inside each child workflow, I have three activities.
The implementation works fine when I trigger a small number of asynchronous operations (say 150). When I increase the load to over 300, most of the child workflows still run without any issues, but in 1 or 2 child workflows an activity gets scheduled and the corresponding code never executes, resulting in a timeout (ActivityTimeoutFailure).
Attached: hellworldWorkflow-HistoryJson
When two or more activities are triggered inside an async function (whether it is called within thenApply or as a single function), the same issue occurs, as in the code below:
List<Promise<String>> promiseList = new ArrayList<>();
// Start 300 async functions, each running two activities in sequence.
for (int i = 0; i < 300; i++) {
    promiseList.add(Async.function(() -> {
        firstActivity.composeGreeting("John B");
        firstActivity.respondAge("John B");
        return "completed";
    }));
}
// Resolve each promise individually and print either its result or its failure.
for (Promise<String> p : promiseList) {
    try {
        System.out.println(p.get());
    } catch (Exception e) {
        System.out.println("exception - " + e);
    }
}