“Execution not found in mutable state” - asynchronous activity completion, spring boot test

Hi! I'm seeing strange behavior in a Temporal Spring Boot test. I use the Java SDK (Temporal version 1.23.1) together with temporal-spring-boot-starter-alpha.
The test class has the following annotations:

@SpringBootTest(webEnvironment = SpringBootTest.WebEnvironment.RANDOM_PORT)
@ActiveProfiles("test")
@DirtiesContext
@DisplayNameGeneration(DisplayNameGenerator.ReplaceUnderscores.class)

In application-test.yml, the Temporal config is:

  temporal:
    namespace: default
    workers-auto-discovery:
      packages: my.temporal.package
    start-workers: true
    test-server:
      enabled: true

In the test itself I inject the workflow client:

    @Autowired
    private WorkflowClient workflowClient;

and then create the workflow stub and start the workflow as follows:

var workflow = workflowClient.newWorkflowStub(
        MyWorkflow.class,
        WorkflowOptions.newBuilder()
                .setTaskQueue(taskQueue)
                // The display name of the test; no other test has the same name.
                .setWorkflowId(testInfo.getDisplayName())
                .build());

WorkflowClient.execute(workflow::run, myWorkflowParams)
        .orTimeout(20, TimeUnit.SECONDS)
        .join();
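
For anyone reading along, here is a small sketch of what the orTimeout(...).join() combination does when the future never completes. It uses only java.util.concurrent and no Temporal classes; OrTimeoutSketch and await are hypothetical names made up for this illustration. The point is that join() throws a CompletionException whose cause is a TimeoutException once the timeout elapses.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper for illustration only; not part of the Temporal SDK.
final class OrTimeoutSketch {

    /** Waits for the future and reports whether it completed or timed out. */
    static String await(CompletableFuture<String> future, long timeoutMillis) {
        try {
            return "completed: " + future.orTimeout(timeoutMillis, TimeUnit.MILLISECONDS).join();
        } catch (CompletionException e) {
            // orTimeout completes the future exceptionally with a TimeoutException,
            // and join() wraps that exception in a CompletionException.
            if (e.getCause() instanceof TimeoutException) {
                return "timed out";
            }
            throw e;
        }
    }

    public static void main(String[] args) {
        // A future that is already done returns its value immediately.
        System.out.println(await(CompletableFuture.completedFuture("done"), 100));
        // A future that nobody completes ends in a timeout.
        System.out.println(await(new CompletableFuture<>(), 100));
    }
}
```

In the test above, this would mean that a workflow whose final activity is never completed fails the test after 20 seconds instead of hanging.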

The last step of the workflow is an activity that completes asynchronously (several such activities run in parallel). The final activity first gets its task token, saves it, and tells the SDK not to complete the activity on return:

var context = Activity.getExecutionContext();
var taskToken = context.getTaskToken();
// save task token
context.doNotCompleteOnReturn(); 

Then I have a Kafka consumer that receives an event, retrieves the saved task token, and completes the activity:

private final ActivityCompletionClient completionClient;
// some code
completionClient.complete(taskToken, null);
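
To make the "save task token" and "retrieves a task token" steps concrete, here is a minimal sketch of an in-memory store. This is an assumption for illustration, not the actual storage used in this service: TaskTokenStore, save, take, and the correlation key are hypothetical names. The only Temporal-related fact it relies on is that the task token is a byte[].

```java
import java.util.Arrays;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-memory store for illustration; not part of the Temporal SDK.
final class TaskTokenStore {
    private final Map<String, byte[]> tokens = new ConcurrentHashMap<>();

    /** Saves a defensive copy of the token under a caller-chosen correlation key. */
    void save(String correlationKey, byte[] taskToken) {
        tokens.put(correlationKey, Arrays.copyOf(taskToken, taskToken.length));
    }

    /** Removes and returns the token, so each token is handed out at most once. */
    Optional<byte[]> take(String correlationKey) {
        return Optional.ofNullable(tokens.remove(correlationKey));
    }
}
```

The activity would call save(...) before doNotCompleteOnReturn(), and the consumer would call take(...) and pass the result to completionClient.complete(...). Note that a store like this lives inside one application context, so a consumer in a different context would find no token for the event.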

This works as expected and completes successfully when I run the service locally and when I run this test alone. But when I run all the tests in my service, it fails with:

io.grpc.StatusRuntimeException: NOT_FOUND: Execution "ExecutionId{namespace='default', execution=workflow_id: "my workflow id"
run_id: "7e859ffc-4a9d-43f5-a7e9-5d7d2574e34b"
}" not found in mutable state. Known executions: [], service=io.temporal.internal.testservice.TestWorkflowService@6e6a4869
	at io.grpc.stub.ClientCalls.toStatusRuntimeException(ClientCalls.java:271)
	at io.grpc.stub.ClientCalls.getUnchecked(ClientCalls.java:252)
	at io.grpc.stub.ClientCalls.blockingUnaryCall(ClientCalls.java:165)
	at io.temporal.api.workflowservice.v1.WorkflowServiceGrpc$WorkflowServiceBlockingStub.respondActivityTaskCompleted(WorkflowServiceGrpc.java:4000)
	at io.temporal.internal.client.external.ManualActivityCompletionClientImpl.lambda$complete$0(ManualActivityCompletionClientImpl.java:116)
//...
Wrapped by: io.temporal.client.ActivityNotExistsException: ActivityId=null
	at io.temporal.internal.client.external.ManualActivityCompletionClientImpl.processException(ManualActivityCompletionClientImpl.java:287)
	at io.temporal.internal.client.external.ManualActivityCompletionClientImpl.complete(ManualActivityCompletionClientImpl.java:119)

The start-to-close timeout for this activity is fairly long, and the activity takes less time than that.
I have another workflow test in the same class, and when I run both tests in this class they do not conflict with each other. I also tried disabling the other Temporal-related tests located in other files, but it didn’t help.
I also tried using exactly the same annotations and bean injections as in samples-java/springboot/src/test/java/io/temporal/samples/springboot/HelloSampleTest.java (main branch of temporalio/samples-java on GitHub), and that had no effect either.

Any ideas and recommendations are much appreciated.

It looks like the workflow with that ID has already been completed.

I’m having the same issue: when I run all the tests together, I get the error NOT_FOUND: Execution not found in mutable state. I have about 18 tests in one test class. When I run that class on its own, all 18 tests succeed, but when I run my entire suite, the same tests fail. It is hard to tell what could be causing the conflict. I will try to reproduce it in a sample test suite and share it over the weekend.

I am wondering whether this issue is with the test server itself and not the Temporal server.

Hello! Since the issue happened only when other tests were run, I traced the root cause to another test that used a separate Spring profile through the @ActiveProfiles annotation. I believe that created a separate Spring context with its own test server, and at random some of the activity completion client's complete calls went to that context, which didn’t have the workflow running, hence the NOT_FOUND error.

After I merged the profiles into one, the error was gone.

I think this issue could also be addressed with separate task queues.
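
To illustrate the explanation above, here is a toy model of two isolated test servers, one per Spring context. It is only a sketch of my understanding and does not use or imitate the real Temporal test service: ToyTestServer, startWorkflow, and completeActivity are hypothetical names. A completion sent to the server that never saw the workflow fails, while the same call against the right server succeeds.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Toy stand-in for one in-memory test server; NOT the Temporal test service.
final class ToyTestServer {
    private final String name;
    private final Set<String> runningWorkflowIds = ConcurrentHashMap.newKeySet();

    ToyTestServer(String name) {
        this.name = name;
    }

    void startWorkflow(String workflowId) {
        runningWorkflowIds.add(workflowId);
    }

    /** Completes the pending activity, or fails if this server never saw the workflow. */
    void completeActivity(String workflowId) {
        if (!runningWorkflowIds.remove(workflowId)) {
            throw new IllegalStateException(
                    "NOT_FOUND: execution \"" + workflowId + "\" unknown to " + name
                            + ". Known executions: " + runningWorkflowIds);
        }
    }
}

final class TwoContextsDemo {
    public static void main(String[] args) {
        ToyTestServer contextA = new ToyTestServer("server-of-context-A");
        ToyTestServer contextB = new ToyTestServer("server-of-context-B");

        // The workflow is started through context A only.
        contextA.startWorkflow("my workflow id");

        try {
            // A consumer living in context B picks up the event and completes there.
            contextB.completeActivity("my workflow id");
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }

        // The same completion against context A succeeds.
        contextA.completeActivity("my workflow id");
    }
}
```

The empty "Known executions: []" in the original stack trace is what made me suspect that the call reached a server that had no workflows at all.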