Activity execution retried several times

Hi,
I have a problem that I can’t understand. First, let me explain my setup: I have a workflow running in a Spring container that executes several activities hosted in workers running in other Spring containers. These activities might also be called by other workflows running in other containers (I hope this is not a problem). Also, I am running all of this on my Windows 10 laptop, with the Temporal server running via Docker Compose, so I have a LOT of things running on my -small- box.

Anyway I just execute one workflow in this scenario.
The problem (I see this consistently, with different activities):


Here I call an activity named “SaveCustomer”. It takes 7+ minutes from schedule to start (which may be down to my “crowded” setup), but what I can’t understand is why it takes 11 attempts (this number changes) to get the activity executed. The error I get:

But then the activity eventually gets executed.
Also, I’m sure the activity has been registered with the Temporal server and that the worker has started in the hosting Spring container.

Another thing I don’t understand is that I’ve set the options for the activity like this:


So the ScheduleToStartTimeout is set to expire after 1 minute, yet the workflow (which has a longer timeout) keeps attempting to execute the activity, regardless of that timeout.

Can someone please help clarify, and suggest if there is a way to mitigate these issues?

Thanks
Andrea

An Activity Worker is essentially a queue consumer that receives Activity Tasks and executes them. When a worker is created the queue name to listen on is passed as a parameter to its creation method.

One important requirement in this case is that a worker must support every activity type that is dispatched to its task queue, because for efficiency no filtering is done on the service side. If a worker receives a task for an activity type that it doesn’t know about, all it can do is reject it by failing with an “Activity type is not registered with the worker” exception.

As I understand it, you create multiple workers, each supporting its own set of activity types, all sharing the same task queue. This leads to a situation where an activity task is picked up by workers that don’t implement this activity and fails. After potentially multiple retries caused by these failures, the task is finally picked up by a worker that knows about the activity and is completed.

The solution is to use a separate task queue for each worker type. This way only activities that the worker supports will be dispatched to it. Think about a worker as a microservice and its task queue as a microservice endpoint address.
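
To make the difference concrete, here is a toy model in plain Java. It is not the Temporal SDK and not how the service actually schedules pollers: ToyWorker, attemptsUntilCompleted, and the worker and activity names other than SaveCustomer are made up for illustration, and it assumes each poller gets the task exactly once, in list order. It only shows why a shared task queue produces failed attempts and a dedicated one does not.

```java
import java.util.List;
import java.util.Set;

public class TaskQueueDemo {

    /** Toy stand-in for a worker: a name plus the activity types it has registered. */
    public static final class ToyWorker {
        final String name;
        final Set<String> activityTypes;

        public ToyWorker(String name, Set<String> activityTypes) {
            this.name = name;
            this.activityTypes = activityTypes;
        }
    }

    /**
     * Returns how many attempts a task needs when the pollers of one task queue
     * pick it up in list order. Each poller that lacks the activity type models an
     * "Activity type is not registered with the worker" failure followed by a retry.
     * Returns -1 if no poller on the queue supports the type.
     */
    public static int attemptsUntilCompleted(List<ToyWorker> pollers, String activityType) {
        int attempts = 0;
        for (ToyWorker worker : pollers) {
            attempts++;
            if (worker.activityTypes.contains(activityType)) {
                return attempts; // this worker can execute the task
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Hypothetical workers, each registering a single activity type.
        ToyWorker customers = new ToyWorker("customer-worker", Set.of("SaveCustomer"));
        ToyWorker orders = new ToyWorker("order-worker", Set.of("SaveOrder"));
        ToyWorker billing = new ToyWorker("billing-worker", Set.of("ChargeCard"));

        // One shared task queue: the task may reach the wrong workers first.
        List<ToyWorker> sharedQueue = List.of(orders, billing, customers);
        System.out.println("shared queue attempts: "
                + attemptsUntilCompleted(sharedQueue, "SaveCustomer"));

        // A dedicated task queue: only the worker that implements SaveCustomer polls it.
        List<ToyWorker> dedicatedQueue = List.of(customers);
        System.out.println("dedicated queue attempts: "
                + attemptsUntilCompleted(dedicatedQueue, "SaveCustomer"));
    }
}
```

In this model the shared queue needs 3 attempts and the dedicated queue needs 1. On a real shared queue the same wrong worker can pick the task up more than once, so the attempt count is not bounded by the number of workers.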

This assumes that activities are scheduled into the appropriate task queues. In Java this is done by specifying the task queue name explicitly through ActivityOptions:

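      // The stub type must match the activity interface passed to newActivityStub.
      <MyActivity> activity =
          Workflow.newActivityStub(
              <MyActivity>.class,
              ActivityOptions.newBuilder()
                  // Route calls to the task queue polled by the worker that implements this activity.
                  .setTaskQueue(<taskQueueForMyActivity>)
                  ...
                  .build());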

Thanks Maxim, this makes sense. In fact, all the workers were registered under the same task queue name :ok_hand:
