High CPU usage in worker nodes

We are using Temporal as a microservice orchestrator and call close to 20 microservices. Most of them return within a second, so they are defined as local activities. Three APIs take 10-15 seconds; they are executed asynchronously using ActivityCompletionClient in normal activities.

During a performance test, we found that CPU on the activity/workflow worker nodes is the bottleneck: it approaches 100%. We are using the following configuration:

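import io.temporal.worker.WorkerFactoryOptions;
import io.temporal.worker.WorkerOptions;

// Per-worker concurrency limits: at most 150 activities, 150 workflow tasks,
// and 150 local activities executing in parallel.
WorkerOptions workerOptions =
    WorkerOptions.newBuilder()
        .setMaxConcurrentActivityExecutionSize(150)
        .setMaxConcurrentWorkflowTaskExecutionSize(150)
        .setMaxConcurrentLocalActivityExecutionSize(150)
        .build();

// Factory-wide settings: MDCActivityInterceptor is our own interceptor;
// workflow thread count and workflow cache size are both capped at 450.
WorkerFactoryOptions workerFactoryOptions =
    WorkerFactoryOptions.newBuilder()
        .setActivityInterceptors(new MDCActivityInterceptor())
        .setMaxWorkflowThreadCount(450)
        .setWorkflowCacheSize(450)
        .build();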

We have not found any obvious problem in the thread dump. While we work on optimizing our code, we would like to understand whether any Temporal config changes could help.

Ideally, you would use a CPU profiler to find such bottlenecks. In Java, garbage collection (GC) can also cause high CPU usage.
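If attaching a full profiler is not practical right away, a rough first check is possible with the JDK's own management beans. The sketch below is only an illustration, not a substitute for a real profiler. The class name CpuSampler and its methods are made up for this example and are not part of Temporal. It measures how much time went to GC during a short window and which threads consumed the most CPU in that window, assuming the JVM supports per-thread CPU timing.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration; not a Temporal API.
public class CpuSampler {

    /** Total milliseconds spent in GC so far, summed over all collectors. */
    public static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 when the collector does not report it
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    /** CPU nanoseconds per live thread id; empty if the JVM cannot measure it. */
    public static Map<Long, Long> threadCpuNanos() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        Map<Long, Long> cpu = new HashMap<>();
        if (!threads.isThreadCpuTimeSupported() || !threads.isThreadCpuTimeEnabled()) {
            return cpu;
        }
        for (long id : threads.getAllThreadIds()) {
            long ns = threads.getThreadCpuTime(id); // -1 if the thread is no longer alive
            if (ns >= 0) {
                cpu.put(id, ns);
            }
        }
        return cpu;
    }

    /** Samples for windowMillis; returns a GC line followed by up to topN busiest threads. */
    public static List<String> sample(long windowMillis, int topN) {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        long gcBefore = totalGcMillis();
        Map<Long, Long> before = threadCpuNanos();
        try {
            Thread.sleep(windowMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // keep the interrupt flag, report what we have
        }
        Map<Long, Long> after = threadCpuNanos();
        long gcDelta = totalGcMillis() - gcBefore;

        // CPU consumed by each thread during the window, busiest first.
        List<Map.Entry<Long, Long>> deltas = new ArrayList<>();
        for (Map.Entry<Long, Long> e : after.entrySet()) {
            long delta = e.getValue() - before.getOrDefault(e.getKey(), 0L);
            deltas.add(Map.entry(e.getKey(), delta));
        }
        deltas.sort(Map.Entry.<Long, Long>comparingByValue().reversed());

        List<String> report = new ArrayList<>();
        report.add("GC time in window: " + gcDelta + " ms of " + windowMillis + " ms");
        for (Map.Entry<Long, Long> e : deltas.subList(0, Math.min(topN, deltas.size()))) {
            ThreadInfo info = threads.getThreadInfo(e.getKey());
            String name = (info == null) ? "<terminated>" : info.getThreadName();
            report.add(name + ": " + (e.getValue() / 1_000_000) + " ms CPU");
        }
        return report;
    }

    public static void main(String[] args) {
        sample(5_000, 10).forEach(System.out::println);
    }
}
```

Calling CpuSampler.sample(5_000, 10) from inside the worker process under load (for example from a debug endpoint) gives a first hint: a large GC share points at allocation pressure, while a few dominant thread names point at the code to profile next.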

You can also rate-limit the number of workflow tasks the worker is allowed to process per second.