Hi, we are running a Kotlin/Spring Boot web app that uses Temporal to run all of our logic in workflows. We've noticed an apparent thread leak: thousands of threads, all with the exact same name, “grpc-connection-manager-thread-0”. This thread appears to be used by the Temporal gRPC client to communicate with the Temporal server. The leak is steady, and over the course of a single day as many as 6000 of these threads can be created. When we take a thread dump, they are all in this state:
"grpc-connection-manager-thread-0" #894 [891] daemon prio=5 os_prio=0 cpu=0.25ms elapsed=15.06s tid=0x00007f402c01e720 nid=891 waiting on condition [0x00007f3c2b2fe000]
java.lang.Thread.State: TIMED_WAITING (parking)
at jdk.internal.misc.Unsafe.park(java.base@21.0.2/Native Method)
- parking to wait for <0x00000004cb0e88c0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.parkNanos(java.base@21.0.2/LockSupport.java:269)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(java.base@21.0.2/AbstractQueuedSynchronizer.java:1758)
at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(java.base@21.0.2/ScheduledThreadPoolExecutor.java:1182)
at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(java.base@21.0.2/ScheduledThreadPoolExecutor.java:899)
at java.util.concurrent.ThreadPoolExecutor.getTask(java.base@21.0.2/ThreadPoolExecutor.java:1070)
at java.util.concurrent.ThreadPoolExecutor.runWorker(java.base@21.0.2/ThreadPoolExecutor.java:1130)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base@21.0.2/ThreadPoolExecutor.java:642)
at java.lang.Thread.runWith(java.base@21.0.2/Thread.java:1596)
at java.lang.Thread.run(java.base@21.0.2/Thread.java:1583)
There are no obvious errors in our logs. Based on some other threads here on the forum, I tried logging io.temporal.serviceclient at DEBUG level, but that did not reveal any obvious problems either.
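In case it helps with troubleshooting, here is a minimal sketch of how the number of these threads could be tracked from inside the JVM, so that growth can be correlated with application activity. The class name ThreadNameCounter and its methods are hypothetical names used only for illustration; they are not part of our app or of the Temporal SDK, and the code relies only on standard JDK calls.

```java
import java.util.Map;
import java.util.TreeMap;

public class ThreadNameCounter {

    // Counts live threads in this JVM, grouped by exact thread name.
    static Map<String, Integer> countByName() {
        Map<String, Integer> counts = new TreeMap<>();
        // Thread.getAllStackTraces() returns a snapshot keyed by every live thread.
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            counts.merge(t.getName(), 1, Integer::sum);
        }
        return counts;
    }

    // Returns how many live threads carry exactly the given name (0 if none).
    static int countWithName(String name) {
        return countByName().getOrDefault(name, 0);
    }

    public static void main(String[] args) {
        // Thread name taken from the thread dump above.
        String name = "grpc-connection-manager-thread-0";
        System.out.println(name + ": " + countWithName(name));
    }
}
```

Calling countWithName periodically (for example from a scheduled task that logs the result) should show whether the count rises steadily or jumps at particular moments.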
Has this issue been seen before? Any ideas on what could be the cause or how to troubleshoot further?
We are running version 1.22.3 of the SDK.
Thanks.