We are doing some performance tests with Temporal. Our first goal is to see what throughput we can get when creating workflows and how we should scale it.
We are using the Java SDK with MySQL, and for now we are just trying to understand which elements will limit our throughput.
We’ve found 2 things we are trying to understand:
1.- Once we have started many workflows and they are running (right now with just dummy activities), a thread dump shows a good number of threads waiting for a lock.
This is the stack from a dump:
"workflow-method" #294 prio=5 os_prio=31 tid=0x00007ffbae935000 nid=0x15607 waiting on condition [0x000070000fee9000]
   java.lang.Thread.State: WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for <0x0000000717c102c0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
    at io.temporal.internal.sync.WorkflowThreadContext.yield(WorkflowThreadContext.java:83)
    at io.temporal.internal.sync.WorkflowThreadImpl.yield(WorkflowThreadImpl.java:406)
    at io.temporal.internal.sync.WorkflowThread.await(WorkflowThread.java:45)
    at io.temporal.internal.sync.CompletablePromiseImpl.getImpl(CompletablePromiseImpl.java:84)
    at io.temporal.internal.sync.CompletablePromiseImpl.get(CompletablePromiseImpl.java:74)
    at io.temporal.internal.sync.ActivityStubBase.execute(ActivityStubBase.java:44)
    at io.temporal.internal.sync.ActivityInvocationHandler.lambda$getActivityFunc$0(ActivityInvocationHandler.java:59)
    at io.temporal.internal.sync.ActivityInvocationHandler$$Lambda$327/1133104136.apply(Unknown Source)
    at io.temporal.internal.sync.ActivityInvocationHandlerBase.invoke(ActivityInvocationHandlerBase.java:65)
Any idea what the reason for these locks is and how we can scale this? We have a high number of threads parked on this condition.
2.- When starting instances using
WorkflowClient.start(workflow.&run as Functions.Func1<...>, arguments), we see that a start takes up to 1 second when we increase the number of ‘starts’ running in parallel.
With small concurrency (1-3) it is a bit faster (~300ms).
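A measurement like this can be sketched with plain JDK classes, as below. This is only an illustrative harness, not our actual test code: the class name StartLatencyProbe, the measure method, and the startCall parameter are all hypothetical, and startCall is a stand-in for the real WorkflowClient.start(...) call (here replaced by a short sleep so the sketch runs without a Temporal server).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

public class StartLatencyProbe {

    // Runs `calls` invocations of startCall on a pool of `parallelism` threads
    // and returns each invocation's latency in milliseconds, sorted ascending.
    static long[] measure(int parallelism, int calls, Runnable startCall) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(parallelism);
        try {
            List<Future<Long>> futures = new ArrayList<>();
            for (int i = 0; i < calls; i++) {
                futures.add(pool.submit(() -> {
                    long begin = System.nanoTime();
                    startCall.run(); // hypothetical stand-in for WorkflowClient.start(...)
                    return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - begin);
                }));
            }
            long[] latencies = new long[calls];
            for (int i = 0; i < calls; i++) {
                latencies[i] = futures.get(i).get(); // blocks until that call has finished
            }
            Arrays.sort(latencies);
            return latencies;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        // Placeholder workload: sleep 5 ms instead of starting a real workflow.
        Runnable fakeStart = () -> {
            try {
                Thread.sleep(5);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        for (int parallelism : new int[] {1, 3, 16}) {
            long[] l = measure(parallelism, 48, fakeStart);
            System.out.printf("parallelism=%d p50=%dms max=%dms%n",
                    parallelism, l[l.length / 2], l[l.length - 1]);
        }
    }
}
```

The timer wraps only the call itself, so time spent queued in the thread pool is not counted; comparing the median and maximum at each parallelism level shows whether individual starts slow down as concurrency grows.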
Any guidelines on which areas or settings we need to pay attention to would be highly appreciated.