First of all, congrats on the V1 release of Temporal.
We were using only one worker for approximately 50,000 running workflows. The workflow instances are mostly waiting for signals to continue. We also have a single poller workflow (only one instance) that calls NewContinueAsNewError every 2 minutes.
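For reference, the poller workflow follows the usual continue-as-new pattern, roughly like this sketch (PollerWorkflow and pollActivity are placeholder names, not our actual code):

```go
package main

import (
	"time"

	"go.temporal.io/sdk/workflow"
)

// PollerWorkflow is a rough sketch of a poller that does one round of work,
// sleeps for two minutes, and then continues-as-new to keep its history short.
func PollerWorkflow(ctx workflow.Context) error {
	ao := workflow.ActivityOptions{StartToCloseTimeout: time.Minute}
	ctx = workflow.WithActivityOptions(ctx, ao)

	// One round of polling work (placeholder activity name).
	if err := workflow.ExecuteActivity(ctx, "pollActivity").Get(ctx, nil); err != nil {
		return err
	}

	// Wait two minutes before the next iteration.
	if err := workflow.Sleep(ctx, 2*time.Minute); err != nil {
		return err
	}

	// Restart the workflow with a fresh event history.
	return workflow.NewContinueAsNewError(ctx, PollerWorkflow)
}
```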
Everything seemed to be working as expected, but at some point tasks stopped being processed by the worker.
Looking at the metrics, memory usage of the worker usually climbs all the way to 100% and then drops a few times. But on one occasion it stopped executing tasks and the memory stayed at 100% until a restart. After the restart, task execution was normal again. The memory slowly climbed all the way up once more, so we started another instance of the worker (on 10/02), but the result was the same: memory slowly climbed all the way up again. Below is the graph of memory usage:
I believe workers keep a cache of workflow state so that handling the same workflow again later is faster. Does that mean memory will always climb all the way to the top if the worker is never restarted?
If my assumption above is correct, is there any way to limit the memory available to workers, or to change how the recycling of the workers' cache is handled, just to avoid reaching 100%?
I would also like to ask about the PurgeStickyWorkflowCache function.
Is it recommended to call it when the worker or workflow is terminated in the pod?
worker.SetStickyWorkflowCacheSize(cacheSize) sets the size of the sticky workflow cache. This cache is shared by all workers running in the same process, and the function must be called before any workers are started.
worker.PurgeStickyWorkflowCache() resets this sticky workflow cache across all workers. It can/should only be called when all workers are stopped.
The worker cache is in memory, so calling PurgeStickyWorkflowCache after the worker process has terminated is not needed and would have no effect.
Once your workers are back up the server will treat them as completely new workers.
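To make that concrete, here is a minimal sketch of limiting the shared sticky cache in the Go SDK. The task queue name, the cache size value, and the registration step are placeholders, not a recommendation for specific values:

```go
package main

import (
	"log"

	"go.temporal.io/sdk/client"
	"go.temporal.io/sdk/worker"
)

func main() {
	// Limit the sticky workflow cache shared by all workers in this process.
	// Must be called before any worker is started; 2000 is just an example value.
	worker.SetStickyWorkflowCacheSize(2000)

	c, err := client.Dial(client.Options{})
	if err != nil {
		log.Fatalln("unable to create Temporal client", err)
	}
	defer c.Close()

	w := worker.New(c, "example-task-queue", worker.Options{})
	// Register workflows and activities here, e.g. w.RegisterWorkflow(PollerWorkflow).

	if err := w.Run(worker.InterruptCh()); err != nil {
		log.Fatalln("worker exited", err)
	}

	// If all workers in the process are stopped and you want to drop the cache
	// before starting new ones, worker.PurgeStickyWorkflowCache() could be called here.
}
```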
Hi @maxim,
I have a problem: when I send 500 requests at the same time, it consumes a lot of memory, and even after waiting a long time the memory does not decrease. Please help me, thank you.
We configured the values of max_cached_workflows and max_concurrent_activities in our Python code, but our memory keeps increasing.
I searched all the posts about memory leaks in the forum and found this answer, but I don't know how to configure the worker.SetStickyWorkflowCacheSize method in the Python SDK. Where can I set it?
Thanks,
alchemya
I don’t think anything changed about this. We are not aware of any memory leaks in the SDKs. Setting the cache size is a way to limit a worker’s memory usage.
If you believe an SDK has a memory leak, we would need a reproduction to investigate.