Sudden surge of costly operation, how to handle

madhu · August 29, 2023, 4:47pm

I am running a subscription management system on Temporal, where each subscription is a separate workflow. It has been in production for quite some time and is serving us very well.

Over time, we have come to need a way to do bulk cleanup of unused or underused subscriptions.

In my code, I have methods like:
SubscriptionPlan {
    @signal
    upgradePlan();

    @signal
    downGradePlan();

    @signal
    pauseBilling();

    @signal
    resumeBilling();

    @query
    getCurrentPlan();

    @signal
    extendPlan(Date endDate);

    @signal
    end(String reason);
}

Now, when end is invoked, we have to do a massive cleanup of our infrastructure, which is presently handled by a set of activities.

This is all working fine…

Since each subscription is a separate workflow, when someone sends the end signal to, say, about 1000 workflows (subscriptions/plans), a massive cleanup is triggered across applications, and it overwhelms quite a few upstream applications.

Since the retry policies are similar, the retries for all those workflows also happen at almost the same time, overwhelming the upstream systems again and again and causing downtime.
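To make the retry problem concrete, here is a minimal sketch in plain Java (no Temporal SDK involved). It computes an exponential backoff schedule from an initial interval, a backoff coefficient, and a maximum interval. The class name `RetrySchedule` and the values used (1 second, 2.0, 100 seconds) are assumptions for illustration, not the actual production policy.

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;

public class RetrySchedule {

    // Delay before the n-th retry (1-based): initial * coefficient^(n-1), capped at max.
    public static Duration delayBeforeRetry(int retry, Duration initial,
                                            double coefficient, Duration max) {
        double millis = initial.toMillis() * Math.pow(coefficient, retry - 1);
        return Duration.ofMillis((long) Math.min(millis, max.toMillis()));
    }

    // The full list of delays for the first `retries` retries.
    public static List<Duration> schedule(int retries, Duration initial,
                                          double coefficient, Duration max) {
        List<Duration> delays = new ArrayList<>();
        for (int i = 1; i <= retries; i++) {
            delays.add(delayBeforeRetry(i, initial, coefficient, max));
        }
        return delays;
    }

    public static void main(String[] args) {
        Duration initial = Duration.ofSeconds(1);   // assumed value
        Duration max = Duration.ofSeconds(100);     // assumed value

        // Two different workflows with the same policy get the same schedule.
        List<Duration> workflowA = schedule(5, initial, 2.0, max);
        List<Duration> workflowB = schedule(5, initial, 2.0, max);

        System.out.println(workflowA);
        System.out.println("identical schedules: " + workflowA.equals(workflowB));
    }
}
```

With these assumed values the delays are 1, 2, 4, 8 and 16 seconds for every workflow, so workflows that fail together also retry together.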

What is the best way to enhance my workflows so that no more than 20 or 30 cleanups happen at a time, without disturbing my downstream and upstream systems?

That is, I would still want to accept bulk end/cleanup signals, but the actual cleanup should happen in a phased manner without overwhelming upstream systems.
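The behaviour I am after can be sketched in plain Java (again without the Temporal SDK): all requests are accepted immediately, but a semaphore lets only a fixed number of cleanups run at once. The class name `BoundedCleanup`, the thread pool size, and the `Thread.sleep` standing in for the REST calls are all assumptions for illustration.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class BoundedCleanup {

    // Runs `total` simulated cleanups with at most `limit` in flight.
    // Returns {completed count, peak concurrency observed}.
    public static int[] run(int total, int limit) {
        Semaphore permits = new Semaphore(limit);
        AtomicInteger inFlight = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        AtomicInteger completed = new AtomicInteger();
        // Deliberately more threads than permits, so the semaphore is the real limit.
        ExecutorService pool = Executors.newFixedThreadPool(64);

        for (int i = 0; i < total; i++) {
            pool.submit(() -> {
                try {
                    permits.acquire();              // wait for a free slot
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                try {
                    int now = inFlight.incrementAndGet();
                    peak.accumulateAndGet(now, Math::max);
                    Thread.sleep(1);                // stand-in for the upstream REST calls
                    completed.incrementAndGet();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    inFlight.decrementAndGet();
                    permits.release();              // free the slot for the next cleanup
                }
            });
        }

        pool.shutdown();
        try {
            pool.awaitTermination(1, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return new int[] {completed.get(), peak.get()};
    }

    public static void main(String[] args) {
        int[] result = run(1000, 20);
        System.out.println("completed=" + result[0] + ", peak concurrency=" + result[1]);
    }
}
```

All 1000 requests are submitted up front, but the observed peak never exceeds the limit of 20. The open question is how to get the equivalent effect across separate workflow executions.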

Should I move the cleanup logic to a separate workflow with something like:

CleanupWorkflow {

    @signal
    enqueue(UUID subscriptionId);

    @workflowMethod
    doCleanup() {
        // do stuff
        …
        …
        …
        plan.acknowledgeCleanup();
    }
}

and let the subscription workflow wait for a cleanupAcknowledgement signal?
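As a rough model of that queue-and-acknowledge idea, here is a single-threaded sketch in plain Java. It is not Temporal code: `CleanupQueueSketch`, `drainOneBatch`, and the `acknowledge` callback (standing in for the signal sent back to the subscription workflow) are hypothetical names used only for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.UUID;
import java.util.function.Consumer;

public class CleanupQueueSketch {
    private final Queue<UUID> pending = new ArrayDeque<>();
    private final int batchSize;
    private final Consumer<UUID> acknowledge; // stand-in for signalling the subscription

    public CleanupQueueSketch(int batchSize, Consumer<UUID> acknowledge) {
        this.batchSize = batchSize;
        this.acknowledge = acknowledge;
    }

    // Accepting a request is cheap: it only records the id.
    public void enqueue(UUID subscriptionId) {
        pending.add(subscriptionId);
    }

    // Cleans up at most batchSize subscriptions and returns how many were processed.
    public int drainOneBatch() {
        int done = 0;
        while (done < batchSize && !pending.isEmpty()) {
            UUID id = pending.poll();
            // The real cleanup (REST calls to upstream systems) would happen here.
            acknowledge.accept(id);
            done++;
        }
        return done;
    }

    public static void main(String[] args) {
        List<UUID> acknowledged = new ArrayList<>();
        CleanupQueueSketch queue = new CleanupQueueSketch(20, acknowledged::add);

        // A bulk "end" arrives for 1000 subscriptions at once.
        for (int i = 0; i < 1000; i++) {
            queue.enqueue(UUID.randomUUID());
        }

        int batches = 0;
        while (queue.drainOneBatch() > 0) {
            batches++;
        }
        System.out.println(batches + " batches, " + acknowledged.size() + " acknowledged");
    }
}
```

With a batch size of 20, the 1000 enqueued subscriptions are cleaned up in 50 batches, and each one is acknowledged exactly once.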

Also, I would like to clarify that the actual cleanup work in the upstream systems is done by various other teams, and they do not use Temporal. So all of those are actually REST calls, and these REST calls are what overwhelm the upstream systems.
