Concurrent Batch Email/SMS Sending

We’re using Temporal for email/SMS campaign sending, and we need to do this as quickly and efficiently as possible. Each campaign can have up to 5 million recipients. Here’s our current setup.

  1. Start a parent workflow that gets batches of ~5000 recipients from the database and then starts a child workflow for each batch of 5k, like this:
  // The module-level setup and the workflow wrapper are shown here for completeness;
  // the wrapper name and the timeout value are illustrative, not our exact code.
  import { executeChild, proxyActivities } from '@temporalio/workflow';
  import type * as activities from './activities';

  const { getRecipientBatchActivity } = proxyActivities<typeof activities>({
    startToCloseTimeout: '1 minute',
  });

  export async function sendCampaignWorkflow(
    { campaignId, teamId, maxRecipientsPerWorkflow }:
    { campaignId: string; teamId: string; maxRecipientsPerWorkflow: number },
  ) {
    let lastId: string | undefined;
    let i = 0;
    const childWorkflowPromises: Promise<void>[] = [];
    while (true) {
      // Keyset-paginate recipient ids, using the last id of the previous page as the cursor.
      const campaignRecipientIdsBatch = await getRecipientBatchActivity(
        { campaignId, pageSize: maxRecipientsPerWorkflow, lastId }
      );

      if (campaignRecipientIdsBatch.length === 0) {
        break;
      }
      lastId = campaignRecipientIdsBatch[campaignRecipientIdsBatch.length - 1];

      // One child workflow per batch of recipient ids.
      childWorkflowPromises.push(
        executeChild(sendRecipientBatch, {
          args: [{ teamId, recipientIds: campaignRecipientIdsBatch }],
          workflowId: `send-campaign-${campaignId}-batch-${i}`,
        }),
      );

      i++;
    }
    await Promise.allSettled(childWorkflowPromises);
  }
  2. The child workflow takes that batch and starts an activity in parallel for each of those recipients to do the actual sending:
// processRecipientActivity is a proxied activity, set up with proxyActivities just like
// getRecipientBatchActivity above.
export async function sendRecipientBatch({
  teamId,
  recipientIds,
}: {
  teamId: string;
  recipientIds: string[];
}) {
  // Fan out one activity per recipient; allSettled so one failed send doesn't reject the batch.
  await Promise.allSettled(
    recipientIds.map((recipientId) =>
      processRecipientActivity({ teamId, recipientId }),
    ),
  );
}
  3. The processRecipientActivity carefully sends an email/SMS to that recipient. If a send fails for any reason, we wait 10 minutes before retrying, and the first thing the retry does is check whether a successful send event has already been recorded for this recipient; if it has, it stops there.
  4. Since our downstream providers rate-limit how many sends we can make per second, we set maxActivitiesPerSecond so that, even though we may have hundreds of thousands of pending activities at a time, we only attempt to process them at that rate. (A configuration sketch for this and the retry interval from step 3 follows this list.)
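
For reference, here’s roughly how those two knobs look in the TypeScript SDK. Every timeout, rate, and name below is an illustrative assumption, not our real config:
// workflows.ts
import { proxyActivities } from '@temporalio/workflow';
import type * as activities from './activities';

const { processRecipientActivity } = proxyActivities<typeof activities>({
  startToCloseTimeout: '5 minutes',
  retry: {
    initialInterval: '10 minutes', // "wait 10 minutes for a retry"
    backoffCoefficient: 1,         // keep every retry interval at 10 minutes
  },
});

// worker.ts
import { Worker } from '@temporalio/worker';
import * as activities from './activities';

async function run() {
  const worker = await Worker.create({
    workflowsPath: require.resolve('./workflows'),
    activities,
    taskQueue: 'campaign-sends', // hypothetical task queue name
    // Caps this worker; maxTaskQueueActivitiesPerSecond caps the whole task queue instead.
    maxActivitiesPerSecond: 100, // assumed provider rate limit
  });
  await worker.run();
}
run().catch((err) => {
  console.error(err);
  process.exit(1);
});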

Does this sound like a good pattern? Any suggestions on how we could do things better, or issues that might come up? We need to send as fast as possible, but we also need to be sure that we don’t double-send.

@maxim would love your feedback, I’ve enjoyed reading your input on other people’s problems here. Thanks!

Make sure that 5k recipient ids don’t exceed the workflow input limit of 2 MB.

It looks like sending an email is a simple API call. Consider running all the sending from a single activity that iterates over the list of ids. The activity should heartbeat and include the index of the last sent record in the heartbeat details.
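
A minimal sketch of that single-activity approach (names here are illustrative), assuming the activity receives the full list of ids; sendToRecipient stands in for the real send logic, and the activity options would also need a heartbeatTimeout so the server notices a stalled worker:
import { Context } from '@temporalio/activity';

// Placeholder for the real per-recipient send logic.
declare function sendToRecipient(teamId: string, recipientId: string): Promise<void>;

export async function sendRecipientBatchActivity({
  teamId,
  recipientIds,
}: {
  teamId: string;
  recipientIds: string[];
}): Promise<void> {
  const ctx = Context.current();
  // On a retry, heartbeatDetails holds whatever the previous attempt last heartbeated.
  const lastSentIndex = (ctx.info.heartbeatDetails as number | undefined) ?? -1;

  for (let i = lastSentIndex + 1; i < recipientIds.length; i++) {
    await sendToRecipient(teamId, recipientIds[i]);
    // Record the index of the last sent record so a retry resumes right after it.
    ctx.heartbeat(i);
  }
}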

There are other ways to implement batch jobs. See the Java samples that demonstrate this.

Okay that’s great insight, thank you!

Sending is a little more involved than just calling an API: for each recipient we need to run some validation, check whether they’re opted out/suppressed, make sure we haven’t already sent, create a personalized email for them, and then send. So we currently have these steps implemented as individual activities, but we’ll definitely look into batching them into a single activity.
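
Roughly, those per-recipient steps could still run as one unit inside the loop of a batched activity like the one sketched above; all of the helper names below are hypothetical stand-ins for our existing logic:
// Hypothetical stand-ins for the existing per-recipient steps.
declare function validateRecipient(teamId: string, recipientId: string): Promise<boolean>;
declare function isOptedOutOrSuppressed(teamId: string, recipientId: string): Promise<boolean>;
declare function hasSuccessfulSendEvent(teamId: string, recipientId: string): Promise<boolean>;
declare function renderPersonalizedEmail(teamId: string, recipientId: string): Promise<string>;
declare function sendEmail(recipientId: string, body: string): Promise<void>;

// One recipient's pipeline; callable from the batched activity's loop in place of sendToRecipient.
export async function processOneRecipient(teamId: string, recipientId: string): Promise<void> {
  if (!(await validateRecipient(teamId, recipientId))) return;
  if (await isOptedOutOrSuppressed(teamId, recipientId)) return;
  // The double-send guard: skip if a successful send event is already recorded.
  if (await hasSuccessfulSendEvent(teamId, recipientId)) return;

  const body = await renderPersonalizedEmail(teamId, recipientId);
  await sendEmail(recipientId, body);
}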

One other question: we implemented the pattern explained above, but changed it so we only queue 1700 activities at a time per child workflow, since the limit is 2000. First question: is there a way to increase this to 5000, or is 2000 a hard limit? The next problem we’re seeing in the child workflows is intermittent errors from the workflow, both EVENT_TYPE_WORKFLOW_TASK_TIMED_OUT (TIMEOUT_TYPE_START_TO_CLOSE) and WORKFLOW_TASK_FAILED_CAUSE_FORCE_CLOSE_COMMAND. These are retryable, so we end up finishing the workflow just fine, but I’m concerned that this won’t always be the case, and errors are always concerning.
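
For context, the “only 1700 queued at a time” change looks roughly like this inside the child workflow (a sketch, not our exact code):
import { proxyActivities } from '@temporalio/workflow';
import type * as activities from './activities';

const { processRecipientActivity } = proxyActivities<typeof activities>({
  startToCloseTimeout: '5 minutes', // illustrative
});

const MAX_IN_FLIGHT = 1700; // stay under the 2000 limit we're hitting

export async function sendRecipientBatch({
  teamId,
  recipientIds,
}: {
  teamId: string;
  recipientIds: string[];
}) {
  // Work through the batch in slices so the number of pending activities stays capped.
  for (let start = 0; start < recipientIds.length; start += MAX_IN_FLIGHT) {
    const slice = recipientIds.slice(start, start + MAX_IN_FLIGHT);
    await Promise.allSettled(
      slice.map((recipientId) => processRecipientActivity({ teamId, recipientId })),
    );
  }
}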

I haven’t set any sort of timeout on my workflows; do the child workflows default to one? Also, what would be causing the ForceCloseCommand on the workflows? Would that be resource constraints, and how would I fix that issue? We’re using Temporal Cloud, for reference.

Thanks for your help!