Action batching

Hello

I’m evaluating Temporal for our use cases and I’m stuck on implementing batching.

Essentially, I’ve got a workflow that runs with high concurrency (200+ executions per second) and needs to index data into Elasticsearch. Elasticsearch performs best when indexing data in bulk rather than one record at a time. So I think I need to batch inputs to the indexing activity somehow, so that it receives N records to index instead of just one.

Is this possible with Temporal? How?


I would do the following:

  • An activity adds the ES record to a shared queue. The message in the queue includes the activity's task token.
  • This activity uses asynchronous completion, so it does not complete when its function returns.
  • A separate thread periodically consumes the records buffered in the queue and sends a batch request to ES.
  • Once the batch request succeeds, all the activities in the batch are completed using the manual completion client.
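
The steps above can be sketched roughly as follows. This is a minimal, standard-library-only illustration of the buffering and flushing logic, not Temporal SDK code: `BatchingBuffer`, `bulk_index`, `complete`, and `fail` are hypothetical names. `bulk_index` stands in for the ES bulk request, and `complete` / `fail` stand in for whatever the SDK's manual (asynchronous) completion client provides for finishing an activity by task token.

```python
import queue
import threading
import time


class BatchingBuffer:
    """In-memory queue plus a flusher thread, living in the activity worker process."""

    def __init__(self, bulk_index, complete, fail, max_batch=500, max_wait_s=0.2):
        self._queue = queue.Queue()
        self._bulk_index = bulk_index  # hypothetical: sends one bulk request to ES
        self._complete = complete      # hypothetical: completes an activity by task token
        self._fail = fail              # hypothetical: fails an activity by task token
        self._max_batch = max_batch
        self._max_wait_s = max_wait_s  # must be > 0, otherwise nothing is ever drained
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self):
        self._thread.start()

    def stop(self):
        # Flush whatever is still queued, then stop the thread.
        self._stop.set()
        self._thread.join()

    def submit(self, task_token, record):
        # Called from the activity body, which then returns without completing.
        self._queue.put((task_token, record))

    def _drain(self):
        # Collect up to max_batch items, waiting at most max_wait_s in total.
        batch = []
        deadline = time.monotonic() + self._max_wait_s
        while len(batch) < self._max_batch:
            remaining = deadline - time.monotonic()
            if remaining <= 0:
                break
            try:
                batch.append(self._queue.get(timeout=remaining))
            except queue.Empty:
                break
        return batch

    def _run(self):
        while not (self._stop.is_set() and self._queue.empty()):
            batch = self._drain()
            if not batch:
                continue
            tokens = [token for token, _ in batch]
            try:
                self._bulk_index([record for _, record in batch])
            except Exception as exc:
                # The whole batch failed: report failure for every activity in it.
                for token in tokens:
                    self._fail(token, exc)
            else:
                # The batch succeeded: complete every activity in it.
                for token in tokens:
                    self._complete(token)
```

In a real worker, the activity would call `submit` with its own task token and then tell the SDK not to complete on return. The activity's timeouts would need to cover the extra time a record spends waiting in the buffer, and because the queue is only in memory, anything buffered is lost if the process dies, so those activities would have to time out and be retried.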

Thanks for the reply.

An activity that adds the ES record to a shared queue

What do you mean by a queue in this context? Is it a Temporal task queue or just an external queue (e.g. RabbitMQ)?

A separate thread periodically consumes the records buffered in the queue

Similarly, would this thread run as a part of a Temporal workflow or external to Temporal?

What do you mean by a queue in this context? Is it a Temporal task queue or just an external queue (e.g. RabbitMQ)?

An in-memory queue.

Similarly, would this thread run as a part of a Temporal workflow or external to Temporal?

External to Temporal. It lives in the same process as the in-memory queue.