Serialization of child workflows using a long-running parent coordinator workflow

Hi Temporal community,

I’m designing a service and evaluating Temporal for our needs. I’d appreciate feedback on whether the “entity workflow” pattern is appropriate for our scale and requirements.

Use Case:
The service manages millions of entities. Each entity receives update messages from external data pipelines, and we process these updates.

Constraint:

Updates to the same entity must be processed sequentially; concurrent processing of the same entity is not allowed. There are multiple consumers of the external data pipeline from which the update messages arrive.
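If we go with one workflow per entity (the idea below), I'd expect each consumer to deliver messages via the SDK's signal-with-start, so concurrent consumers can't race to create duplicate workflows: whichever call arrives first starts the entity's workflow, and the rest just signal it. A rough client-side sketch in Go (the workflow ID scheme, signal name, and task queue are made up):

```go
package consumer

import (
	"context"

	"go.temporal.io/sdk/client"
)

// UpdateMessage is a made-up payload type for entity updates.
type UpdateMessage struct {
	EntityID string
	Payload  string
}

// DeliverUpdate can be called from any consumer concurrently.
// SignalWithStartWorkflow atomically starts the per-entity workflow
// if it is not already running and delivers the signal either way,
// so no coordination between consumers is needed.
func DeliverUpdate(ctx context.Context, c client.Client, msg UpdateMessage) error {
	_, err := c.SignalWithStartWorkflow(
		ctx,
		"entity-"+msg.EntityID, // workflow ID: exactly one per entity
		"entity-update",        // signal name (assumed)
		msg,                    // signal payload
		client.StartWorkflowOptions{
			TaskQueue: "entity-updates", // task queue name (assumed)
		},
		"EntityCoordinatorWorkflow", // coordinator workflow, by registered name
	)
	return err
}
```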

Current Idea:
One coordinator workflow per entity. It would be a long-running workflow that maintains an internal queue, accepts signals from the external message consumers, and processes one message from the queue at a time by running a child workflow started from the coordinator.
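A minimal sketch of that coordinator, assuming the Go SDK; the message type, signal name, and child workflow are placeholders, and continue-as-new is deliberately omitted here (see the reply below):

```go
package coordinator

import "go.temporal.io/sdk/workflow"

// UpdateMessage is a placeholder payload type for entity updates.
type UpdateMessage struct {
	EntityID string
	Payload  string
}

const updateSignalName = "entity-update" // assumed signal name

// EntityCoordinatorWorkflow is one long-running workflow per entity.
// The signal channel acts as the internal queue: messages are taken
// off it one at a time, and each is handled by a child workflow that
// must complete before the next message is received.
func EntityCoordinatorWorkflow(ctx workflow.Context) error {
	ch := workflow.GetSignalChannel(ctx, updateSignalName)
	for {
		var msg UpdateMessage
		ch.Receive(ctx, &msg) // blocks until the next update arrives

		cctx := workflow.WithChildOptions(ctx, workflow.ChildWorkflowOptions{})
		// Waiting on Get before the next Receive is what serializes
		// processing for this entity.
		if err := workflow.ExecuteChildWorkflow(cctx, ProcessUpdateWorkflow, msg).Get(ctx, nil); err != nil {
			workflow.GetLogger(ctx).Error("processing failed", "entity", msg.EntityID, "error", err)
		}
		// Continue-as-new is omitted here; see the reply below for a
		// sketch of bounding history size.
	}
}

// ProcessUpdateWorkflow is a placeholder child workflow; the real one
// would run activities for the update.
func ProcessUpdateWorkflow(ctx workflow.Context, msg UpdateMessage) error {
	return nil
}
```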

Scale & Throughput

Total entities: 5M (current), potentially 100M (future)
Average message rate per entity: ~1 message per 4 hours (≈350 messages/sec aggregate for 5M entities)
Processing time per message: less than 1 minute

Questions

  1. Scale: Are 5M concurrent open workflows reasonable for a production Temporal deployment? What about 100M?
  2. Continue-as-new: How frequently should a workflow perform continue-as-new?
Reply from @maxim:

  1. Temporal doesn’t have any problem with a large number of parallel workflows; billions of workflows are possible. 350 messages/sec is at the low end (at least for Temporal Cloud).

  2. There is no strict rule about this. We usually recommend calling continue-as-new at least once a week to support code changes. We are also working on a feature (trampolining) that would support proactive continue-as-new for version upgrades when new deployments are rolled out.
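To make that concrete for the per-entity coordinator above, here is a sketch of one common shape for this (not official guidance): cap the number of messages handled per run, then continue-as-new, draining any still-buffered signals into the next run's arguments. The cap of 500 is arbitrary; newer Go SDK versions also expose the current history length via workflow.GetInfo, which could serve as the trigger instead.

```go
package coordinator

import "go.temporal.io/sdk/workflow"

// UpdateMessage and the signal name match the coordinator sketch above.
type UpdateMessage struct {
	EntityID string
	Payload  string
}

const updateSignalName = "entity-update"

// maxMessagesPerRun is an arbitrary cap; tune it so each run's event
// history stays well below Temporal's history limits.
const maxMessagesPerRun = 500

// EntityCoordinatorWorkflow now takes a backlog so messages drained at
// the end of one run are processed first in the next run.
func EntityCoordinatorWorkflow(ctx workflow.Context, backlog []UpdateMessage) error {
	ch := workflow.GetSignalChannel(ctx, updateSignalName)
	processed := 0

	handle := func(msg UpdateMessage) {
		// ... start the child workflow and wait on it, as in the
		// sketch above ...
		processed++
	}

	// Process messages carried over from the previous run first.
	for _, msg := range backlog {
		handle(msg)
	}
	for processed < maxMessagesPerRun {
		var msg UpdateMessage
		ch.Receive(ctx, &msg)
		handle(msg)
	}

	// Drain signals that arrived while processing so nothing is lost
	// across the continue-as-new boundary.
	var pending []UpdateMessage
	for {
		var msg UpdateMessage
		if !ch.ReceiveAsync(&msg) {
			break
		}
		pending = append(pending, msg)
	}
	return workflow.NewContinueAsNewError(ctx, EntityCoordinatorWorkflow, pending)
}
```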

Thanks, @maxim.