Ordering Guarantee of messages / Sequencing

We have use cases that require ordered processing of messages.
Does Temporal provide any ordering/sequencing of messages?
Is there anything in Temporal similar to a Kafka topic partition for sequencing?

As per the Temporal documentation, this is not provided. Can someone confirm this, or is there a workaround?

→ Task Queues do not have any ordering guarantees
→ Task Queues support server-side throttling, which enables you to limit the Task dispatching rate to the pool of Worker Processes while still supporting Task dispatching at higher rates when spikes happen.

@PARIKSHIT it is hard to give a recommendation without understanding your requirements.

  1. Do you need global ordering, or ordering per some business-level ID, like a customer?
  2. What is the maximum rate of events per such ID/partition?
  3. Why do you need the ordering in the first place?

Hi Maxim,

We need to process the messages in our queue in a defined order, since all messages update the same document. If all the messages are processed at once, we run into optimistic locking exceptions on that document. Hence we were using a Kafka topic partition to guarantee ordering of messages.
But now we want to move to Temporal.
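
As a rough illustration of the conflict described above, here is a minimal plain-Java sketch of optimistic locking. `VersionedDoc` and `OptimisticStore` are hypothetical names invented for this example, not part of any real persistence library; the point is only that two writers who read the same version cannot both succeed.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical versioned document; not a real persistence API.
final class VersionedDoc {
    final long version;
    final String body;

    VersionedDoc(long version, String body) {
        this.version = version;
        this.body = body;
    }
}

// Hypothetical store that accepts a write only if the document
// has not changed since the caller read it.
final class OptimisticStore {
    private final AtomicReference<VersionedDoc> current =
            new AtomicReference<>(new VersionedDoc(0, ""));

    VersionedDoc read() {
        return current.get();
    }

    // Returns false when `seen` is stale, i.e. another writer got in first.
    boolean write(VersionedDoc seen, String newBody) {
        return current.compareAndSet(seen, new VersionedDoc(seen.version + 1, newBody));
    }
}

final class OptimisticLockDemo {
    public static void main(String[] args) {
        OptimisticStore store = new OptimisticStore();
        VersionedDoc seenByA = store.read();
        VersionedDoc seenByB = store.read(); // both writers read version 0

        System.out.println(store.write(seenByA, "update A")); // true
        System.out.println(store.write(seenByB, "update B")); // false: stale version
    }
}
```

Processing the updates one at a time, in order, avoids the stale write because each writer reads the document only after the previous write has completed.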

Rate of events: x > 0 and x < 2000 (These events/updates can be triggered concurrently, we need to control the processing on the consuming side)

I am currently trying a POC with Temporal, where I have defined multiple activity methods (@ActivityMethod) that get called from a @WorkflowMethod when a worker picks up a message.
Here I see that the activities are always executed in sequence. Even if I bring the worker down, bring it back up, and the workflow is replayed, it resumes from the activity where it left off.
Is this the way to order the execution of events, or is there a better way in Temporal to order the events?

Is 2000 messages per second for a single document or across all the documents?

Hi @maxim

Yes, in a worst-case scenario we can have up to 100 messages coming to the backend at once.

To the same document?

Yes, this can happen to the same document.

The usual pattern is to have a workflow per business entity (document in your case) and send signals to it. Then, the workflow can process them in order. But this design doesn’t work for high update rates. If you need 100 requests/second to the same workflow, then it wouldn’t work. If you can rate limit them to something like 20 then it could work.
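
To make the pattern concrete, here is a plain-Java model of it, not Temporal SDK code. Each document ID gets its own single-threaded lane, which plays the role of the per-document workflow, and `submit` plays the role of sending a signal. `PerDocumentSequencer` and its method names are assumptions made up for this sketch; it shows only the ordering property and does not model the per-workflow rate limit mentioned above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Plain-Java model: one sequential lane per document ID.
// Updates to the same document are applied in submission order;
// different documents are processed independently of each other.
final class PerDocumentSequencer {
    private final Map<String, ExecutorService> lanes = new ConcurrentHashMap<>();
    private final Map<String, List<String>> applied = new ConcurrentHashMap<>();

    // Plays the role of sending a signal to the workflow for documentId.
    void submit(String documentId, String update) {
        lanes.computeIfAbsent(documentId, id -> Executors.newSingleThreadExecutor())
             .execute(() -> applied
                     .computeIfAbsent(documentId, id -> new ArrayList<>())
                     .add(update));
    }

    // Stops the lane for documentId, waits for queued updates to finish,
    // and returns them in the order they were applied.
    // After this call the lane no longer accepts updates.
    List<String> drainAndGet(String documentId) {
        ExecutorService lane = lanes.get(documentId);
        if (lane != null) {
            lane.shutdown();
            try {
                lane.awaitTermination(5, TimeUnit.SECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while draining", e);
            }
        }
        List<String> result = applied.get(documentId);
        return result != null ? result : new ArrayList<>();
    }
}

final class PerDocumentSequencerDemo {
    public static void main(String[] args) {
        PerDocumentSequencer sequencer = new PerDocumentSequencer();
        for (int i = 0; i < 5; i++) {
            sequencer.submit("doc-1", "update-" + i);
        }
        // Updates come back in the order they were submitted.
        System.out.println(sequencer.drainAndGet("doc-1"));
    }
}
```

In the real design the lane would be a workflow whose ID is derived from the document ID, so the ordering comes from the workflow handling its signals one after another.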

Understood.

Additionally, if I were to create one workflow per update (each carrying the JSON payload to update a single document), can I have ordering at the workflow level?

No, we don’t support ordering across workflows yet.

Hello @maxim! Does Temporal currently support ordering across workflows? I have the same requirement here.

What is your use case?

I have a process that sends a lot of events (signals). All signals are sent to the respective tenant's long-running workflow, which accumulates them in an array inside the workflow for ordering.

Two limit checks were added to each tenant workflow:

  • Per signal
  • Per workflow size

When a limit is reached, a continue-as-new is performed.
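
The buffering and limit check described above can be sketched in plain Java as follows. This is a simplified model, not Temporal SDK code: `TenantRun`, `maxSignalsPerRun`, and `runUntilLimit` are hypothetical names, only the signal-count limit is modeled (not the workflow size limit), and handing the leftover signals to a new `TenantRun` stands in for continue-as-new.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Plain-Java model of one run of a tenant workflow.
final class TenantRun {
    private final int maxSignalsPerRun; // hypothetical per-run limit
    private final Deque<String> pending = new ArrayDeque<>();
    private final List<String> processed = new ArrayList<>();

    TenantRun(int maxSignalsPerRun, List<String> carriedOver) {
        this.maxSignalsPerRun = maxSignalsPerRun;
        pending.addAll(carriedOver); // signals inherited from the previous run
    }

    // Signals are only buffered here, in arrival order.
    // The limit is not checked at this point.
    void signal(String event) {
        pending.addLast(event);
    }

    // Processes buffered signals in order until the per-run limit is reached.
    // Returns the unprocessed signals to hand to the next run, or an empty
    // list if everything was drained.
    List<String> runUntilLimit() {
        while (!pending.isEmpty() && processed.size() < maxSignalsPerRun) {
            processed.add(pending.pollFirst());
        }
        return new ArrayList<>(pending);
    }

    List<String> processed() {
        return processed;
    }
}

final class TenantRunDemo {
    public static void main(String[] args) {
        TenantRun first = new TenantRun(3, new ArrayList<>());
        for (int i = 1; i <= 5; i++) {
            first.signal("e" + i);
        }
        List<String> carry = first.runUntilLimit();
        System.out.println(first.processed()); // [e1, e2, e3]

        TenantRun second = new TenantRun(3, carry); // the "continue-as-new" step
        second.runUntilLimit();
        System.out.println(second.processed()); // [e4, e5]
    }
}
```

Note that in this model the limit is enforced only inside `runUntilLimit`, so signals keep accumulating in the buffer for as long as nothing is running that loop.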

The problem is that when workers are restarted for some reason (for example, reaching the pod CPU/memory limit), the new workers don’t pick up the workflow immediately. I see that some time passes before the new workers pick up the workflow and logs appear again (I don’t know the cause of this).

So between the restart and the moment the workers resume processing signals, the workflow has already reached the signal or workflow size limit, because continue-as-new is performed only when a worker executes the workflow and checks the size of the signal array or the workflow size.

I no longer want to use a long-running workflow; instead I want to open one workflow per signal. But for this, I need to ensure that the workflows are executed in order.

What is the average and peak rate of signals per workflow ID?

From analyzing the JSON history file, this is the current rate, but we project a 500x increase (currently there is just 1 tenant on the cluster, and the projection is a maximum of 500):

Average rate: 0.91 signals/second
Peak rate: 1.00 signals/second

I’m confused by your answer. I asked about the maximum rate per ID (tenant). You said that the rate is going to increase because you are going to increase the number of tenants. How does it relate to the rate per tenant?

Let me explain:

  • When I need to scale, the number of “senders” is increased. These senders are PBXs.

  • Each sender has data about all tenants at the same time. For example:
    In pod 1, the number of received calls of an agent is 10.
    In pod 2, the number of calls of this same agent is 5.
    In pod 3, the number of calls of this same agent is 1.

    This happens because the load balancer can send the call to pod 1, pod 2, or pod 3, making the counters different in each pod.

So if I add one more PBX, I’ll have this same agent in pod 4, increasing the number of signals that will be sent to this tenant’s long-running workflow.

So, what is the average and peak rate per single tenant workflow you need when the system is fully scaled?

Something around this:

Average rate : 455 signals/second
Peak rate : 500 signals/second

A single Temporal workflow cannot support such a high rate of signals. Currently, Temporal is not the right technology if you need to guarantee the complete ordering of messages at such a rate.

Temporal scales out with the number of open workflows. For example, it can process hundreds of thousands of events per second if they are distributed over many workflow instances, and each instance processes a maximum of a few signals per second.
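
The arithmetic behind this can be sketched in a few lines of Java. The per-workflow budget of 5 signals/second used below is an assumption chosen only to stand in for "a few signals per second"; it is not a documented Temporal limit. `SignalRateMath` and its methods are hypothetical helper names.

```java
final class SignalRateMath {
    // Projected rate when the current rate is multiplied by a scale factor.
    static double projectedRate(double currentPerSecond, int scaleFactor) {
        return currentPerSecond * scaleFactor;
    }

    // Minimum number of workflow instances needed so that each one
    // stays at or below perWorkflowLimit signals per second.
    static long workflowsNeeded(double totalPerSecond, double perWorkflowLimit) {
        return (long) Math.ceil(totalPerSecond / perWorkflowLimit);
    }

    public static void main(String[] args) {
        // Peak rate from the thread: 1.00 signals/second today, scaled 500x.
        double peak = projectedRate(1.00, 500);
        System.out.println(peak); // 500.0

        // Assumed budget of 5 signals/second per workflow (illustrative only).
        System.out.println(workflowsNeeded(peak, 5.0)); // 100
    }
}
```

Spreading the load this way keeps each workflow within its budget, but ordering then holds only within each workflow, since ordering across workflows is not supported.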