I’m designing a production-ready backend system in Python that will handle millions of tasks per day. Users submit tasks via a FastAPI endpoint. Each task should be persisted in durable storage (I’m thinking Postgres) so that its status can be queried (accepted, processing, done, failed). Tasks should then be executed asynchronously, with the workflow updating the task status in storage once completed.
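For context, the endpoint shape I have in mind looks roughly like this (a minimal sketch; the in-memory dict is a stand-in for the Postgres table, and the asynchronous execution part is elided):

```python
import uuid

from fastapi import FastAPI, HTTPException

app = FastAPI()

# Stand-in for the Postgres task table; replace with real persistence.
TASKS: dict[str, dict] = {}

@app.post("/tasks", status_code=202)
async def submit_task(payload: dict) -> dict:
    task_id = str(uuid.uuid4())
    # Persist as "accepted"; a worker later moves the task through
    # processing -> done/failed asynchronously.
    TASKS[task_id] = {"status": "accepted", "payload": payload}
    return {"task_id": task_id, "status": "accepted"}

@app.get("/tasks/{task_id}")
async def task_status(task_id: str) -> dict:
    task = TASKS.get(task_id)
    if task is None:
        raise HTTPException(status_code=404, detail="unknown task")
    return {"task_id": task_id, "status": task["status"]}
```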
Task workflow examples:
Perform data enrichment via a third-party API
Update a search engine index
Send processed data to an analytics service
Requirements:
Tasks should execute concurrently across different clients, but sequentially for a single client (strict order per client).
Each task should have a maximum number of attempts, after which it either fails permanently or is rescheduled.
Tasks should support timeouts and cancellation.
Some tasks are recurring, running according to a schedule.
The system must be production-ready, scalable, and able to handle millions of tasks per day with observability.
Questions:
What is the best pattern in Temporal for ensuring per-client sequential execution while allowing full concurrency between different clients?
How can I persist task metadata/status in Postgres while letting the workflow update it asynchronously?
How should I handle max attempts, retries, and backoff for high-throughput asynchronous tasks?
What is the recommended approach for timeouts and cancellations of long-running activities in such workflows?
For recurring tasks, is using Temporal Schedules the most robust approach at this scale?
Any architectural guidance for ingestion (Kafka, queues, etc.), scaling workers, and ensuring reliability?
Thanks in advance for any guidance, best practices, or example patterns.
What is the best pattern in Temporal for ensuring per-client sequential execution while allowing full concurrency between different clients?
What is the maximum task enqueue and execution rate per client in tasks per second? What is the maximum possible number of outstanding tasks per client?
How can I persist task metadata/status in Postgres while letting the workflow update it asynchronously?
With Temporal, there is no need for an external DB, as all workflow information is already durable.
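For example, a task can be modeled as a workflow that tracks its own status and exposes it through a query handler; every state transition is persisted by Temporal itself. A minimal sketch with the Python SDK (the workflow and activity names are illustrative):

```python
from datetime import timedelta

from temporalio import workflow
from temporalio.common import RetryPolicy

@workflow.defn
class TaskWorkflow:
    def __init__(self) -> None:
        self._status = "accepted"

    @workflow.run
    async def run(self, payload: dict) -> str:
        self._status = "processing"
        # Temporal durably records this state and retries the activity
        # per the policy; no external DB writes are needed for status.
        await workflow.execute_activity(
            "enrich_data",  # activity registered on the worker (assumed name)
            payload,
            start_to_close_timeout=timedelta(minutes=5),
            retry_policy=RetryPolicy(maximum_attempts=3),
        )
        self._status = "done"
        return self._status

    @workflow.query
    def status(self) -> str:
        # Readable by external clients at any point in the run.
        return self._status
```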
For recurring tasks, is using Temporal Schedules the most robust approach at this scale?
Millions of tasks per day is a pretty low scale.
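For reference, creating a schedule with the Python SDK looks roughly like this (the ids, payload, and interval are illustrative):

```python
from datetime import timedelta

from temporalio.client import (
    Client,
    Schedule,
    ScheduleActionStartWorkflow,
    ScheduleIntervalSpec,
    ScheduleSpec,
)

async def create_recurring_task(client: Client) -> None:
    # Temporal stores the schedule durably and starts a workflow every hour.
    await client.create_schedule(
        "reindex-hourly",  # illustrative schedule id
        Schedule(
            action=ScheduleActionStartWorkflow(
                "TaskWorkflow",        # assumed workflow name
                {"kind": "reindex"},   # illustrative payload
                id="reindex-task",
                task_queue="tasks",
            ),
            spec=ScheduleSpec(
                intervals=[ScheduleIntervalSpec(every=timedelta(hours=1))]
            ),
        ),
    )
```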
Any architectural guidance for ingestion (Kafka, queues, etc.), scaling workers, and ensuring reliability?
What is the maximum task enqueue and execution rate per client in tasks per second? What is the maximum possible number of outstanding tasks per client?
Our requirement is to support hundreds of clients, with up to tens to hundreds of tasks per second.
Sorry, I’m not sure I got the point of the second question. If you meant whether it’s okay for our clients to wait for a task to be executed: yes, some delay is okay; there is no need for zero latency. To be more specific, we need a service for updating search engine (OpenSearch/Elasticsearch) indices, which is why it’s important for us to apply those changes in order.
With Temporal, there is no need for an external DB, as all workflow information is already durable.
We need some sort of persistence so that our clients can submit a task and then query its status (whether it’s completed or not; that’s how they can make synchronous changes, by waiting for a task to complete and only then submitting the next one).
Our requirement is to support hundreds of clients, with up to tens to hundreds of tasks per second.
Are these hundreds of tasks per second for each client?
We need some sort of persistence so that our clients can submit a task and then query its status (whether it’s completed or not; that’s how they can make synchronous changes, by waiting for a task to complete and only then submitting the next one).
If you model tasks as Temporal workflows, then these requirements don’t need a database. Temporal workflows can be queried and waited for by external clients.
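For example, given a workflow id, an external client can do roughly this with the Python SDK (the host, ids, and query name are illustrative), and the same calls can live behind your own API:

```python
import asyncio

from temporalio.client import Client

async def main() -> None:
    client = await Client.connect("localhost:7233")  # assumed server address
    handle = client.get_workflow_handle("task-123")  # workflow id == task id

    # Point-in-time status without waiting for completion:
    print("current status:", await handle.query("status"))

    # Or block until the task finishes ("wait for a task"):
    print("final result:", await handle.result())

asyncio.run(main())
```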
Are these hundreds of tasks per second for each client?
Yes, it is. This is not our current load, but we’d like to come up with a new solution to replace the current one so that we can handle this load.
If you model tasks as Temporal workflows, then these requirements don’t need a database. Temporal workflows can be queried and waited for by external clients.
Will our clients still be able to do this through our FastAPI (Python web framework) API? They should not know that they are querying the status of a Temporal workflow.
Temporal doesn’t support the following requirement out of the box:
Tasks should execute concurrently across different clients, but sequentially for a single client (strict order per client).
You need something like Redis Streams to queue up tasks per client. Then you can have an activity that listens on that queue and starts Temporal workflows.
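A minimal sketch of that shape, assuming redis-py’s asyncio client and one stream per client (names are illustrative, error handling is omitted, and the consumer is shown as a plain process rather than an activity):

```python
import asyncio

import redis.asyncio as redis
from temporalio.client import Client

async def enqueue(r: redis.Redis, client_id: str, payload: str) -> None:
    # One stream per client keeps that client's tasks in strict order.
    await r.xadd(f"tasks:{client_id}", {"payload": payload})

async def consume(r: redis.Redis, temporal: Client, client_id: str) -> None:
    last_id = "0"
    while True:
        # Block until the next task for this client arrives.
        for _stream, messages in await r.xread(
            {f"tasks:{client_id}": last_id}, count=1, block=5000
        ):
            for msg_id, fields in messages:
                handle = await temporal.start_workflow(
                    "TaskWorkflow",  # assumed workflow name
                    fields["payload"],
                    id=f"{client_id}-{msg_id}",
                    task_queue="tasks",
                )
                # Waiting for the result before reading the next entry
                # gives strict sequential execution per client.
                await handle.result()
                last_id = msg_id

async def main() -> None:
    r = redis.Redis(decode_responses=True)
    temporal = await Client.connect("localhost:7233")  # assumed server address
    # One consumer per client stream; different clients run concurrently.
    await asyncio.gather(
        *(consume(r, temporal, cid) for cid in ("client-a", "client-b"))
    )

asyncio.run(main())
```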