When to use SQS?

Shannon_Tan · January 4, 2022, 8:48pm

I believe Temporal is a replacement for SQS, but is there any scenario in which you would prefer SQS over Temporal? Are there any major gotchas?

Also, is a workflow with a single activity an anti-pattern? I would imagine not, and that it would only become a problem in terms of workflows-per-second throughput.

maxim · January 10, 2022, 3:49pm

I assume that your question applies to any queuing solution (Kafka, RabbitMQ, etc.), not only SQS.

What are the features of a distributed queue?

Producers can enqueue tasks
The queue accumulates a backlog of tasks if consumers are down or slow
Each task is delivered to a single consumer
The consumer can report task completion (ack) or failure (nack) back to the queue
If a task is not acked/nacked within a configured timeout it is considered nacked.
Some queues support extending the running task timeout (aka heartbeat)
Nacked tasks are redelivered, possibly after some backoff interval
Some queues support Dead Letter Queues (DLQs). Tasks that are nacked too many times are moved to a DLQ.
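
To make the semantics above concrete, here is a minimal in-memory sketch of such a queue. It is a toy model, not how SQS or any real broker is implemented: the `ToyQueue` class, its method names, and the explicit `now` argument (used instead of a real clock) are all assumptions made for illustration, and redelivery backoff is left out.

```python
import collections
from dataclasses import dataclass


@dataclass
class Task:
    task_id: int
    payload: str
    attempts: int = 0  # number of times the task has been delivered


class ToyQueue:
    """Hypothetical in-memory model of queue semantics (no backoff)."""

    def __init__(self, visibility_timeout=30.0, max_attempts=3):
        self.visibility_timeout = visibility_timeout
        self.max_attempts = max_attempts
        self.ready = collections.deque()  # backlog waiting for a consumer
        self.in_flight = {}               # task_id -> (task, deadline)
        self.dead_letters = []            # the DLQ
        self._next_id = 0

    def enqueue(self, payload):
        self._next_id += 1
        self.ready.append(Task(self._next_id, payload))
        return self._next_id

    def receive(self, now):
        """Deliver the next task to exactly one consumer, or return None."""
        self._expire(now)
        if not self.ready:
            return None
        task = self.ready.popleft()
        task.attempts += 1
        self.in_flight[task.task_id] = (task, now + self.visibility_timeout)
        return task

    def heartbeat(self, task_id, now):
        """Extend the timeout of a task that is still being processed."""
        task, _ = self.in_flight[task_id]
        self.in_flight[task_id] = (task, now + self.visibility_timeout)

    def ack(self, task_id):
        """Report completion: the task is removed for good."""
        self.in_flight.pop(task_id, None)

    def nack(self, task_id):
        """Report failure: redeliver, or move to the DLQ after too many tries."""
        entry = self.in_flight.pop(task_id, None)
        if entry is None:
            return
        task, _ = entry
        if task.attempts >= self.max_attempts:
            self.dead_letters.append(task)
        else:
            self.ready.append(task)

    def _expire(self, now):
        # A task that is neither acked nor nacked in time counts as nacked.
        for task_id, (_, deadline) in list(self.in_flight.items()):
            if deadline <= now:
                self.nack(task_id)
```

Note that nothing here lets a producer ask "what happened to task 7?" or cancel it, which is where the limitations below come from.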

What are the common limitations?

No transactions between the queue and other data storages
Maximum task execution time is limited even when heartbeating is supported.
The duration of retries is limited, so it is not possible to keep retrying a task for a few hours, for example.
Task cancellation is not supported
Error handling in case of task failures is very primitive. DLQ is the only mechanism.
Getting the status of a specific task is not supported.
Tasks are executed with at-least-once semantics
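
The last point means a consumer can see the same task more than once, for example when it finishes the work but the ack is lost or arrives after the timeout. A rough sketch of how a handler can tolerate that is shown here; the `Ledger` class and `apply_credit` method are hypothetical names, and the in-memory set stands in for a deduplication record that a real system would have to commit atomically with the state change, in the same data store.

```python
class Ledger:
    """Hypothetical consumer-side state with duplicate-delivery protection."""

    def __init__(self):
        self.balance = 0
        self.applied = set()  # IDs of tasks whose effect is already recorded

    def apply_credit(self, task_id, amount):
        """Apply the credit once; return False if this delivery is a duplicate."""
        if task_id in self.applied:
            return False
        self.balance += amount
        self.applied.add(task_id)
        return True


ledger = Ledger()
ledger.apply_credit("task-1", 50)  # first delivery changes the balance
ledger.apply_credit("task-1", 50)  # redelivery of the same task is ignored
```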

So when are task queues a good fit as an application-level primitive?

The task is stateless. So its creation doesn’t require an update to some other DB as there are no transactions between the DB and the queue.
The task is idempotent
The task is short
The duration and number of retries are small
No error handling besides retrying later from DLQ is needed
Human intervention is OK to deal with the messages in DLQ
No need to get a task status
No need to cancel the task
The task is fully independent and doesn’t depend on other tasks or cause the execution of other tasks.

My experience tells me that a very narrow set of scenarios fit these limitations. In most cases, tasks are not independent, can execute for a long time, require long retries, require actual error handling, and benefit from transactionality between the database and the queue.
