Tradeoffs compared to Zeebe?

I’d be interested in what the team and current users feel are the pros, cons, equivalents, and best use cases of Temporal vs https://zeebe.io/

I’ve had a look at https://docs.zeebe.io/introduction/what-is-zeebe.html and, while helpful, I don’t have a complete view of the nuances and considerations for a decision.

Certainly a big one is a BPMN compliant GUI but that’s not a showstopper requirement for me.

I will respond within a few days with a more comprehensive answer. The biggest difference from my perspective is that we execute your code directly; there is no BPMN in the middle.

I’ve written extensively about this elsewhere before, but perhaps I should update it in a blog post.

Here’s one of my more recent comments (https://news.ycombinator.com/item?id=24216400 - see reply in thread too).

Great answers from @mrsaints. Here’s my take

Update: I’ve updated my original answers after some great elaboration/correction by @mrsaints

I want to preface this answer by saying that I’m by no means a Zeebe expert. I’ve done my best to familiarize myself with the major players in this space (Zeebe included) but it’s a relatively new product that changes quite regularly.

Before diving into individual features, I want to say that I think there is a much more fundamental difference between Zeebe and Temporal. Temporal is a developer product which enables users to write highly reliable and scalable business logic with pure code. Zeebe is a product for businesses which enables their developers to cleanly surface implementations of business processes. Regardless of the technical improvements over Camunda, BPMN is still at the heart of Zeebe. When you write code in Zeebe it is not executed directly.

How are Zeebe and Temporal similar?

  • gRPC based
  • Can both theoretically support any language over gRPC (3 languages are supported idiomatically by both; Zeebe has 10 languages supported with generated bindings)
  • Cloud native and can run on k8s
  • Horizontally scalable
  • Workers are embeddable

How are they different?

Zeebe

  • Not Open-source but rather source available
  • Workflow state is directly stored on disk using relatively new RocksDB

Temporal

  • MIT Open-source
  • Powered by pluggable database (currently PostgreSQL, MySQL and Cassandra)
  • Well known users running in production for some time
  • Built in taskqueue mechanism, no need for external queues
  • Temporal directly executes your code. Use the same testing, debugging processes you already know and love.
  • Workflows can run forever
  • No visual workflow tooling out of the box
  • Advanced versioning mechanism which makes it possible to version running code
  • Built in archival (also able to be shipped)
  • Integrated visibility API
  • Can send signals to running workflows
  • Multi-region replication
  • Support for queries out of the box

Mostly true, but I think some elaboration is needed (based on my experience working with Zeebe).

Does not support long running workflows

Actually, Zeebe is sort of indifferent about this. In the same way that Temporal does not recommend having a very large workflow because the workflow history might get too large, Zeebe also has a similar problem in the sense that keeping a long running workflow alive will likely result in a large state, and potentially run into memory / replication issues. There are ways around this though.

No versioning process built into the system

There is versioning. But, compared to Temporal, there are arguably more decisions that need to be made here, e.g. you cannot “upgrade” an in-flight workflow instance, and you need to track the version of a workflow you are starting. The deficiencies are arguably a result of having a separate workflow DSL.

Essentially, a workflow is defined in BPMN. Before you can start a workflow, you need to deploy (“upload”) that workflow “definition” onto Zeebe. With Temporal, it is a matter of deploying an updated worker with the appropriate GetVersion marker (assuming Go).
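To make the version-marker idea concrete, here is a minimal toy model in Python. It is not the Temporal SDK: `get_version`, `run_workflow`, the `history` dict and the `"notify-step"` change ID are all hypothetical names chosen for illustration. It only sketches the general behavior: the first execution that reaches the marker pins the newest version in its history, and later replays read that pinned value, so old and new code paths can coexist in one deployed worker.

```python
DEFAULT_VERSION = -1  # stands in for "the code path that predates the change"


def get_version(history, change_id, min_supported, max_supported, replaying):
    """Toy version marker (hypothetical helper, not a real SDK call)."""
    if change_id in history:
        # A marker was recorded earlier: always reuse it.
        version = history[change_id]
    elif replaying:
        # The history was written before this change existed.
        version = DEFAULT_VERSION
    else:
        # First execution reaching this point: pin the newest version.
        version = max_supported
        history[change_id] = version
    if not (min_supported <= version <= max_supported):
        raise RuntimeError(f"unsupported version {version} for {change_id}")
    return version


def run_workflow(history, replaying=False):
    """Branch on the pinned version so in-flight runs keep their old path."""
    v = get_version(history, "notify-step", DEFAULT_VERSION, 1, replaying)
    return "send_email" if v == DEFAULT_VERSION else "send_push"


old_history = {}  # a run that started before the change was deployed
new_history = {}  # a run that starts after the change was deployed

old_result = run_workflow(old_history, replaying=True)
new_result = run_workflow(new_history)
replayed_result = run_workflow(new_history, replaying=True)
```

In this sketch the in-flight run keeps taking the old branch while new runs take the new one, which is the property that lets a single updated worker serve both.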

When you deploy it onto Zeebe, it will essentially store it as a new version. It is fairly easy to integrate this with CI, and call Zeebe with grpcurl or similar. When you start a workflow, you can decide what version of a workflow definition you want to use, or you can use the default, which is the latest. You cannot, however, set the version ID, as it is auto-generated.
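The deploy-then-start model described above can be sketched with a small in-memory registry. This is a toy model, not Zeebe's API: `DefinitionRegistry`, `deploy`, `start` and the `"order-process"` ID are hypothetical, and numbering versions from 1 upward is an assumption of the sketch.

```python
class DefinitionRegistry:
    """Toy store of workflow definitions with auto-assigned versions."""

    def __init__(self):
        # process id -> list of definitions; list index + 1 is the version
        self._versions = {}

    def deploy(self, process_id, definition):
        """Store a definition as a new version; the caller cannot pick the number."""
        versions = self._versions.setdefault(process_id, [])
        versions.append(definition)
        return len(versions)

    def start(self, process_id, version=None):
        """Start an instance on a given version, defaulting to the latest."""
        versions = self._versions[process_id]
        if version is None:
            version = len(versions)
        if not 1 <= version <= len(versions):
            raise KeyError(f"{process_id} has no version {version}")
        # The instance is pinned to this version for its whole lifetime.
        return {"process_id": process_id, "version": version,
                "definition": versions[version - 1]}


registry = DefinitionRegistry()
v1 = registry.deploy("order-process", "<bpmn v1>")
in_flight = registry.start("order-process")        # pinned to version 1
v2 = registry.deploy("order-process", "<bpmn v2>")
latest = registry.start("order-process")           # picks version 2
pinned = registry.start("order-process", version=1)
```

Note that `in_flight` still points at version 1 after the second deploy; nothing in this model migrates it, which mirrors the "no upgrade of in-flight instances" point.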

And again, you cannot “upgrade” an in-flight workflow. This is rather important to keep in mind because it is easy to introduce breaking changes in Zeebe’s equivalent of Temporal “activities”: it is not so straightforward to validate the BPMN data piping / inputs and outputs ahead of time compared to, say, a Temporal workflow defined in Go using static types.

No built in archival (can be shipped)

To elaborate, workflow instances are not directly queryable in Zeebe. And, they are only kept for as long as they are needed / running (so not very long). Instead, any events that happen in Zeebe are “exported” to another data sink, e.g. Postgresql, ElasticSearch, etc.

Out-of-the-box, Zeebe can export to sinks like Kafka, ElasticSearch, and SQL (via Hazelcast), though this is nowhere near as convenient as Temporal in my experience.

No built in visibility tool, records must be shipped

Similar to the previous point. Unlike Temporal, you cannot query the workflow engine directly. You need to ship the data to some sink. Assuming you do this well, then you can have both visibility, and archival indefinitely. This is often the source of many problems from my experience.

Its heavily eventually consistent nature means you do not actually know if you received everything. A failure to export (on the side of the workflow engine) or a failure to receive and process (on the side of the sink) could result in having a partial view of what is happening in the workflow engine. Consequently, you will often be acting on information which may or may not be correct at any given time.
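One way a sink could at least notice that it has a partial view is to track sequence numbers per partition and record any holes. The sketch below assumes every exported record carries a contiguous per-partition sequence number starting at 1; that is an assumption made for illustration, not a statement about what Zeebe's exporters guarantee, and `SinkGapTracker` is a hypothetical helper.

```python
from collections import defaultdict


class SinkGapTracker:
    """Tracks the next expected sequence number per partition."""

    def __init__(self):
        self._next = defaultdict(lambda: 1)  # partition -> next expected seq
        self.gaps = []                       # (partition, first_missing, last_missing)

    def receive(self, partition, seq):
        """Return True if the record is new, False if it is a duplicate."""
        expected = self._next[partition]
        if seq < expected:
            return False  # already seen, e.g. a re-export after a retry
        if seq > expected:
            # Records expected..seq-1 never arrived: the view is partial.
            self.gaps.append((partition, expected, seq - 1))
        self._next[partition] = seq + 1
        return True


tracker = SinkGapTracker()
incoming = [(1, 1), (1, 2), (1, 4), (2, 1), (1, 4)]
accepted = [tracker.receive(p, s) for p, s in incoming]
```

Here record 3 on partition 1 is never delivered, so `tracker.gaps` ends up holding that range and the sink knows its data is incomplete rather than silently acting on it.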

No way to signal running workflows

This is possible, but maybe not with the same flexibility and power as Temporal’s signals.

See https://docs.zeebe.io/bpmn-workflows/message-events/message-events.html for example.

You can also achieve Temporal’s selects with https://docs.zeebe.io/bpmn-workflows/event-based-gateways/event-based-gateways.html

Queries only available via plugin

This is again a deficiency of not being able to query the workflow engine directly. You will be querying whatever sink you shipped the Zeebe events to, so I wouldn’t really say you need a plugin, but rather a whole system or API designed for a specific sink (zeeqs in the case of this comment).

All history, visibility can only be accessed via optional ElasticSearch. While Zeebe service itself may be able to scale to 10000s of concurrent workflow invocations, ElasticSearch is the bottleneck.

Out-of-the-box, yes. But, not exactly true. This depends on how you ship the “history”. You can ship it to Kafka or even a managed database (assuming there is an exporter for it) for example. So depending on your set-up, you arguably can avoid having a bottleneck.

I really appreciate the detailed replies above and have followed through to the referenced links as well.

My primary use case is definitely internal “backend” orchestration among microservices, not so much a business-facing big-block system, though a visual BPMN or diagram (even if inferred and not executable) would have been nice for engineering discussions with other groups.

I’m going to recommend we do a POC with Temporal and the very active responses here are definitely a plus.

@alecl great to hear, let me know if you need any help at all: ryland@temporal.io.

@mrsaints I genuinely appreciate the time and effort you put into elaborating on my answers. I plan to backport your corrections to my original response. I learned quite a few things from your response and have since done even more research on Zeebe.
