A Letter to Cadence/Temporal Community

Below is a copy of my Medium post: A Letter to Cadence/Temporal, and Workflow Tech Community | by Long Quanzheng | Dec, 2022 | Medium
I think it’s easier to read and search if I copy it here for discussion:

=================================================

A Letter to Cadence/Temporal, and Workflow Tech Community

Dear developers,

Recently I made the two posts about iWF:

Here I want to share the story of iWF with you, to give some background about the project. I hope this will help answer some questions around iWF, and that you will feel more comfortable trying it out.

Start from Cadence/Temporal

First of all, I really, really love the Cadence project.

I joined the Uber Cadence team in 2017, at an early stage of the project. I still feel very fortunate today to have worked closely with Maxim/Samar (Temporal CEO/CTO) and other Cadence teammates to build this open source project.

For the first two years, I spent most of my time on the Cadence backend service. I am proud of how my team designed and built the service.

Then in 2019 my interest started to shift to how users were using Cadence. I feel this was quite natural: since it’s so powerful, how are users using it? Do they like it?

Maxim/Samar left Uber and founded Temporal in late 2019. Temporal is a great fork of Cadence and I like it too. I thought about joining the company, but I couldn’t make that decision for many practical reasons.

Supporting the community

So I became very passionate about helping every user to use Cadence, internally and externally. Even after I left Uber to join Indeed, I have stayed active in the Cadence community.

I have tried everything I can think of to help everyone use it. After answering lots of questions in the Slack channel, I realized that there is a lot of knowledge about Cadence/Temporal that I wanted to share. So I started to write:

Even though I don’t like the unfortunate split of the community into Cadence and Temporal, I also tried my best to help the Temporal side, as the two are 99.99% identical. Almost all the knowledge and tips can be applied interchangeably.

But then I also realized that there are so many tips, tricks, and gotchas that Cadence/Temporal requires users to know in order to feel comfortable using it.

I have used many different technologies, including AWS services. I have never seen another one that requires users to learn so many things, and to change so much in their normal way of thinking, writing code, and maintaining a project.

Feel the pain myself

My role changed “a little bit” in late 2021. I finally became a really heavy user of Cadence/Temporal myself at my company. Though I had used Cadence workflows before (mostly using Cadence to build Cadence :) ), this was still a completely different experience.

We built very complicated workflow applications on Cadence and Temporal (first self-hosted Cadence, later moved to Temporal Cloud). I am proud of it. It’s likely to be the most complex workflow in the Cadence/Temporal Java SDK: we discovered a critical bug in the SDK that should have been found and fixed years ago, if anyone had used the Cadence/Temporal Java SDK with the versioning + timer + multithreading APIs together.

I felt a pain in using Cadence/Temporal that I had never imagined.

Every line of the code needs extraordinary care to maintain, otherwise it will run into serious problems in production. Changing any line of code is so painful, because workflow replay and determinism make it super difficult. On the other hand, the real-world business required us to change things all the time.

Besides that, unit testing is also super difficult and hacky, as all the SDK APIs are static methods.

It’s even harder to maintain such a complicated workflow as a team. Many engineers don’t have enough interest or patience to spend lots of time learning all those tricks, tips, and knowledge. I had heard how Cadence/Temporal applications became hard to maintain after the initial team members left, because the new ones were no longer excited and willing to learn. I realized how true that is.

We all deserve a better life

I have been trying very hard to improve the user experience for Cadence/Temporal since 2020.

I looked into and also used other workflow engines myself. I found that most workflow engines don’t have that learning curve (except for SWF, which is the predecessor of Cadence/Temporal), but they are all missing a lot of power.

So I summarized a lot more design patterns like this to help. But that doesn’t really solve the core problem: the direct user experience comes from the Cadence/Temporal SDKs.

Then I tried to improve the SDKs by adding more docs (like this) and features (like this) in the SDKs directly.

It was not helpful enough. Then I tried to build a new SDK, and tried to create a new programming language, but I failed.

NOTE: I am a big fan of Kotlin and was thinking of building a new JVM language like Kotlin to directly support Cadence/Temporal.

Inspiration of iWF

All of a sudden, the idea of iWF came to my mind in late 2020, after I was inspired by two ideas:

  • I learned that AWS Step Functions is built on top of AWS SWF (the regular workflow only; the “express workflow” is not). It removes all the barriers and difficulty and makes it easy to learn and use. (In fact, you can still see that some APIs are based on SWF.)
  • My teammate inspired me with the idea that even a DSL workflow engine like Step Functions could use native code rather than JSON/YAML.

I somehow combined these two ideas, and wondered what it would look like if Step Functions were not in JSON. After all, the biggest disadvantage of Step Functions is being “low code”. Using JSON makes it difficult to maintain a complicated workflow: you can’t do code refactoring or even unit testing.

Since Step Functions is built on SWF, I can just build on Cadence/Temporal!

The prototype came out in early 2022. I shared it with many engineers internally and got very positive feedback. Then I shared it publicly with the Cadence/Temporal community.

There was lots of very useful feedback from the community, including from some folks at Temporal. Special thanks to Ben Slater from InstaClustr and Emrah. They even spent lots of time with me to help find good names for the core APIs in iWF.

Build it up

The prototype is cool but what is next?

It really felt like being chosen. I realized that I had to build it up.

Though it’s not a small or toy project, thanks to the power of Cadence/Temporal, the iWF server is not too hard to build. The server just needs Cadence/Temporal as a dependency, which I know very well.

Also, I am so lucky to have had some community members join me from the beginning. Special thanks to everyone who has helped or contributed to the project.

When building iWF, I tried to apply all the learnings from the Cadence/Temporal SDKs. With the help of everyone I could ask questions, I tried to make every API minimal, simple, clear and easy to use. Even the names of the APIs (start and decide) took me several months to finalize (see this GitHub thread, and it’s just part of the conversations).

I strongly believe that being explicit is important for APIs. There are so many implicit details in the Cadence and Temporal SDKs, which is one big reason for the learning curve.

For example, I found that very few users know that they can upsert an array of items into a search attribute. This is supported by the Cadence/Temporal service, and it is very powerful. But the SDK API doesn’t show it directly (and we didn’t even have good documentation anywhere :( ). It’s very hard to know about this capability. In the iWF SDK, we let user code define the search attribute type as an array explicitly.

Is it deprecating Cadence/Temporal?

No, I don’t think so.

Cadence/Temporal is the foundation of iWF. The core implementation is only 400+ lines of code today. Without Cadence/Temporal, it would be 100+ times harder to build using any other technology.

Though most business applications will find iWF is better, there will always be projects, like iWF itself, that find using Cadence/Temporal directly is a good choice. As long as the users are ready to spend enough time learning and are willing to put in the effort to maintain it, it will be worthwhile.

A metaphor for this is Java vs C/ASM. Thanks to modern languages like Java, most developers don’t need to learn the details of how to manipulate memory and registers. iWF is like an interpreter on Cadence/Temporal. That’s one of the reasons for the name “iWF”.

Temporal as a company has been doing a great job of supporting and documenting everything we need to know to use this technology, much better than what I can do as an individual.

The controversial part

The biggest argument for Cadence/Temporal over iWF is the “normal looking” workflow code. Cadence/Temporal lets users define a workflow as a single function/method, while iWF requires splitting it into different pieces (workflow states).

I want to share my thoughts on this debate.

First of all, the state machine fashion was not created by iWF. Other popular workflow engines like Step Functions, Airflow, and Netflix Conductor all require users to define a workflow explicitly as multiple steps/states. They are quite successful.

Secondly, Cadence/Temporal is also essentially a state machine. The SDK runs user code as an implicit state machine internally. This implicitness is the cause of the non-deterministic error issues.

The normal looking code is a “trap”. To be honest, I want it too, of course. But our computers, operating systems and programming languages were never designed that way: requiring human-written code to be replayed is an anti-pattern. In other words, we would sacrifice too much for this “normal looking” code. Especially with the versioning APIs, the user code will become more and more messy over time.
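
To make the replay problem concrete, here is a toy sketch in Python. It is not the Cadence/Temporal SDK and does not use any of its APIs: ToyContext, step, NonDeterministicError and the two workflow functions are made-up names for illustration only. It assumes a much simplified model in which the engine records the order of steps on the first run and checks that order again when the code is replayed.

```python
class NonDeterministicError(Exception):
    """Raised when replayed code diverges from the recorded history."""


class ToyContext:
    """Hypothetical context: records steps on a first run, verifies them on replay."""

    def __init__(self, history=None):
        self.replaying = history is not None
        self.history = list(history) if history else []
        self.position = 0

    def step(self, name):
        if not self.replaying:
            # First execution: just record what the code did.
            self.history.append(name)
            return
        # Replay: the code must ask for the same steps in the same order.
        expected = None
        if self.position < len(self.history):
            expected = self.history[self.position]
        if expected != name:
            raise NonDeterministicError(
                f"history has {expected!r}, code ran {name!r}"
            )
        self.position += 1


def workflow_v1(ctx):
    ctx.step("charge_card")
    ctx.step("send_email")


def workflow_v2(ctx):
    # A seemingly harmless refactor: the two steps are swapped.
    ctx.step("send_email")
    ctx.step("charge_card")


first_run = ToyContext()
workflow_v1(first_run)

# Replaying the unchanged code against its own history succeeds.
workflow_v1(ToyContext(first_run.history))

# Replaying the changed code against the old history fails.
try:
    workflow_v2(ToyContext(first_run.history))
    replay_failed = False
except NonDeterministicError:
    replay_failed = True
```

In this simplified model, the code change itself looks perfectly normal, yet any workflow that started under the old code can no longer be replayed. That is the kind of care every line needs.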

In practice, having to split a workflow into multiple states isn’t always bad. It makes it easier for a team to swarm on the project from the beginning. And you likely don’t need too many of them, as each state can do so many things with the two API methods.
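
As a rough illustration of the explicit style, here is a toy sketch in Python. It is not the iWF SDK: only the method names start and decide come from the discussion above, while their signatures, the OrderCreated and Shipping states, the "payment" signal and the run function are all assumptions made up for this example. In this simplified model, start declares what a state waits for and decide picks the next state.

```python
class OrderCreated:
    def start(self, data):
        # Declare what this state waits for: a hypothetical "payment" signal.
        return ["payment"]

    def decide(self, data, results):
        # Decide the next state once the awaited results are available.
        data["paid"] = results["payment"]
        return Shipping() if data["paid"] else None


class Shipping:
    def start(self, data):
        return []  # nothing to wait for in this state

    def decide(self, data, results):
        data["shipped"] = True
        return None  # returning None ends the workflow in this toy model


def run(first_state, data, signals):
    """Toy runner: drives one state at a time until a state returns None."""
    visited = []
    state = first_state
    while state is not None:
        visited.append(type(state).__name__)
        waiting_for = state.start(data)
        results = {name: signals[name] for name in waiting_for}
        state = state.decide(data, results)
    return visited


order = {}
visited = run(OrderCreated(), order, {"payment": True})
```

Each state is a small, separate unit, so in this sketch different teammates could own and unit test OrderCreated and Shipping independently, and no recorded history has to be replayed through the code.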

Some final words

Finally, I would like to invite you to try out iWF and give me some feedback. Currently only Java and Golang SDKs are provided. You are also more than welcome to contribute more SDKs.

Hail Cadence, Temporal and iWF!

Happy new year!

Sincerely,

Quanzheng Long


Hi Quanzheng, thanks for the write-up and for sharing this information. I read the Stack Overflow articles you mentioned and have to say they are really useful. You seem very passionate about this technology and this is something we definitely share here.
I was not involved in the history part in your writeup but I am sure it was not easy for anyone involved and had ups and downs and I hope you and everyone can move on with all that stuff (no need to go into it pls).

I also enjoy the questions you post in slack regarding your company use cases for Temporal Cloud and hope to keep working with you in the future on that and anything else.

That said, please understand that this is a Temporal community forum. The reason it exists is to focus on Temporal and all things Temporal. So things like how to use it, best practices, help with deployments, bugs, issues, and many other things, but all related to Temporal itself.
Even tho we do try to help with comparisons with other technologies, we do try to always keep an open mind about it and not go into it too much as again, the focus of this forum and this community is the technology people come here to learn about and get help for.

I do feel that some of your statements in this post do go a bit above and beyond this into a realm of trying to use the platform to promote an alternative technology. And yes even tho we all do that to some extent and there is nothing wrong with it, I do think that there are more appropriate venues to do this which is not this forum.

Please do not take this personally as it is not. It’s more to say like hey we all welcome you to the Temporal community and believe users can benefit from your vast knowledge around technologies we use, but would like to ask to channel this into helping our community here about specific Temporal questions if possible while at the same time for you to try to use different venues for promotions of others. Thank you again for this post and wish you a happy new year!

Hi Tihomir,
Thank you for your response. I also really appreciate all your help.

Just to be clear, is that because this letter talks about Cadence? Or because it’s building a layer on Temporal?
I want to be clear so that I won’t violate this again.
If it’s about Cadence then I will just remove the Cadence part right now. I hope building a layer on Temporal is not a problem because @maxim has told me that I am always welcome to build a platform on top of Temporal, when I brought this to him.

Thanks and happy new year!