Temporal in Production - Support and Upgrade

Hi!

I'm a bit lost trying to find out which versions of the underlying technologies are required, what the upgrade path is, and how long each version is supported. Is there some sort of matrix for that?

As an example: Which versions of Elasticsearch are required? What happens if Elasticsearch releases upgrades that are incompatible with Temporal? What is the upgrade path in that case? Would you provide instructions/tools for the mapping/schema upgrades or reindexing between major versions of Elasticsearch? How soon would you provide these? I'm asking because that would decide whether we need a separate Elasticsearch cluster so that we are not blocked by other applications on the same cluster. Conversely, how long would you support older Elasticsearch versions with Temporal?

So pretty much this set of questions as far as applicable, but for all underlying infrastructure, like MySQL, Kafka, Zookeeper, Grafana, etc.

I wouldn’t feel comfortable just relying on whatever gets delivered with the “batteries included” Helm chart, because I’m not really sure how the underlying 3rd-party charts that the Helm chart pulls in handle all of this. For us, upgrading MySQL or Elasticsearch usually involves some application code adjustments, and I fear we might end up with a logjam if we have little control over these underlying building blocks. Or would you expect all of this to happen seamlessly with future versions of Temporal and its Helm chart, and I’m overthinking this?

Thanks
Stephan

Hi

Temporal server only supports upgrading one minor version at a time, i.e. 1.n.x -> 1.(n+1).x.
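
To make the rule above concrete, here is a minimal Python sketch that lists the minor release series you would have to step through between two versions. The function name upgrade_path is hypothetical and not part of any Temporal tooling, and the sketch assumes plain major.minor.patch version strings within the same major version. It does not say which patch release to pick at each step.

```python
from typing import List


def upgrade_path(current: str, target: str) -> List[str]:
    """List the minor release series to pass through, one step at a time.

    Hypothetical helper: it only encodes the "1.n.x -> 1.(n+1).x" rule
    and does not know which releases actually exist.
    """
    cur_major, cur_minor = (int(p) for p in current.split(".")[:2])
    tgt_major, tgt_minor = (int(p) for p in target.split(".")[:2])
    if cur_major != tgt_major:
        raise ValueError("this sketch only covers upgrades within one major version")
    if tgt_minor < cur_minor:
        raise ValueError("downgrades are not covered by this sketch")
    # Every intermediate minor series is a separate upgrade step.
    return [f"{cur_major}.{m}.x" for m in range(cur_minor + 1, tgt_minor + 1)]


print(upgrade_path("1.3.2", "1.6.0"))  # ['1.4.x', '1.5.x', '1.6.x']
```

So going from 1.3.x to 1.6.x would mean three separate upgrades rather than one jump.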

You can find the dependencies used by Temporal server here: https://docs.temporal.io/docs/server-versions-and-dependencies

We generally recommend using Helm for dev / test setups only.

BTW, beginning with 1.5.x (to be released within a few days), Kafka / Zookeeper will no longer be hard dependencies of Temporal.
Ref: https://github.com/temporalio/temporal/pull/988

Wenquan_Xing, thank you very much for the info, much appreciated! Also great plan to eliminate dependencies!

It was news to me that you wouldn’t recommend Helm for Prod. What would you recommend using in Prod, then? Docker Compose also seems to be recommended for local standalone use only.

Currently there is no officially recommended solution.

For now, you can use helm template to generate the necessary k8s YAML files.

ok, thanks. I hope you guys don’t take this the wrong way, I just want to show you how I as an outsider experience your documentation, and where I get stuck. Here is one place:

The membership section controls the following membership layer parameters

That’s from the “Configuration Reference”. Now I have no clue what the “membership layer” is, so I would expect to find it in “System Architecture”. But it’s not there. What is there instead is the info that

A common Temporal-based application consists of a Temporal service, Workflow and Activity workers, and external clients.

And the diagram below it says there is a “Matching Service” and a “History Service” underneath the frontend. So I go back to the “Configuration Reference” to see how those are configured, hoping to get some indication of what they are and what they’re there for, but I can’t find anything about them there.

So I go back to the next sentence in “System Architecture”:

Note that both types of workers as well as external clients are roles and can be collocated in a single application process if necessary.

And I ask myself: what roles are these (“role” is a very widely used term), why would or wouldn’t I want them collocated in a single application process, why might that be necessary or not, and what is an application process in this context anyway? That is not defined in the Glossary either.

And just to make it clear, this is just one example of many such questions that come up constantly when I read the documentation; I’m not expecting answers to these particular questions. Please don’t get angry at me, but please take it as a piece of open feedback instead: your documentation is useless for its intended purpose. I get the “Contribute nothing, expect nothing” aspect here, so I thought the least I can do is give you feedback through the lens of an infrastructure guy coming in from the outside.

Great feedback, we really appreciate it!

Could you please provide some suggestions on externalizing Cassandra and ES from the Temporal cluster?

Is there any update on the official recommendation for PROD deployment?