Temporal in Production - Support and Upgrade

stephan · December 21, 2020, 6:54pm

Hi!

I’m a bit lost trying to find information about which versions of the underlying technologies are required, and what the upgrade path and support window are. Is there maybe some sort of compatibility matrix for that?

As an example: Which versions of Elasticsearch are required? What happens if Elasticsearch releases upgrades that are incompatible with Temporal? What is the upgrade path in such a case? Would you provide instructions/tools for doing the mapping/schema upgrades or reindexes between major versions of Elasticsearch? How soon would you provide these? I’m asking because that would decide whether we need a separate Elasticsearch cluster, so that we are not blocked by other applications on the same cluster. Conversely, how long would you support older Elasticsearch versions with Temporal?

So pretty much the same set of questions, as far as applicable, for all of the underlying infrastructure: MySQL, Kafka, ZooKeeper, Grafana, etc.

I wouldn’t feel comfortable just relying on whatever gets delivered with the “batteries included” Helm chart, because I’m not really sure how the underlying third-party charts that the Helm chart pulls in handle all of this. For us, upgrading MySQL or Elasticsearch usually involves some application code adjustments, and I fear we might end up with a logjam if we have little control over these underlying building blocks. Or would you expect all of this to happen seamlessly with future versions of Temporal and its Helm chart, and I’m overthinking this?

Thanks
Stephan

Wenquan_Xing · December 21, 2020, 7:20pm

Hi

Temporal server only supports sequential minor-version upgrades, i.e. 1.n.x -> 1.(n+1).x.
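To make the rule above concrete, here is a minimal sketch that lists the minor-version series you would have to step through between two server versions. The function name `upgrade_path` and the example version numbers are hypothetical and only for illustration; the sketch assumes the "one minor version at a time" rule as stated in this post and says nothing about which patch release to pick within each series.

```python
from __future__ import annotations


def upgrade_path(current: str, target: str) -> list[str]:
    """Return the minor-version series to pass through, in order.

    Hypothetical helper: it only encodes the 1.n.x -> 1.(n+1).x rule
    from this thread and does not consult any real release list.
    """
    cur_major, cur_minor = (int(part) for part in current.split(".")[:2])
    tgt_major, tgt_minor = (int(part) for part in target.split(".")[:2])

    if cur_major != tgt_major:
        raise ValueError("this sketch only handles upgrades within one major version")
    if tgt_minor < cur_minor:
        raise ValueError("downgrades are not covered by this sketch")

    # One hop per minor version; the patch level within each hop is left open.
    return [f"{cur_major}.{minor}.x" for minor in range(cur_minor + 1, tgt_minor + 1)]


if __name__ == "__main__":
    print(upgrade_path("1.2.1", "1.5.0"))  # ['1.3.x', '1.4.x', '1.5.x']
```

So going from a 1.2 release to a 1.5 release would mean three separate upgrades rather than one jump.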

You can find the dependencies used by Temporal server here: https://docs.temporal.io/docs/server-versions-and-dependencies

We generally recommend using Helm for dev/test setups only.

Wenquan_Xing · December 21, 2020, 7:24pm

BTW, beginning with 1.5.x (to be released within a few days), Kafka / ZooKeeper will no longer be hard dependencies of Temporal.
Ref: https://github.com/temporalio/temporal/pull/988

stephan · December 22, 2020, 9:36am

Wenquan_Xing, thank you very much for the info, much appreciated! Also great plan to eliminate dependencies!

It was news to me that you wouldn’t recommend Helm for prod. What would you recommend for prod, then? Docker Compose also seems to be recommended for local standalone use only.

Wenquan_Xing · December 22, 2020, 10:36pm

Currently there is no officially recommended solution.

For now, you can use helm template to generate the necessary Kubernetes YAML files.
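As a rough illustration of that suggestion, the sketch below only assembles a `helm template` command line; it does not run Helm. The release name, chart path, and output directory are hypothetical placeholders, and the `<name>.enabled=false` value keys for switching off bundled dependencies are an assumption about how the chart exposes its sub-charts, so check the chart's values file before relying on them.

```python
import shlex
from typing import List, Sequence


def helm_template_command(
    release: str,
    chart: str,
    output_dir: str,
    disabled_subcharts: Sequence[str] = (),
) -> List[str]:
    """Build the argv for rendering a chart to plain YAML files."""
    argv = ["helm", "template", release, chart, "--output-dir", output_dir]
    for name in disabled_subcharts:
        # Assumed key shape "<subchart>.enabled"; verify against the chart's values.yaml.
        argv += ["--set", f"{name}.enabled=false"]
    return argv


if __name__ == "__main__":
    cmd = helm_template_command(
        release="temporal",            # hypothetical release name
        chart="./helm-charts",         # hypothetical local chart checkout
        output_dir="./rendered",       # hypothetical output directory
        disabled_subcharts=["cassandra", "elasticsearch"],
    )
    # Print a copy-pasteable command instead of executing it.
    print(shlex.join(cmd))
```

The rendered files can then be reviewed, trimmed, and kept under version control like any other Kubernetes manifests, which gives you the control over third-party components that the Helm install hides.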

stephan · December 23, 2020, 5:50pm

ok, thanks. I hope you guys don’t take this the wrong way, I just want to show you how I as an outsider experience your documentation, and where I get stuck. Here is one place:

“The membership section controls the following membership layer parameters”

That’s from the “Configuration Reference”. Now I have no clue what the “membership layer” is, so I would expect to find it in “System Architecture”. But it’s not there. What is there instead is the info that

“A common Temporal-based application consists of a Temporal service, Workflow and Activity workers, and external clients.”

And the diagram below it says there is a “Matching Service” and a “History Service” underneath the frontend. So I go back to the “Configuration Reference” to see how they are configured, hoping to get some indication of what they are and what they’re there for, but I can’t find anything about them there.

So I go back to the next sentence in “System Architecture”:

“Note that both types of workers as well as external clients are roles and can be collocated in a single application process if necessary.”

And I ask myself: what roles are these (a very widely used term), why would I or wouldn’t I want to have them collocated in a single application process, when would that be necessary, and what is an application process in this context anyway? That’s also not defined in the Glossary.

And just to make it clear, this is just one example of many such questions that come up constantly when I read the docs; I’m not expecting answers to these particular questions. Please don’t get angry at me, but take it as a piece of open feedback instead: your documentation is useless for its intended purpose. I get the “Contribute nothing, expect nothing” aspect here, so I thought the least I can do is give you feedback through the lens of an infrastructure guy coming in from the outside.

maxim · December 23, 2020, 5:57pm

Great feedback, we really appreciate it!

Vinita_Peter · June 7, 2023, 12:31pm

Could you please provide some suggestions on externalizing Cassandra and Elasticsearch from the Temporal cluster?

Vinita_Peter · June 7, 2023, 12:32pm

Is there any update on the official recommendation for PROD deployment?
