Deployment of Temporal

Hi everyone, we’re looking into migrating from Cadence to Temporal and I have some general questions about deploying and upgrading.

  1. We are currently running the default and visibility databases in one database instance. Is this a terrible idea, or does it not matter that much? What would be the reason to separate them? Do they have vastly different performance profiles, for example? We are planning to use Postgres in AWS RDS.
  2. We are using server-side polling, and I noticed a decent amount of logging for every poll. I see you guys are aware of this. Is there a way to work around this for the moment? One thing I can think of is to configure the logger to the ERROR level, but then we might lose insight into actual problems. We are using the Java SDK.
  3. Is use of the auto-setup image recommended? It would be nice to create the schemas automatically. It saves me a Kubernetes Job/init container that I have to create.
  4. Is there a schema upgrade guide somewhere? I found this page, but it’s not quite clear to me whether I have to, for example, fully undeploy Temporal, then manually do the upgrade, and then redeploy it. The Helm chart repo mentions that it will do a rolling update. Does this mean I can upgrade the schema while Temporal is still running the old deployment? This reply from Samar seems to indicate that I can.

Love your product :heart:

Hey @BartXZX,
Great to hear that you like the product.

  1. We are currently running the default and visibility databases in one database instance. Is this a terrible idea, or does it not matter that much? What would be the reason to separate them? Do they have vastly different performance profiles, for example? We are planning to use Postgres in AWS RDS.

Separating the core and visibility databases is good engineering practice for availability and for reducing the blast radius. We have carefully designed the system so that the core engine keeps functioning and your workflows keep making forward progress even if the entire visibility database goes down. The two also have different performance characteristics: the core database is mainly optimized for write-heavy workloads, whereas visibility is all about secondary indexes that power the queries behind the visibility experience. Ideally you want to power visibility from Elasticsearch for production clusters, as it is better suited to large scale.

  2. We are using server-side polling, and I noticed a decent amount of logging for every poll. I see you guys are aware of this. Is there a way to work around this for the moment? One thing I can think of is to configure the logger to the ERROR level, but then we might lose insight into actual problems. We are using the Java SDK.

I’ll let @maxim respond to this one.

  3. Is use of the auto-setup image recommended? It would be nice to create the schemas automatically. It saves me a Kubernetes Job/init container that I have to create.

Generally we don’t recommend using the auto-setup image for production clusters, mainly because the way it is implemented could cause problems for large clusters. Having said that, I don’t see any issues with using auto-setup for one-node Temporal clusters with low-throughput use cases. Currently the recommended practice is to perform schema upgrades out of band before upgrading the Temporal server.

  4. Is there a schema upgrade guide somewhere? I found this page, but it’s not quite clear to me whether I have to, for example, fully undeploy Temporal, then manually do the upgrade, and then redeploy it. The Helm chart repo mentions that it will do a rolling update. Does this mean I can upgrade the schema while Temporal is still running the old deployment? This reply from Samar seems to indicate that I can.

We provide a schema upgrade tool for both Cassandra and MySQL, which supports versioning. Many Temporal clusters serve production traffic, so we never make backwards-incompatible changes. We have explicit tests as part of the release pipeline to test upgrades of existing clusters. You should never incur any downtime while upgrading Temporal to a newer version of the server. The practice we usually recommend is to use the schema upgrade tool to apply any schema changes needed by the server release before upgrading the Temporal server. We always call out any other steps needed for the upgrade in the release notes.
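To illustrate why the schema goes first, here is a minimal sketch in Python. It is only a model of the ordering argument, not how the server actually checks versions: `ServerRelease`, `min_schema`, `can_run`, and the version numbers are all hypothetical names made up for the example. The assumption is simply that schema changes are backwards compatible, so an older server keeps working against a newer schema.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ServerRelease:
    """Hypothetical record: a server release and the minimum schema version it needs."""
    name: str
    min_schema: Tuple[int, int]


def can_run(release: ServerRelease, db_schema: Tuple[int, int]) -> bool:
    """A server can run when the database schema is at least as new as it requires.

    This assumes schema changes are backwards compatible, so an older
    server keeps working against a newer schema.
    """
    return db_schema >= release.min_schema


# Illustrative releases and schema versions only.
old = ServerRelease("old-server", min_schema=(1, 0))
new = ServerRelease("new-server", min_schema=(1, 1))

# Before the schema upgrade: only the old server can run.
assert can_run(old, (1, 0)) and not can_run(new, (1, 0))

# After the schema upgrade, during the rolling update: both can run side by side.
assert can_run(old, (1, 1)) and can_run(new, (1, 1))
```

Under that assumption, upgrading the schema first means there is no point during the rolling update at which a running server sees a schema it cannot use.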

Hope this helps. Please feel free to reach out to us if you have other questions around migration.

Thank you for the fast response Samar!

Ideally you want to power visibility from Elasticsearch for production clusters as it is more suited for large scale.

Does this mean that if we run ES, we don’t need to create the visibility database at all?
I know that ES powers some of the search features, but we are not using them. At least not in the foreseeable future.

Generally we don’t recommend use of auto-setup image for production clusters.

Got it.

Usually the practice we recommend is to use the schema upgrade tool to apply any schema changes which are needed by the server release before upgrading the Temporal server.

So just keep everything running, and apply the schema changes while Temporal is still running.
Would for example the following upgrade plan be a valid one?

  1. Run version 1.4 (for example) of Temporal.
  2. Temporal 1.5 is released.
  3. Keep Temporal deployed.
  4. Manually start the 1.5 admin tools container.
  5. Run the schema upgrade from the admin container.
  6. If successful, remove the admin tools container.
  7. Upgrade the Temporal Helm chart to start the rolling update.
  8. Done.

Does this mean that if we run ES, we don’t need to create the visibility database at all?
I know that ES powers some of the search features, but we are not using them. At least not in the foreseeable future.

Yes. There is no need to create the visibility DB when using ES.

So just keep everything running, and apply the schema changes while Temporal is still running.
Would for example the following upgrade plan be a valid one?

  1. Run version 1.4 (for example) of Temporal.
  2. Temporal 1.5 is released.
  3. Keep Temporal deployed.
  4. Manually start the 1.5 admin tools container.
  5. Run the schema upgrade from the admin container.
  6. If successful, remove the admin tools container.
  7. Upgrade the Temporal Helm chart to start the rolling update.
  8. Done.

This looks correct for Temporal releases with schema changes. Not all Temporal releases include schema updates; for those that don’t, you can skip the schema upgrade step.
Another thing to remember is to never skip releases when upgrading. Any additional steps needed for an upgrade are documented in the release notes, so please go over them carefully.
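The "never skip releases" rule can be sketched as a small Python helper. This is just an illustration: `upgrade_path` is a hypothetical function, and the release names beyond 1.4 and 1.5 are placeholders, so check the real release list and release notes before planning anything.

```python
from typing import List, Sequence


def upgrade_path(releases: Sequence[str], current: str, target: str) -> List[str]:
    """Return every release after `current` up to and including `target`, in order.

    `releases` is assumed to be the full list of releases, oldest first.
    """
    start = releases.index(current)
    end = releases.index(target)
    if end < start:
        raise ValueError("target release is older than the current release")
    return list(releases[start + 1 : end + 1])


# Illustrative release names only.
releases = ["1.4", "1.5", "1.6", "1.7"]

for release in upgrade_path(releases, current="1.4", target="1.7"):
    # For each hop: read the release notes, apply the schema upgrade if the
    # release has one, then roll out the server before moving to the next hop.
    print(f"upgrade to {release}")
```

Going from 1.4 to 1.7 in this model means three separate upgrades, each following the plan above, rather than one jump.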

Thank you Samar :slight_smile:

Duly noted :+1: