Network policies to run multiple Temporal servers in the same Kubernetes cluster?

Hello -

For the time being, we are running multiple Temporal servers in the same Kubernetes cluster. We understand this is not recommended (see "Temporal Server self-hosted production deployment" in the Temporal documentation), and that if we need to do it, we should use different ports entirely.

Rather than deploying different releases of Temporal with different ports (which would require us to coordinate and lease out ports per release, an entirely different problem to solve), is it possible to use Kubernetes network policies to prevent cross-namespace frontend service discovery? This would make it a lot easier to run Temporal in a multi-tenant manner.

yes this is possible and should totally work - with the caveat that i haven’t tried it yet. if i were going to set up multiple temporal clusters in one k8s cluster i would start with network policy.

if you do attempt this we would love to hear about your progress!

hey derek - thanks for the response. so we found that temporal stores some membership metadata in a database table. we are sharing the same database across all our dev deployments (which we are moving away from soon) - so all dev releases could see each other, seemingly based on what was in the database.

what i’d like to learn more about is how temporal is really doing service discovery. is it correct to say that the following is happening?

  • members post their port to the cluster_metadata table
  • service discovery finds these ports
  • communication between members happens over the membership port; for the frontend, that’s 6933

of course, because we were sharing the database, different temporal servers were seeing each other based on what was in the single cluster_metadata table. but let’s say we weren’t sharing a database. if this is an accurate model, it seems like blocking egress on port 6933 from the frontend pods would be sufficient - is that right? because a) having a different database per temporal server resolves the discovery via a single table and b) if a pod restarted and coincidentally re-used a previously-used IP that still exists in the table, temporal wouldn’t be able to reach that other pod due to the network policy

can i get a gut check on this thought process?
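
For anyone who wants to try the egress idea above, here is a rough sketch, not something tested in this thread. It builds a NetworkPolicy manifest in Python and prints it as JSON, which kubectl accepts. NetworkPolicy rules are allow-lists, so "block egress on 6933" has to be written as "allow everything except cross-namespace 6933". The function name, the policy name, the namespace, and the frontend pod labels are all assumptions for illustration. Check the labels your own deployment actually puts on frontend pods.

```python
import json

MEMBERSHIP_PORT = 6933  # frontend membership port mentioned in this thread


def frontend_egress_policy(namespace: str, frontend_labels: dict) -> dict:
    """Build a NetworkPolicy that keeps frontend egress on the membership
    port inside the namespace (hypothetical helper, labels are assumed)."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {
            "name": "temporal-frontend-membership-egress",  # assumed name
            "namespace": namespace,
        },
        "spec": {
            "podSelector": {"matchLabels": frontend_labels},
            "policyTypes": ["Egress"],
            "egress": [
                # Rule 1: any port, but only to pods in this same namespace.
                {"to": [{"podSelector": {}}]},
                # Rule 2: any destination, but only TCP ports other than
                # 6933, plus UDP 53 so DNS lookups keep working.
                {
                    "ports": [
                        {"protocol": "TCP", "port": 1,
                         "endPort": MEMBERSHIP_PORT - 1},
                        {"protocol": "TCP", "port": MEMBERSHIP_PORT + 1,
                         "endPort": 65535},
                        {"protocol": "UDP", "port": 53},
                    ]
                },
            ],
        },
    }


# Example values only: the namespace and label are assumptions.
policy = frontend_egress_policy(
    "temporal-dev-a", {"app.kubernetes.io/component": "frontend"}
)
print(json.dumps(policy, indent=2))
```

The output could be piped to `kubectl apply -f -`. Port ranges rely on the `endPort` field, so this assumes a Kubernetes version and CNI plugin that support it, and the policy only has an effect if the CNI enforces NetworkPolicy at all.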

Temporal does not support sharing the same database. It is not only related to the cluster_metadata table but to all other tables which are not designed for sharing. You have to create different databases if you want to share the same physical server.

Maxim is correct - you can use the same server, but for SQL persistence it’ll need to be a different database, and for Cassandra it’ll need to be a different keyspace on the same server.

Some of our internal testing includes shared servers with different databases/keyspaces, but we still have full network separation between the Temporal binaries.

I’m not familiar enough with our discovery and membership implementation to say which ports, in which direction, for which services would be enough to block. I think you are moving in the right direction, but it’ll probably take some trial and error.

Is there a reason it’s harder to start by blocking all traffic between the Kubernetes namespaces that have logical Temporal clusters running in them?
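
As a starting point for that coarser approach, a minimal sketch (again untested here) of a per-namespace policy that only admits ingress from pods in the same namespace. The function name, policy name, and namespace names are made up for illustration. One such policy would be applied in each namespace that hosts a logical Temporal cluster.

```python
import json


def same_namespace_only_policy(namespace: str) -> dict:
    """Build a NetworkPolicy that allows ingress to every pod in the
    namespace only from pods in that same namespace (hypothetical helper)."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {
            "name": "allow-same-namespace-only",  # assumed name
            "namespace": namespace,
        },
        "spec": {
            # Empty selector: the policy applies to all pods in the namespace.
            "podSelector": {},
            "policyTypes": ["Ingress"],
            # A bare podSelector in "from" matches pods in the policy's own
            # namespace only, so traffic from other namespaces is dropped.
            "ingress": [{"from": [{"podSelector": {}}]}],
        },
    }


# Example namespaces only; substitute the ones that hold your Temporal releases.
for ns in ["temporal-dev-a", "temporal-dev-b"]:
    print(json.dumps(same_namespace_only_policy(ns), indent=2))
```

Note that this also cuts off workers or SDK clients running in other namespaces, so those would need their own allow rules, and it only takes effect on a CNI plugin that enforces NetworkPolicy.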