Sluggish UI and High CPU

Hello,

We are using mysqldb version 8 and Temporal app version 1.28.0 running on AKS. Everything is Azure based

Our mysql server is running 4vcores, 8g memory, 4000 iops. We have done performance testing where we can handle more than 1 million workflows/day. The database handles this pretty well.

Our issue is when opening the UI. It’s very sluggish, takes a few minutes to load the dashboard (when selecting the namespace), and causes our DB CPU to spike to 99% for as long as the user has the UI open. It seems to be due to the query that is doing a look up of all workflows. We’ve tried scaling the DB, however, CPU still spikes very high and the UI is still very slow. We’ve tried creating indexes, but it doesnt seem to help.

We did add two custom attributes. Not sure if that messes with the queries. Essesntially, the more workflows we have, the more likely the UI will not open and stress the DB.

Any recommendations here?

Thanks

Hi,

Aer you running the Visibility API also on the same mysql server? The UI on the main view lists workflow executions which goes via the Visibility API. There are a few things you could try to get to the bottom of this.

  • Use the observability tools for your DB to figure out where the contention is? Is there a particular query which is so slow?
  • Try separating the Visibility API into a second DB or even better an Elasticsearch instance. Even if you in the long run don’t want to do this, it might help isolating the problem.
  • If you can try without the custom search attributes and see whether it makes a difference.

The recommended setup for Temporal is to use Elasticsearch as Visibility backend, but even without a wait time of minutes sounds like something is going wrong somewhere.

–Hardy

Hey Hardy,

Thanks for the reply.

We are running the Visibility API on the same mysql server.

When looking at the slow queries, we see this same query hitting 40+ connections:

SELECT ev., csa.
FROM executions_visibility ev
LEFT JOIN custom_search_attributes csa ON ev.execution_id = csa.execution_id
WHERE ev.namespace_id = ‘namespace’
ORDER BY ev.start_time DESC
LIMIT 100;

We do have a need for the customer search attributes, so would be nice to keep that feature.

So, we could spin up another DB Server and just run Temporal_Visibility over there?

Thanks

1 Like

Hi,

When looking at the slow queries, we see this same query hitting 40+ connections:

SELECT ev., csa.
FROM executions_visibility ev
LEFT JOIN custom_search_attributes csa ON ev.execution_id = csa.execution_id
WHERE ev.namespace_id = ‘namespace’
ORDER BY ev.start_time DESC
LIMIT 100;

Ok, that would be an indicator that it’s the combination of using MySQL as visibility store in conjunction with custom search attributes. If one wants to dig deeper, it could be interesting to test how things behave without custom search attributes. That said, I am not familiar with using MySQL and what one can expect performance wise using MySQL for Visibility and custom search attributes.

We are using also using custom search attributes, however with an Elasticsearch Visibility backend. I believe this will always be the more performant option. Not sure whether this is an option for you.

So, we could spin up another DB Server and just run Temporal_Visibility over there?

Yes, the Visibility has its own configuration section in the Temporal config and you can switch between different backend types as well as the actual connection parameters.

From the docs

You can use any combination of the supported databases for your Persistence and Visibility stores. For updates, check Server release notes.

You need to sort out how to create the visibility schema in the target DB. I guess you can do this manually or use some of the Temporal tools to do so. It depends what Helm chart or other setup you are using.

–Hardy

1 Like

That’s a really straight forward SQL query. There is no way a 4vcpu 8gig mysql server can’t deal with that in <10ms times. I suspect there is an index missing or something and I’d take that statement as is and just run an explain on it. I almost guarantee it’s doing a full table scan.

2 Likes

Enabling Elasticsearch is something I want to test. Can you help me understand how ypu got yours set up? Did you deploy your own stand-alone instance of ES and then just add the proper configs in your helm? Assuming you’re doing helm deploys like we are.

We are fully azure based, so am still trying to understand the ES resources available there.

Thanks

Hi,

Enabling Elasticsearch is something I want to test. Can you help me understand how ypu got yours set up? Did you deploy your own stand-alone instance of ES

Correct. You setup your own Elasticsearch instance, either self-hosted as well or you use a cloud offering. We do the former. What Temporal fully supports in terms of cloud hosted Elasticsearch is a bit unclear, in particular AWS OpenSearch (it does work though).

and then just add the proper configs in your helm? Assuming you’re doing helm deploys like we are.

We have our own Helm charts which are derived from the upstream ones, but yes we configure the Elasticsearch endpoint, authentication, etc as Helm values which are then injected into the Temporal configuration. The latter is really what matters. How one gets the configuration into the Temporal configuration or ConfigMap if you deploy into Kubernetes really depends on the tools one used. An example config is in the docs.

We also have a self written Helm pre-install hook which for each Temporal upgrade applies any Elasticsearch visibility migrations if needed. Example schema for Temporal v1.29.0 can be found [here]( temporal/schema/elasticsearch/visibility/versioned/v9 at v1.29.0 · temporalio/temporal · GitHub ).

We are fully azure based, so am still trying to understand the ES resources available there.

Pretty sure Azure offers Elastisearch as well, but I don’t have any experience with it.

–Hardy