Frontend service continuously spewing errors

Hello @samar,

Let me first thank you for the short Zoom call you offered. This definitely put me on the right track!

For anyone who runs into this after me, the following may be helpful.

During the call, Samar identified that although I had reported problems with the frontend service, it was actually the history service that was hitting MySQL errors. In my setup, I run on GKE with the Google Cloud SQL Proxy for MySQL as a sidecar container, and initially we suspected this proxy was at fault.

When looking more closely at the logs of the history container, I noticed this line:

    [mysql] 2020/07/30 20:37:51 packets.go:33: unexpected EOF

After a search, I came across this ticket:

The issue is that the application tries to reuse an idle connection that has already been closed by the other end. This can happen because the default MaxConnLifetime is infinite, so pooled connections are never recycled.
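
To make this concrete: as far as I can tell, that [mysql] packets.go line comes from the go-sql-driver/mysql driver, which Temporal uses through Go's database/sql pool. A minimal sketch of the situation (the DSN here is just a placeholder, not my real connection string):

    package main

    import (
        "database/sql"
        "log"

        _ "github.com/go-sql-driver/mysql"
    )

    func main() {
        // Placeholder DSN; in my setup this points at the Cloud SQL Proxy sidecar.
        db, err := sql.Open("mysql", "user:password@tcp(127.0.0.1:3306)/temporal")
        if err != nil {
            log.Fatal(err)
        }
        defer db.Close()

        // database/sql defaults: ConnMaxLifetime is 0, so pooled connections are
        // reused forever, and a couple of idle connections stay in the pool.
        // If MySQL or a proxy in between silently drops one of those idle
        // connections, the next statement that picks it up fails inside the
        // driver with the "packets.go: unexpected EOF" shown above.
        if err := db.Ping(); err != nil {
            log.Println("stale connection surfaced as:", err)
        }
    }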

The comments advise configuring your connection pool. This is what I added to the YAML configuration for the Temporal services, under persistence.datastores.default.sql and persistence.datastores.visibility.sql:

                maxConns: 20
                maxIdleConns: 10
                maxConnLifetime: "1h"
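
For reference, these three keys line up with the standard pool knobs in Go's database/sql, so my assumption is that Temporal's SQL persistence layer passes them through roughly like this (again with a placeholder DSN):

    package main

    import (
        "database/sql"
        "log"
        "time"

        _ "github.com/go-sql-driver/mysql"
    )

    func main() {
        db, err := sql.Open("mysql", "user:password@tcp(127.0.0.1:3306)/temporal")
        if err != nil {
            log.Fatal(err)
        }
        defer db.Close()

        db.SetMaxOpenConns(20)               // maxConns: cap on open connections
        db.SetMaxIdleConns(10)               // maxIdleConns: idle connections kept around
        db.SetConnMaxLifetime(1 * time.Hour) // maxConnLifetime: recycle connections
                                             // before the other end can close them
    }

With a bounded lifetime, the pool retires connections regularly instead of handing out one that the proxy or server may have already dropped.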

After a redeploy, everything has been running fine so far. Where my client app previously timed out after 60 seconds, it now completes in 1.2 seconds.

A glance at the docs and the Helm chart shows that neither mentions these SQL settings:

To conclude: always configure the connection pool for MySQL/PostgreSQL in a production environment!
