Temporal issue after upgrade 1.13.1 => 1.18.4

Hi all,
Right after upgrading Temporal from 1.13.1 to 1.18.4 using the following steps:

  • 1.13.1 → 1.15.2: update schema from 1.6 → 1.7
  • 1.15.2 → 1.18.4: update schema from 1.7 → 1.8

I’m facing the following issues:

1- frontend errors are generating millions of log records

{
  "error": "UnhandledCommand",
  "level": "info",
  "logging-call-at": "metric_client.go:92",
  "service": "frontend",
  "service-error-type": "serviceerror.InvalidArgument",
  "ts": "2023-04-08T12:44:22.848Z"
}

2- This SQL query is executed between 5 million and 20 million times:

UPDATE
  `executions`
SET
  `db_record_version` = ?,
  `next_event_id` = ?,
  `last_write_version` = ?,
  `data` = ?,
  `data_encoding` = ?,
  `state` = ?,
  `state_encoding` = ?
WHERE
  `shard_id` = ?
  AND `namespace_id` = ?
  AND `workflow_id` = ?
  AND `run_id` = ?

3- Database storage usage jumped from 306G to 377G in 2 days

4- Replication lag from the master DB to the read replica has grown from 0 to about 1 day (Seconds_Behind_Master: 87600)
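
For scale, the figures in items 3 and 4 can be converted into a daily growth rate and a lag in hours. This is only a quick arithmetic sketch: it assumes growth was roughly linear over the 2 days, and the variable names are made up for illustration.

```python
# Figures taken from items 3 and 4 above.
storage_before_g = 306   # storage before the upgrade, in G
storage_after_g = 377    # storage 2 days later, in G
days = 2

# Assumption: growth was roughly linear over the period.
growth_per_day = (storage_after_g - storage_before_g) / days

lag_seconds = 87600      # Seconds_Behind_Master as reported
lag_hours = lag_seconds / 3600

print(f"storage growth: {growth_per_day:.1f} G/day")   # storage growth: 35.5 G/day
print(f"replication lag: {lag_hours:.1f} hours")       # replication lag: 24.3 hours
```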

My questions:

  1. Why is issue 1 happening, and how can I fix it? Does it indicate a real problem?
  2. Why is the query in issue 2 executed 5 million to 20 million times?
  3. Why has the database storage usage grown so much?

My assumption:

  • The sheer number of executions of this query is the main reason the storage is growing, which in turn causes the replication lag. I’m not sure, though, so I’d appreciate support from the community :smiley:

I’m using:
GCP MySQL version: 8.0.26
GKE: 1.24.10-gke.2300

Temporal doesn’t support upgrades that skip minor versions. Did you perform the 1.13.x → 1.14.x → 1.15.x → 1.16.x → 1.17.x → 1.18.x upgrade sequence?

See the upgrade guide.
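
As a rough illustration of the sequence above, the sketch below lists every minor release line between two versions. It assumes the rule is "one minor version per hop" within the same major version, as described in this reply. The function names are hypothetical, and the patch level is left as `x` because the sketch does not know which patch release to pick for each hop.

```python
from __future__ import annotations


def parse_minor(version: str) -> tuple[int, int]:
    """Return (major, minor) from a version string such as '1.13.1'."""
    major, minor, *_ = version.split(".")
    return int(major), int(minor)


def sequential_upgrade_path(current: str, target: str) -> list[str]:
    """List every minor release line from current to target, in order.

    Assumption: both versions share the same major version and each hop
    may advance by exactly one minor version.
    """
    cur_major, cur_minor = parse_minor(current)
    tgt_major, tgt_minor = parse_minor(target)
    if cur_major != tgt_major:
        raise ValueError("this sketch only handles upgrades within one major version")
    if tgt_minor < cur_minor:
        raise ValueError("target is older than current")
    # Patch level is left as 'x': choosing a patch release is out of scope here.
    return [f"{cur_major}.{m}.x" for m in range(cur_minor, tgt_minor + 1)]


path = sequential_upgrade_path("1.13.1", "1.18.4")
print(" -> ".join(path))
# 1.13.x -> 1.14.x -> 1.15.x -> 1.16.x -> 1.17.x -> 1.18.x
```

Any schema update that a given release requires would still need to be applied at the corresponding hop; the sketch only enumerates the hops.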

Hello @maxim,
The steps that we did:

  • 1.13.1 → 1.15.2: update schema from 1.6 → 1.7
  • 1.15.2 → 1.18.4: update schema from 1.7 → 1.8

Is there something wrong with my steps?

Thank you,
Chuong

Yes. You skipped a few versions, and it is not supported.
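
To make the skipped versions concrete, here is a minimal sketch that takes the hops reported in this thread and lists the minor release lines each hop jumped over. The helper names are hypothetical, and it assumes all versions share the same major version.

```python
from __future__ import annotations


def minor_of(version: str) -> int:
    """Return the minor component of a version string such as '1.13.1'."""
    return int(version.split(".")[1])


def skipped_minors(hops: list[str]) -> list[str]:
    """Return the minor release lines that a sequence of upgrade hops jumped over."""
    skipped = []
    # Walk consecutive (before, after) pairs of the upgrade sequence.
    for before, after in zip(hops, hops[1:]):
        major = before.split(".")[0]
        for m in range(minor_of(before) + 1, minor_of(after)):
            skipped.append(f"{major}.{m}.x")
    return skipped


# The hops described in the posts above.
hops = ["1.13.1", "1.15.2", "1.18.4"]
print(skipped_minors(hops))
# ['1.14.x', '1.16.x', '1.17.x']
```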