Upgrade Guidance - 1.6.3 to 1.9.2

Looking for a little guidance on updating our temporal-server from 1.6.3 to the latest 1.9.2. It is my understanding that the schema updates are backwards compatible, so if I am reading the release notes properly, since we are already running 1.6.3, I should only need to update the schema to version 1.7.0 and then 1.9.0 (both of which have schema changes), and then 1.9.2 should work just fine. Am I understanding that correctly?

We ensure that any consecutive versions are compatible in terms of database schema upgrades, 
features, and system behavior, however there is no guarantee that there is compatibility between any 
2 non-consecutive versions. 

However, the above from the documentation makes me think I am not correct? Or do I need to upgrade the schema AND version step by step?

Meaning, do I need to upgrade via Helm from 1.6.3 to 1.6.4 to 1.6.5 to 1.6.6, all the way up to 1.9.2, in order to guarantee all functionality? Do I need to run the postgres-upgrade tool with each version to support this as well, or just the versions I mentioned initially that contain schema changes?

Thanks for any pointers

EDIT: In my temporal database, I see schema_version shows 1.3, so I may need to make a stop at 1.5.0 as well since that includes schema changes?

1.6.x → 1.7.x: first upgrade main DB schema to 1.4, then upgrade to 1.7.x
1.7.x → 1.8.x: just upgrade server
1.8.x → 1.9.x: first upgrade main DB schema to 1.5, then upgrade to 1.9.x
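As a rough illustration of the three hops listed above, here is a minimal shell sketch that maps a target server version to the main DB schema version it expects. The function name `required_main_schema` is hypothetical, and the mapping covers only the versions named in this reply (1.8.x is assumed to stay on schema 1.4, since that hop needs no schema change).

```shell
#!/bin/sh
# Hypothetical helper: print the main DB schema version that the given
# Temporal server version expects, per the three hops in this thread.
required_main_schema() {
  case "$1" in
    1.7.*) echo "1.4" ;;   # 1.6.x -> 1.7.x: upgrade main DB schema to 1.4 first
    1.8.*) echo "1.4" ;;   # 1.7.x -> 1.8.x: server only, schema unchanged
    1.9.*) echo "1.5" ;;   # 1.8.x -> 1.9.x: upgrade main DB schema to 1.5 first
    *)
      echo "no mapping for server version $1" >&2
      return 1
      ;;
  esac
}

required_main_schema 1.9.2   # prints 1.5
```

Note that the argument is a server version while the output is a schema version; the two numbering schemes are independent.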

When upgrading the schema, use the pre-built Docker image with the corresponding release version tag.

admin tool docker image: Docker Hub


Thanks for the quick reply - This is helpful. I will report back if any issues!

Unfortunately, upgrading the schema seems to have recreated all our tables, losing our workflows.

Using temporalio/admin-tools:1.7.0, I set all my env vars and then ran /usr/local/bin/temporal-sql-tool update-schema -d /etc/temporal/schema/postgresql/v96/temporal/versioned, which seemed to start from scratch. Leaving out the details of each action it took, it looks like this migrated us from 0.0 onward:

Schema updated from 0.0 to 1.0
…
Schema updated from 1.0 to 1.1
…
Schema updated from 1.1 to 1.2
…
Schema updated from 1.2 to 1.3
…
Schema updated from 1.3 to 1.4

Updating the schema on temporal_visibility then fails with

---- Executing updates for version 1.2 ----
ALTER TABLE queue ADD message_encoding VARCHAR(16) NOT NULL DEFAULT 'Json';
error executing statement:pq: relation "queue" does not exist
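A run that begins at 0.0, as in the log above, suggests the tool did not pick up an existing schema version. One way to catch that early is to capture the tool's output and scan it for the messages quoted in this thread. This is only a sketch: the function name `started_from_scratch` and the log file are hypothetical, and it matches nothing beyond the two message strings seen here.

```shell
#!/bin/sh
# Hypothetical check: succeed (exit 0) if captured update-schema output
# contains either message from this thread that indicates a start at 0.0.
started_from_scratch() {
  # $1 = path to a file holding the captured tool output
  grep -q -e 'Setting initial schema version to 0\.0' \
          -e 'Schema updated from 0\.0 to ' "$1"
}

# Example of capturing output for the check (not executed here):
#   temporal-sql-tool update-schema -d <schema-dir> 2>&1 | tee update.log
#   started_from_scratch update.log && echo "WARNING: run started at 0.0"
```

An upgrade of an existing database would be expected to report its first step from the current version (for example 1.3 to 1.4), so a match here is a reason to stop and investigate.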

Is /usr/local/bin/temporal-sql-tool update-schema the only command that was executed?

docker run -it --entrypoint "" temporalio/admin-tools:1.7.0 /bin/sh

/etc/temporal # history
0 mkdir /etc/ssl/certs/<cert> -p
1 curl https://s3.amazonaws.com/<path-to-cert> -o /etc/ssl/certs/<cert>
2 export SQL_PLUGIN=postgres
3 export SQL_HOST=<postgres-dns>
4 export SQL_PORT=5432
5 export SQL_USER=<username>
6 export SQL_PASSWORD=<password>
7 export SQL_TLS=true
8 export SQL_TLS_ENABLE_HOST_VERIFICATION=false
9 export SQL_TLS_CA_FILE=/etc/ssl/certs/<cert>
10 SQL_DATABASE=temporal /usr/local/bin/temporal-sql-tool update-schema -d /etc/temporal/schema/postgresql/v96/temporal/versioned -v 1.7 -y

The above had an error of:

error listing schema dir:version dir not found for target version 1.7

but I also should have noticed it seemingly wanted to start from scratch with:

Setting initial schema version to 0.0

I removed:

-v 1.7 -y

and ran

11 SQL_DATABASE=temporal /usr/local/bin/temporal-sql-tool update-schema -d /etc/temporal/schema/postgresql/v96/temporal/versioned
12 SQL_DATABASE=temporal_visibility /usr/local/bin/temporal-sql-tool update-schema -d /etc/temporal/schema/postgresql/v96/temporal/versioned

you should only run /usr/local/bin/temporal-sql-tool update-schema -d /etc/temporal/schema/postgresql/v96/temporal/versioned

1.7 is the Temporal server version, not the schema version.
Please only use the above command.
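The "version dir not found for target version 1.7" error fits this explanation: the value given to -v is treated as a schema version and looked up under the versioned schema directory. A minimal pre-check could look like the sketch below. It assumes the versioned directory holds one subdirectory per schema version named `v<version>` (for example `v1.4`), which is an assumption about the image layout; the function name `has_schema_version` is hypothetical.

```shell
#!/bin/sh
# Hypothetical pre-check: succeed if a directory for the given schema
# version exists under the versioned schema path.
# Assumes subdirectories are named v<version>, e.g. v1.4.
has_schema_version() {
  # $1 = versioned schema dir, $2 = schema version such as 1.4
  [ -d "$1/v$2" ]
}

# Example (not executed here):
#   dir=/etc/temporal/schema/postgresql/v96/temporal/versioned
#   has_schema_version "$dir" 1.7 || echo "1.7 is not a schema version in $dir"
```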

The -y made it only a dry run though, right?

UpdateSchemeTask started, config=&{DBName: TargetVersion:1.7 SchemaDir:/etc/temporal/schema/postgresql/v96/temporal/versioned IsDryRun:true}

the dry run option is really misleading and we removed it:

Ah, ok, I found the -y for dry-run living on here

So, to make sure I am understanding correctly, when I ran:

/usr/local/bin/temporal-sql-tool update-schema -d /etc/temporal/schema/postgresql/v96/temporal/versioned -v 1.7 -y

I actually attempted to upgrade the schema from 1.2 to 1.7, because the -y for dry-run is broken/removed. And by doing so, that was not a compatible version hop so the migration started the db over from scratch?

Basically, the “dry run” option is not really a dry run: it will actually create a database and perform schema updates (a legacy from Cadence). Since this option is confusing, we removed it in a later release.

So to sum up, please only use
temporal-sql-tool update-schema -d /etc/temporal/schema/postgresql/v96/temporal/versioned
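Earlier in the thread the same temporal/versioned directory was also applied to the temporal_visibility database, which may explain the `relation "queue" does not exist` error there. The sketch below pairs each database with its own schema directory and prints the command instead of running it. The visibility path is an assumption (it mirrors the main path with `visibility` in place of `temporal`) and should be checked inside the image; the function names are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the versioned schema directory for a database.
schema_dir_for() {
  case "$1" in
    temporal)
      echo "/etc/temporal/schema/postgresql/v96/temporal/versioned" ;;
    temporal_visibility)
      # ASSUMED path; verify that it exists in the admin-tools image
      echo "/etc/temporal/schema/postgresql/v96/visibility/versioned" ;;
    *)
      echo "unknown database: $1" >&2
      return 1 ;;
  esac
}

# Print (do not run) the update command for one database.
print_update_cmd() {
  dir=$(schema_dir_for "$1") || return 1
  echo "SQL_DATABASE=$1 temporal-sql-tool update-schema -d $dir"
}

print_update_cmd temporal
print_update_cmd temporal_visibility
```

Printing the command first gives a chance to review the database and directory pairing before anything touches the schema.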

ref:

Thanks for the detail.
I will follow up here if I have further issues repairing this in the current env or migrating our other envs.

The rest of the upgrade went as expected, thank you