Temporal-sql-tool no longer in the temporalio/server docker image, should we use temporalio/admin-tools instead?

Gordon_Sun · July 13, 2021, 1:05am

It used to be in 1.8.1

docker run -it --rm temporalio/server:1.8.1 bash
bash-5.0# temporal-sql-tool
NAME:
   temporal-sql-tool - Command line tool for temporal sql operations

USAGE:
   temporal-sql-tool [global options] command [command options] [arguments...]

VERSION:
   0.0.1

COMMANDS:
   setup-schema, setup      setup initial version of sql schema
   update-schema, update    update sql schema to a specific version
   create-database, create  creates a database
   drop-database, drop      drops a database
   help, h                  Shows a list of commands or help for one command

GLOBAL OPTIONS:
   --endpoint value, --ep value            hostname or ip address of sql host to connect to (default: "127.0.0.1") [$SQL_HOST]
   --port value, -p value                  port of sql host to connect to (default: 3306) [$SQL_PORT]
   --user value, -u value                  user name used for authentication when connecting to sql host [$SQL_USER]
   --password value, --pw value            password used for authentication when connecting to sql host [$SQL_PASSWORD]
   --database value, --db value            name of the sql database (default: "temporal") [$SQL_DATABASE]
   --plugin value, --pl value              name of the sql plugin (default: "mysql") [$SQL_PLUGIN]
   --quiet, -q                             Don't set exit status to 1 on error
   --connect-attributes value, --ca value  sql connect attributes [$SQL_CONNECT_ATTRIBUTES]
   --tls                                   enable TLS over sql connection [$SQL_TLS]
   --tls-cert-file value                   sql tls client cert path (tls must be enabled) [$SQL_TLS_CERT_FILE]
   --tls-key-file value                    sql tls client key path (tls must be enabled) [$SQL_TLS_KEY_FILE]
   --tls-ca-file value                     sql tls client ca file (tls must be enabled) [$SQL_TLS_CA_FILE]
   --tls-enable-host-verification          sql tls verify hostname and server cert (tls must be enabled) [$SQL_TLS_ENABLE_HOST_VERIFICATION]
   --help, -h                              show help
   --version, -v                           print the version

But now in 1.10.5

docker run -it --rm temporalio/server:1.10.5 bash
bash-5.0# temporal-sql-tool
bash: temporal-sql-tool: command not found
bash-5.0#

Are we supposed to use temporalio/admin-tools ? I think that is not a good idea. How can we ensure commands executed from temporalio/admin-tools will work with temporalio/server ? We need to make sure they have the same version number. But before we do not need to care about the version number because I’m going onto temporalio/server and of course the temporal-sql-tool result will work with itself.

Thanks

derek · July 13, 2021, 9:45pm

when we upgrade schema, that needs to be done before the new version of temporal-server is deployed. we’d get into a chicken and egg problem with needing the new schema before the new server and not being able to run the new server without the new schema - so there’s a need for these to be decoupled.

additionally, it would be ideal if temporal server containers didn’t have any extra tools / shells / whatever for security purposes.

EDIT: also the answer to your question is - yes, use the proper version of admin tools for your schema needs.

Gordon_Sun · July 13, 2021, 11:33pm

Thank you Derek for the reply but I disagree with some of the points.
I have Temporal running in K8s cluster as deployment and I have init containers for the pod which I use the same temporalio/server docker image, but run temporal-sql-tool. So it is after the schemas got updated then the new server is deployed. I think there is a little down time right after I upgraded the schema and it’s still the old service running.
I will switch to use temporalio/admin-tools but feels a little strange, and it’s one extra image to download for that pod on that node.

derek · July 13, 2021, 11:54pm

Totally understand the convenience of that - you could build your own images that are a combination to maintain that workflow.

We do want to favor a zero downtime upgrade approach though - so doing an online schema upgrade is an important factor.

Developer and operator experience is also very important for sure - would having multiple options of images be interesting or confusing?

Gordon_Sun · July 14, 2021, 12:02am

Could you explain how to exactly can we achieve zero downtime when upgrade?

Gordon_Sun · July 14, 2021, 12:15am

Also just realized that due to some strange setup for temporalio/admin-tools , it is not possible to ask K8s to run that docker image but run the temporal-sql-tool command. We might need to overwrite the entry point or something, still looking for solutions

derek · July 14, 2021, 9:38pm

can you be a little more specific about the behavior you are seeing?

and what you would expect instead?

Gordon_Sun · July 14, 2021, 10:54pm

I expected the tool to be in temporalio/server but now it is only in temporalio/admin-tools
I have successfully switched to use temporalio/admin-tools. I can explain the problem I encountered when do the switch. For temporalio/server I can pass in the following args like this in my CDK code

args: [
  'sh',
  '-c',
  'temporal-sql-tool --ca tx_isolation=READ-COMMITTED -ep $MYSQL_HOST -u $MYSQL_USER -pw $MYSQL_PASSWORD create --db temporal ; true'
]

But for temporalio/admin-tools it has to be like this:

command: [
  'sh'
],
args: [
  '-c',
  'temporal-sql-tool --ca tx_isolation=READ-COMMITTED -ep $MYSQL_HOST -u $MYSQL_USER -pw $MYSQL_PASSWORD create --db temporal ; true'
]

I do not know why

derek · July 15, 2021, 4:30pm

The difference is because the admin tools container is currently set up to run and do nothing until something is exec’d on it…

The old docker file had an entrypoint that literally executed whatever you ran as it’s own command after doing some setup:

This is changed in the recent version of temporal server and only allows certain things to be passed as args and executed rather than arbitrary commands.

github.com

temporalio/temporal/blob/master/docker/entrypoint.sh#L24

    
      
          
          
dockerize -template ./config/config_template.yaml:./config/docker.yaml
          
          
# Automatically setup Temporal Server (databases, Elasticsearch, default namespace) if "autosetup" is passed as an argument.
          for arg in "$@" ; do [[ ${arg} == "autosetup" ]] && ./auto-setup.sh && break ; done
          
          
# Setup Temporal Server in development mode if "develop" is passed as an argument.
          for arg in "$@" ; do [[ ${arg} == "develop" ]] && ./setup-develop.sh && break ; done
          
          
# Run bash instead of Temporal Server if "bash" is passed as an argument (convenient to debug docker image).
          for arg in "$@" ; do [[ ${arg} == "bash" ]] && bash && exit 0 ; done
          
          
./start-temporal.sh

The admin tools container entrypoint runs tail -f /dev/null - the short story is that the behavior you are observing is expected.

Gordon_Sun · July 15, 2021, 10:29pm

Thank you @derek. I’m more interested in the ‘zero downtime upgrade’ you mentioned above. How exactly should we do it?

derek · July 15, 2021, 11:02pm

The normal process would be, while workloads are running:

run schema upgrade tool to upgrade to next schema version
- running temporal services are guaranteed to be forward compatible
once schema upgrade is complete, rollout out new versions of services
- you mentioned k8s - this can be done easily if you use deployments: simply bump the version of the image and apply.

Depending on the number of pods you have running for each service, you may want to tweak initialDelaySeconds for liveness check to put some delay between restarting pods. Can also set max surge and max unavailabe to tune how many replicas are rolled out at once - rollout will take longer but go smoother if a new version is added, takes a minute or so to stabilize, and then an old pod is killed (rinse,repeat).

following this process should give you a smooth upgrade without downtime.

one more note - upgrades must be to monotonically increasing minor versions. do not skip minor versions even if there isn’t a schema upgrade.

Gordon_Sun · July 19, 2021, 10:47pm

@derek , we jumped from 1.8.1 directly to 1.10.5 . Everything seems working fine. Are you saying this is wrong, that we should deploy 1.8.2, then 1.8.3, then … and go through all the versions until we reach 1.10.5 ?

derek · July 19, 2021, 11:34pm

It is recommended to upgrade through minor versions. so if you were on 1.8.x you would upgrade to the latest 1.9.x and then to the latest 1.10.x

My understanding is that we just don’t guarantee that it will work to skip minor versions - I’m not sure if it will always break something.

I want to say that if things are working for you all is well but @Wenquan_Xing or @samar might be better equipped to answer that question.

Gordon_Sun · July 19, 2021, 11:48pm

Got it. Thank you very much @derek. We will try to upgrade more frequently

sagardeo21 · October 27, 2021, 3:36pm

@derek I am trying to create db using the temporal-sql-tool from admin-tools.
For some reason, it is selecting SQL_USER as the host instead of SQL_HOST & errors out with no such host.
Have you encountered this before?

derek · November 16, 2021, 5:24pm

sorry for the very late reply here - i have not noticed that. what version are you using? is it still an issue?

Topic		Replies	Views
Upgrade Guidance - 1.6.3 to 1.9.2 Community Support upgrading	13	2198	May 25, 2021
How to automate the setup when use separated mysql (or any db)? Community Support mysql	14	2364	April 27, 2021
Does Temporal upgrade DB schemas automatically? Community Support	9	1573	April 3, 2023
Install and migrate temporal without helm chart Community Support	7	1629	March 9, 2021
ERROR "sql schema version compatibility check failed: unable to read DB schema version keyspace/database: temporal error: pq: relation "schema_version" does not exis" Community Support	2	4500	February 9, 2023

Temporal-sql-tool no longer in the temporalio/server docker image, should we use temporalio/admin-tools instead?

Related topics