How can I scale Temporal on EC2?

My server is running fine on one EC2 instance and my client UI is able to access it. (I am running the auto-setup image that comes by default in the Temporal GitHub repo.)

Also, why do my workflows stay in the Running state? The behaviour is very random: some workflows complete in an instant, some take 2 minutes, and some never complete, even when they are run in parallel. Does this have to do with EC2?

How do I scale to multiple instances, so that if one instance goes down, another instance is able to continue the workflow? Is that possible on EC2?
Thank you.

Do you have access to the Temporal server logs? (If you are using the auto-setup image, which by the way is not really recommended for anything but testing, you would have all service logs together.) Check for any frontend service errors. Do you have server metrics enabled, and are you scraping them?

why do my workflows stay in the Running state? The behavior is very random: some workflows complete in an instant, some take 2 minutes, and some never complete.

How do you deploy your worker(s)? Can you show the workflow history of one of these workflows that take longer to complete or never complete?

how do I scale to multiple instances, so that if one instance goes down, another instance is able to continue

You would deploy multiple worker processes (the workers that you create and that run your workflow code). Is this the question?

OK, so please ignore the question, since we decided we are not going forward with EC2.

We tried running the GitHub repo you shared

and followed the steps:

docker network create temporal-network
docker compose -f docker-compose-postgres.yml -f docker-compose-services.yml up

But the containers for the services and admin-tools keep restarting.

It's throwing the following error (I just copied a sample, let me know if you need the full log):

temporal-worker | Unable to start server. Error: could not build arguments for function "go.uber.org/fx".(*App).constructCustomLogger.func2 (/go/pkg/mod/go.uber.org/fx@v1.17.1/app.go:415): failed to build fxevent.Logger: could not build arguments for function "go.temporal.io/server/temporal".glob..func8 (/home/builder/temporal/temporal/fx.go:921): failed to build log.Logger: received non-nil error from function "go.temporal.io/server/temporal".ServerOptionsProvider (/home/builder/temporal/temporal/fx.go:163): sql schema version compatibility check failed: unable to read DB schema version keyspace/database: temporal error: pq: relation "schema_version" does not exist

temporal-worker | [Fx] PROVIDE *pprof.PProfInitializerImpl <= go.temporal.io/server/common/pprof.NewInitializer()
temporal-worker | [Fx] PROVIDE *temporal.ServerImpl <= go.temporal.io/server/temporal.NewServerFxImpl()
temporal-worker | [Fx] PROVIDE temporal.Server <= go.temporal.io/server/temporal.glob..func9()
temporal-worker | [Fx] SUPPLY temporal.ServerOption

for all services.

Please tell me if I am missing some steps. I thought this was supposed to run out of the box.
Thanks for your help.

Question 2: how do I run it if I want to run it with Cassandra only?

I'm not able to reproduce the issue locally. Are you getting this after cleaning up Docker (system and volume prune)? For Cassandra, I assume you just want to connect to an existing Cassandra instance, not run it in a container like done here for example. Can you confirm?
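For reference, a full cleanup could look roughly like the sketch below. It assumes the two compose files and the network name used earlier in this thread, and the `run`, `reset_stack`, and `DRY_RUN` names are only illustrative helpers, not part of the repo. Note that this deletes the database volume, so it is only suitable for a test setup. By default the commands are printed rather than executed.

```shell
# Set DRY_RUN=0 to really execute; by default each command is only printed.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Compose files taken from the commands earlier in this thread.
COMPOSE_FILES="-f docker-compose-postgres.yml -f docker-compose-services.yml"

reset_stack() {
  # Stop the stack and delete the volumes it declared, including the database volume.
  run docker compose $COMPOSE_FILES down --volumes
  # Remove stopped containers, unused networks, dangling images and build cache.
  run docker system prune --force
  # Remove remaining unused volumes.
  run docker volume prune --force
  # system prune may have removed the unused network, so create it again
  # (this step reports an error if the network still exists; that is harmless).
  run docker network create temporal-network
  # Bring everything back up from a clean state.
  run docker compose $COMPOSE_FILES up
}
```

Calling `reset_stack` prints the five commands so they can be reviewed first; setting `DRY_RUN=0` before calling it executes them. Depending on the Docker version, `docker volume prune` may skip named volumes, which is why the sketch relies on `down --volumes` to remove the database volume.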

I am testing it in a Windows environment. To check whether this could be an issue with my system, I ran it on another system, but it gave the same errors.

I cleaned up everything, recloned your repo and ran it again with the following commands:

docker network create temporal-network
docker compose -f docker-compose-postgres.yml -f docker-compose-services.yml up

All 4 services plus admin-tools just keep restarting.

They give the following errors:
temporal-admin-tools | exec /etc/temporal/setup.sh: no such file or directory
.
.
temporal-postgresql | 2022-12-05 11:15:07.609 UTC [47] ERROR: relation "schema_version" does not exist at character 26
temporal-postgresql | 2022-12-05 11:15:07.609 UTC [47] STATEMENT: SELECT curr_version from schema_version where version_partition=0 and db_name=

.
.
.
temporal-frontend | Unable to start server. Error: could not build arguments for function "go.uber.org/fx".(*App).constructCustomLogger.func2 (/go/pkg/mod/go.uber.org/fx@v1.17.1/app.go:415): failed to build fxevent.Logger: could not build arguments for function "go.temporal.io/server/temporal".glob..func8 (/home/builder/temporal/temporal/fx.go:921): failed to build log.Logger: received non-nil error from function "go.temporal.io/server/temporal".ServerOptionsProvider (/home/builder/temporal/temporal/fx.go:163): sql schema version compatibility check failed: unable to read DB schema version keyspace/database: temporal error: pq: relation "schema_version" does not exist
.
.
.

temporal-frontend2 | [Fx] ERROR Failed to start: could not build arguments for function "go.uber.org/fx".(*App).constructCustomLogger.func2 (/go/pkg/mod/go.uber.org/fx@v1.17.1/app.go:415): failed to build fxevent.Logger: could not build arguments for function "go.temporal.io/server/temporal".glob..func8 (/home/builder/temporal/temporal/fx.go:921): failed to build log.Logger: received non-nil error from function "go.temporal.io/server/temporal".ServerOptionsProvider (/home/builder/temporal/temporal/fx.go:163): sql schema version cere version_partition=0 and db_name=$1
.
.
.

I am absolutely new to Temporal, if that helps.
Am I missing a step? Are there prerequisites? Do I have to build the schema or something?

Thanks for the info. OK, let’s try to figure out what’s going on.

temporal-admin-tools | exec /etc/temporal/setup.sh: no such file or directory

The service definition for admin-tools mounts a custom script as a volume here. I wonder why that would be failing, since this script is in the repo.

Could you open a bash shell in your admin-tools container and see if the script is really not there, or maybe at some other location?

docker ps
(find the container id for admin-tools image)
docker exec -it <container_id> bash
cd /etc/temporal/

and see if the script was mounted there. If not, then we need to see why; maybe some sort of permissions issue?
If the script is there, check its permissions and try to run it manually in the admin-tools container:

/etc/temporal/setup.sh
or
bash /etc/temporal/setup.sh

Also, if you are not on the latest master branch, please update your local repo. I added some updates for nginx support recently, so let’s make sure we are both on the latest version of the repo.

Hey

The admin-tools container keeps restarting.

Error response from daemon: Container a0d19fd05b3b2758699bbf616e3e9c83df383b460b252642de29f51f5c44ad90 is restarting, wait until the container is running

I haven't moved any files; setup.sh is in the /script folder.

Do I have to create, move or rename any directories or files?

Can you try looking at its logs:

docker logs <container_id>

I’m not sure how relative paths and volumes work on Windows, honestly, specifically ./script/setup.sh here. I'm seeing stuff like this and wonder whether it could be some sort of path / permission problem.

I had the same issue trying to get it working on a Windows machine. Converting the file my-temporal-dockercompose\script\setup.sh from DOS to Unix format fixed it for me.
You can, for example, use dos2unix setup.sh (in WSL) or Notepad++: Edit → EOL Conversion → Unix.
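If dos2unix is not available, the same check and conversion can be sketched with standard tools. This is a minimal sketch, not something from the repo: `has_crlf` and `to_unix` are made-up helper names. It would also explain the error message, assuming the script starts with a shebang line: when that line ends in a carriage return, the kernel looks for an interpreter whose name includes the carriage return, cannot find it, and exec reports "no such file or directory" even though the script itself is present.

```shell
# Carriage return character, built portably (no bash-only $'\r' needed).
CR=$(printf '\r')

# has_crlf FILE: succeeds if FILE contains at least one carriage return.
has_crlf() {
  grep -q "$CR" "$1"
}

# to_unix FILE: strips a trailing carriage return from every line, in place.
to_unix() {
  tmp_file=$(mktemp) || return 1
  # Write the converted text to a temp file, then copy it back over the
  # original so the original file's permissions (e.g. the executable bit) stay.
  sed "s/$CR\$//" "$1" > "$tmp_file" && cat "$tmp_file" > "$1"
  status=$?
  rm -f "$tmp_file"
  return $status
}

# Example (run from the repo root in WSL or Git Bash):
#   if has_crlf script/setup.sh; then to_unix script/setup.sh; fi
```

After converting, recreate the containers so the fixed script is mounted again. Git for Windows can convert line endings to CRLF on checkout, so the problem may return after a fresh clone unless that conversion is disabled for shell scripts.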
