How to start multiple Temporal servers without Kubernetes

I tried to run a second instance of Temporal on another server and observed strange messages in the log.

  1. On host with 192.168.5.7:

    docker run --rm -d \
      --network host \
      --env CASSANDRA_SEEDS="192.168.5.4,192.168.5.5,192.168.5.6" \
      --env TEMPORAL_BROADCAST_ADDRESS=192.168.5.7 \
      --env BIND_ON_IP=192.168.5.7 \
      --env NUM_HISTORY_SHARDS=512 \
      temporalio/auto-setup:1.3.2

  2. On host with 192.168.5.8:

    docker run --rm -d \
      --network host \
      --env CASSANDRA_SEEDS="192.168.5.4,192.168.5.5,192.168.5.6" \
      --env TEMPORAL_BROADCAST_ADDRESS=192.168.5.8 \
      --env BIND_ON_IP=192.168.5.8 \
      --env NUM_HISTORY_SHARDS=512 \
      temporalio/auto-setup:1.3.2
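
Since the per-node command differs only in the node's own IP, a small wrapper script can reduce copy-paste when adding further nodes. This is just a sketch: the script name and the NODE_IP argument are placeholders, and the Cassandra seeds and image tag are copied from the commands above. Note that NUM_HISTORY_SHARDS must be identical on every node and cannot be changed after the cluster is first set up.

    #!/bin/sh
    # start-temporal-node.sh (hypothetical helper): start one Temporal node on this host.
    # Usage: ./start-temporal-node.sh <this host's IP>, e.g. ./start-temporal-node.sh 192.168.5.7
    NODE_IP="$1"

    docker run --rm -d \
      --network host \
      --env CASSANDRA_SEEDS="192.168.5.4,192.168.5.5,192.168.5.6" \
      --env TEMPORAL_BROADCAST_ADDRESS="$NODE_IP" \
      --env BIND_ON_IP="$NODE_IP" \
      --env NUM_HISTORY_SHARDS=512 \
      temporalio/auto-setup:1.3.2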

Then on the 1st host (192.168.5.7) I observe log messages related to various shard IDs, for example shard-id 107:

{"level":"info","ts":"2020-12-02T10:00:14.314Z","msg":"Close shard","service":"history","shard-id":107,"address":"192.168.5.7:7234","shard-item":"0xc000c30180","logging-call-at":"context_impl.go:809"}
{"level":"error","ts":"2020-12-02T10:00:14.314Z","msg":"Error updating ack level for shard","service":"history","shard-id":107,"address":"192.168.5.7:7234","shard-item":"0xc000c30180","component":"transfer-queue-processor","cluster-name":"active","error":"Failed to update shard.  previous_range_id: 7, columns: (range_id=8)","operation-result":"OperationFailed","logging-call-at":"queueAckMgr.go:224","stacktrace":"go.temporal.io/server/common/log/loggerimpl.(*loggerImpl).Error\n\t/temporal/common/log/loggerimpl/logger.go:138\ngo.temporal.io/server/service/history.(*queueAckMgrImpl).updateQueueAckLevel\n\t/temporal/service/history/queueAckMgr.go:224\ngo.temporal.io/server/service/history.(*queueProcessorBase).processorPump\n\t/temporal/service/history/queueProcessor.go:242"}
{"level":"info","ts":"2020-12-02T10:00:14.314Z","msg":"none","service":"history","component":"shard-controller","address":"192.168.5.7:7234","lifecycle":"Stopping","component":"shard","shard-id":107,"logging-call-at":"controller_impl.go:224"}
{"level":"info","ts":"2020-12-02T10:00:14.314Z","msg":"none","service":"history","shard-id":107,"address":"192.168.5.7:7234","shard-item":"0xc000c30180","lifecycle":"Stopped","component":"shard-item","number":290,"logging-call-at":"controller_impl.go:297"}
{"level":"info","ts":"2020-12-02T10:00:14.314Z","msg":"none","service":"history","shard-id":107,"address":"192.168.5.7:7234","shard-item":"0xc000c30180","lifecycle":"Stopping","component":"shard-engine","logging-call-at":"controller_impl.go:459"}
{"level":"info","ts":"2020-12-02T10:00:14.314Z","msg":"none","service":"history","shard-id":107,"address":"192.168.5.7:7234","shard-item":"0xc000c30180","component":"history-engine","lifecycle":"Stopping","logging-call-at":"historyEngine.go:387"}
{"level":"info","ts":"2020-12-02T10:00:14.314Z","msg":"none","service":"history","shard-id":107,"address":"192.168.5.7:7234","shard-item":"0xc000c30180","component":"transfer-queue-processor","cluster-name":"active","lifecycle":"Stopping","component":"transfer-queue-processor","logging-call-at":"queueProcessor.go:167"}
{"level":"info","ts":"2020-12-02T10:00:14.314Z","msg":"Queue processor pump shut down.","service":"history","shard-id":107,"address":"192.168.5.7:7234","shard-item":"0xc000c30180","component":"transfer-queue-processor","cluster-name":"active","logging-call-at":"queueProcessor.go:250"}
{"level":"info","ts":"2020-12-02T10:00:14.314Z","msg":"Task processor shutdown.","service":"history","shard-id":107,"address":"192.168.5.7:7234","shard-item":"0xc000c30180","component":"transfer-queue-processor","cluster-name":"active","logging-call-at":"taskProcessor.go:145"}
{"level":"info","ts":"2020-12-02T10:00:14.314Z","msg":"none","service":"history","shard-id":107,"address":"192.168.5.7:7234","shard-item":"0xc000c30180","component":"transfer-queue-processor","cluster-name":"active","lifecycle":"Stopped","component":"transfer-queue-processor","logging-call-at":"queueProcessor.go:180"}
{"level":"info","ts":"2020-12-02T10:00:14.314Z","msg":"Timer queue processor pump shutting down.","service":"history","shard-id":107,"address":"192.168.5.7:7234","shard-item":"0xc000c30180","component":"timer-queue-processor","cluster-name":"active","component":"timer-queue-processor","logging-call-at":"timerQueueProcessorBase.go:203"}
{"level":"info","ts":"2020-12-02T10:00:14.314Z","msg":"Timer processor exiting.","service":"history","shard-id":107,"address":"192.168.5.7:7234","shard-item":"0xc000c30180","component":"timer-queue-processor","cluster-name":"active","component":"timer-queue-processor","logging-call-at":"timerQueueProcessorBase.go:204"}
{"level":"info","ts":"2020-12-02T10:00:14.314Z","msg":"Task processor shutdown.","service":"history","shard-id":107,"address":"192.168.5.7:7234","shard-item":"0xc000c30180","component":"timer-queue-processor","cluster-name":"active","component":"timer-queue-processor","logging-call-at":"taskProcessor.go:145"}
{"level":"info","ts":"2020-12-02T10:00:14.314Z","msg":"Timer queue processor stopped.","service":"history","shard-id":107,"address":"192.168.5.7:7234","shard-item":"0xc000c30180","component":"timer-queue-processor","cluster-name":"active","component":"timer-queue-processor","logging-call-at":"timerQueueProcessorBase.go:184"}
{"level":"info","ts":"2020-12-02T10:00:14.314Z","msg":"none","service":"history","shard-id":107,"address":"192.168.5.7:7234","shard-item":"0xc000c30180","component":"history-engine","lifecycle":"Stopped","logging-call-at":"historyEngine.go:406"}
{"level":"info","ts":"2020-12-02T10:00:14.314Z","msg":"none","service":"history","shard-id":107,"address":"192.168.5.7:7234","shard-item":"0xc000c30180","lifecycle":"Stopped","component":"shard-engine","logging-call-at":"controller_impl.go:462"}

Is this the correct way to run a 2nd Temporal server?
Are these log messages expected?

tctl --ad 192.168.5.7:7233 adm cluster describe
{
  "supportedClients": {
    "temporal-cli": "\u003c2.0.0",
    "temporal-go": "\u003c2.0.0",
    "temporal-java": "\u003c2.0.0",
    "temporal-server": "\u003c2.0.0"
  },
  "serverVersion": "1.3.2",
  "membershipInfo": {
    "currentHost": {
      "identity": "192.168.5.7:7233"
    },
    "reachableMembers": [
      "192.168.5.7:6935",
      "192.168.5.8:6934",
      "192.168.5.8:6939",
      "192.168.5.7:6933",
      "192.168.5.7:6939",
      "192.168.5.8:6935",
      "192.168.5.7:6934",
      "192.168.5.8:6933"
    ],
    "rings": [
      {
        "role": "frontend",
        "memberCount": 2,
        "members": [
          {
            "identity": "192.168.5.7:7233"
          },
          {
            "identity": "192.168.5.8:7233"
          }
        ]
      },
      {
        "role": "history",
        "memberCount": 2,
        "members": [
          {
            "identity": "192.168.5.8:7234"
          },
          {
            "identity": "192.168.5.7:7234"
          }
        ]
      },
      {
        "role": "matching",
        "memberCount": 2,
        "members": [
          {
            "identity": "192.168.5.7:7235"
          },
          {
            "identity": "192.168.5.8:7235"
          }
        ]
      },
      {
        "role": "worker",
        "memberCount": 2,
        "members": [
          {
            "identity": "192.168.5.8:7239"
          },
          {
            "identity": "192.168.5.7:7239"
          }
        ]
      }
    ]
  }
}
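
As a quick cross-check (assuming the frontend on the second host listens on the same port, as the membership list above suggests), the same describe run against 192.168.5.8 should report identical rings:

    tctl --ad 192.168.5.8:7233 adm cluster describe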

Looks like it's expected. I started a 3rd server in the same way, and all of them are delivering tasks to workers.

Hi @sergle,
I wonder if you used the same setup in a prod environment.
We have a similar setup, and it keeps showing error logs similar to the one you posted.