Connection refused due to TransportError

I’m running Temporal on a t2.micro instance in AWS. The environment seems fine, since I can connect from my computer to the Temporal cluster on AWS using the Temporal CLI. However, when I try to connect from TypeScript, I get:

2023-12-28T15:07:47.057Z [INFO] webpack 5.89.0 compiled successfully in 417 ms
2023-12-28T15:07:47.059Z [INFO] Workflow bundle created { size: '0.83MB' }

/home/user/project/node_modules/@temporalio/worker/src/connection.ts:51
        throw new TransportError(err.message);
              ^
TransportError: tonic::transport::Error(Transport, hyper::Error(Connect, ConnectError("tcp connect error", Os { code: 111, kind: ConnectionRefused, message: "Connection refused" })))
    at Function.connect (/home/user/project/node_modules/@temporalio/worker/src/connection.ts:51:15)
    at Function.create (/home/user/project/node_modules/@temporalio/worker/src/worker.ts:455:47)

These are my TS connection options:

await Connection.connect({
  address: `genia-temporal:7233`,
  tls: {
    serverNameOverride: 'genia-temporal',
    serverRootCACertificate: await fs.readFile('/home/user/infra/certs/ca.cert'),
    clientCertPair: {
      crt: await fs.readFile('/home/user/infra/certs/client.pem'),
      key: await fs.readFile('/home/user/infra/certs/client.key'),
    },
  },
});

What I’ve done:

  • I’ve generated self-signed certificates as shown here.
  • I’ve deployed Temporal using Docker Compose, passing the following env vars to the temporal container:
      - "DB=postgres12"
      - "DB_PORT=5432"
      - "POSTGRES_USER=temporal"
      - "POSTGRES_PWD=${POSTGRES_ROOT_PASSWORD}"
      - "POSTGRES_SEEDS=postgresql"
      - "DYNAMIC_CONFIG_FILE_PATH=config/dynamicconfig/development-sql.yaml"
      - "TEMPORAL_TLS_SERVER_CA_CERT=${TEMPORAL_TLS_CERTS_DIR}/ca.cert"
      - "TEMPORAL_TLS_SERVER_CERT=${TEMPORAL_TLS_CERTS_DIR}/cluster.pem"
      - "TEMPORAL_TLS_SERVER_KEY=${TEMPORAL_TLS_CERTS_DIR}/cluster.key"
      - "TEMPORAL_TLS_REQUIRE_CLIENT_AUTH=true"
      - "TEMPORAL_TLS_FRONTEND_CERT=${TEMPORAL_TLS_CERTS_DIR}/cluster.pem"
      - "TEMPORAL_TLS_FRONTEND_KEY=${TEMPORAL_TLS_CERTS_DIR}/cluster.key"
      - "TEMPORAL_TLS_CLIENT1_CA_CERT=${TEMPORAL_TLS_CERTS_DIR}/ca.cert"
      - "TEMPORAL_TLS_INTERNODE_SERVER_NAME=genia-temporal"
      - "TEMPORAL_TLS_FRONTEND_SERVER_NAME=genia-temporal"
      - "TEMPORAL_TLS_FRONTEND_DISABLE_HOST_VERIFICATION=true"
      - "TEMPORAL_TLS_INTERNODE_DISABLE_HOST_VERIFICATION=true"
      - "TEMPORAL_CLI_ADDRESS=temporal:7233" # used by tctl. Will be deprecated
      - "TEMPORAL_CLI_TLS_CA=${TEMPORAL_TLS_CERTS_DIR}/ca.cert"
      - "TEMPORAL_CLI_TLS_CERT=${TEMPORAL_TLS_CERTS_DIR}/cluster.pem"
      - "TEMPORAL_CLI_TLS_KEY=${TEMPORAL_TLS_CERTS_DIR}/cluster.key"
      - "TEMPORAL_CLI_TLS_ENABLE_HOST_VERIFICATION=false"
      - "TEMPORAL_CLI_TLS_SERVER_NAME=genia-temporal"
      - "TEMPORAL_ADDRESS=temporal:7233" # used by Temporal CLI
      - "TEMPORAL_TLS_CA=${TEMPORAL_TLS_CERTS_DIR}/ca.cert"
      - "TEMPORAL_TLS_CERT=${TEMPORAL_TLS_CERTS_DIR}/cluster.pem"
      - "TEMPORAL_TLS_KEY=${TEMPORAL_TLS_CERTS_DIR}/cluster.key"
      - "TEMPORAL_TLS_ENABLE_HOST_VERIFICATION=false"
      - "TEMPORAL_TLS_SERVER_NAME=genia-temporal"

What I’ve checked:

  • I connected to the server on port 7233 via telnet and it works fine.
  • I verified the handshake without errors using
    openssl s_client -connect genia-temporal:7233 -showcerts -cert client.pem -key client.key -CAfile ca.cert -tls1_2
  • I checked the logs with LOG_LEVEL=debug on the temporal cluster but I see no errors.
  • I tried disabling hostname verification for the client certificate, since I’m connecting from home and the certificate’s CN/alt name is localhost, which won’t resolve to my IP address as seen from the AWS instance:
      - "TEMPORAL_TLS_FRONTEND_DISABLE_HOST_VERIFICATION=true"
      - "TEMPORAL_TLS_INTERNODE_DISABLE_HOST_VERIFICATION=true"
  • I tried connecting with the Temporal CLI and it appears to work fine:
    temporal operator cluster health --address genia-temporal:7233 --tls-cert-path client.pem --tls-key-path client.key --tls-ca-path ca.cert
    temporal.api.workflowservice.v1.WorkflowService: SERVING

Any ideas on how I can continue troubleshooting this? Thanks!

I just tried to connect to the cluster using the Python SDK and the same certificates and it works fine.

I’m using:

node: v18.18.2
@temporalio/activity: 1.8.6
@temporalio/client: 1.8.6
@temporalio/common: 1.8.6
@temporalio/worker: 1.8.6
@temporalio/workflow: 1.8.6

My bad. I forgot to pass the NativeConnection instance to the Worker.create factory method when constructing the worker. It turns out the connection property is optional and defaults to localhost:7233, so the worker was connecting to the wrong cluster 🤦.
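
For anyone hitting the same thing, one way to avoid repeating this mistake is a small guard that refuses to build worker options without an explicit connection. This is only a sketch: `requireConnection` and the `WorkerOptionsLike` shape are hypothetical helpers written for this post, not part of the Temporal SDK, and the only SDK behavior assumed is the one described above (an omitted `connection` falls back to localhost:7233).

```typescript
// Hypothetical stand-in for the relevant part of the SDK's worker options.
// The real type lives in @temporalio/worker; only these two fields matter here.
interface WorkerOptionsLike {
  taskQueue: string;
  connection?: unknown; // optional, which is what made the mistake easy to miss
}

// Returns the same options object, but fails fast when no connection was supplied,
// instead of letting the worker fall back to localhost:7233.
function requireConnection<O extends WorkerOptionsLike>(
  options: O,
): O & { connection: NonNullable<O['connection']> } {
  if (options.connection === undefined || options.connection === null) {
    throw new Error(
      `No connection supplied for task queue "${options.taskQueue}"; ` +
        'the worker would fall back to localhost:7233.',
    );
  }
  return options as O & { connection: NonNullable<O['connection']> };
}

// Example with a plain object standing in for a NativeConnection instance.
const checkedOptions = requireConnection({
  taskQueue: 'example-queue', // hypothetical task queue name
  connection: { address: 'genia-temporal:7233' },
});
console.log(checkedOptions.connection.address); // prints "genia-temporal:7233"
```

In a real worker the checked options would go straight into the factory, along the lines of `Worker.create(requireConnection({ connection, taskQueue: 'example-queue' }))`, where `connection` comes from `NativeConnection.connect` with the same address and tls options shown in the question.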

It’d be nice to have a more descriptive error to know what went wrong since the default one doesn’t give much detail.

TransportError: tonic::transport::Error(Transport, hyper::Error(Connect, ConnectError("tcp connect error", Os { code: 111, kind: ConnectionRefused, message: "Connection refused" })))

Well, it’s pretty self-explanatory but still took me some time to figure out, as I was obviously expecting the problem to be somewhere else lol.
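
On the wish for a more descriptive error: "Connection refused" is raised at the TCP level, before any TLS handshake starts, so the certificates cannot be the cause and the address being dialed is the thing to check. Below is a sketch of how a caller could attach that hint itself. `explainTransportError` is a hypothetical helper, not an SDK function, and it assumes the caller knows which address the worker was meant to use.

```typescript
// Hypothetical helper: appends a troubleshooting hint to a "Connection refused"
// transport error message. Other messages are returned unchanged.
function explainTransportError(message: string, intendedAddress: string): string {
  if (message.indexOf('ConnectionRefused') === -1) {
    return message;
  }
  return (
    `${message}\n` +
    'Hint: the TCP connection was refused before TLS started, so certificates are not the cause. ' +
    `Intended address: ${intendedAddress}. ` +
    'If the worker was created without an explicit connection, it may be dialing localhost:7233 instead.'
  );
}

// The error text from this thread, split only to keep the line short.
const originalMessage =
  'tonic::transport::Error(Transport, hyper::Error(Connect, ConnectError("tcp connect error", ' +
  'Os { code: 111, kind: ConnectionRefused, message: "Connection refused" })))';

console.log(explainTransportError(originalMessage, 'genia-temporal:7233'));
```

A caller could wrap the worker creation in try/catch and log `explainTransportError(err.message, address)` before rethrowing, so the intended address shows up next to the raw error.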