Dear Temporal Team,
I am writing to request your assistance with an issue that I am experiencing with my Temporal workflow application.
I have an application with the following Docker Compose setup:
- worker process
- Temporal server
- PostgreSQL, which the Temporal server depends on
- HTTP server, which starts a workflow execution on each request (it does not run a worker itself)
When I send requests to my HTTP server, it registers workflow executions with the Temporal server. I can see the worker process picking up the tasks, and the workflow completes successfully.
However, when I point the Temporal server at a remote database on AWS RDS (outside the Docker Compose setup), the worker process does not pick up any tasks.
Logs from the Temporal server show:
"level": "error",
"ts": "2023-04-30T14:23:18.121Z",
"msg": "Unable to call matching.PollWorkflowTaskQueue.",
"service": "frontend",
"wf-task-queue-name": "1@2ae85889e8b9-kyc_workflow_task_queue-085d25183cf34be2abfe066f1b1e7478",
"timeout": "1m9.999860541s",
"error": "error reading from server: read tcp 172.18.0.6:48830->172.18.0.6:7235: read: connection reset by peer",
"logging-call-at": "workflow_handler.go:912",
"stacktrace": "go.temporal.io/server/common/log
....
"level": "error",
"ts": "2023-04-30T15:02:32.383Z",
"msg": "Persistent fetch operation Failure",
"shard-id": 4,
"address": "172.18.0.11:7234",
"wf-namespace-id": "030f839d-06a2-4e81-af88-c8e7aa036bfa",
"wf-id": "gh_kyc_workflow_execution_GHA-312251100-4_2hxw4",
"wf-run-id": "86695d5c-ab1f-4e47-9692-e3876982ddee",
"store-operation": "get-wf-execution",
"error": "GetWorkflowExecution: failed to get buffered events. Error: getBufferedEvents operation failed. Select failed: context deadline exceeded",
"logging-call-at": "transaction_impl.go:465",
"stacktrace": "go.temporal.io/server/common/log.
...
Since the HTTP server successfully registers workflow executions, I concluded that communication between the Temporal server and my remote database is healthy. My guess is that the issue lies in some security check in the communication between the Temporal client and my AWS RDS instance.
I also verified that the task queue name is identical in the worker process and in the HTTP server process that starts the workflow.
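Because the second log entry ends in "context deadline exceeded" on a select, it may also be worth checking how long a plain TCP connection to the database takes from inside the Docker network. The script below is only a rough sketch using the Python standard library. The function name `tcp_connect_latency` and the endpoint in the comment are made up for illustration, and a fast TCP connect would not rule out slow queries or TLS problems.

```python
import socket
import time


def tcp_connect_latency(host, port, attempts=5, timeout=5.0):
    """Return one entry per attempt: connect time in seconds, or None on failure."""
    results = []
    for _ in range(attempts):
        start = time.monotonic()
        try:
            # Open and immediately close a TCP connection; no SQL is sent.
            with socket.create_connection((host, port), timeout=timeout):
                results.append(time.monotonic() - start)
        except OSError:
            # Covers refused connections, timeouts, and DNS failures.
            results.append(None)
    return results


# Example call (hypothetical endpoint, replace with the real RDS host):
# print(tcp_connect_latency("my-db.example.rds.amazonaws.com", 5432))
```

Running this from the Temporal server container and from the worker container would show whether either of them sees failed or unusually slow connections to the database.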
I would appreciate your assistance in identifying the issue and helping me resolve it so that my worker process can successfully pick up tasks from the Temporal server when using a remote RDS database.
Thank you in advance for your help.
Best regards