Architectural understanding of data persistence for visibility with Postgres and ES

Hi,

I’ve gotten Temporal running in my k8s cluster with both an external Postgres and an external Elasticsearch installation and I thought everything was running smoothly.

I have run through a couple of examples in the Go 101 tutorials and also played with namespaces via the CLI. Workflows don’t show results, and namespace deletions end in a gRPC timeout error.
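For reference, this is roughly how I’ve been poking at it from the CLI. The service name and port-forward are assumptions based on a default Helm release called “temporal”; adjust for your setup:

```
# Port-forward the frontend service from the Helm release (service name is an assumption)
kubectl port-forward svc/temporal-frontend 7233:7233 &

# Basic health check of the frontend gRPC endpoint
temporal operator cluster health --address localhost:7233

# List recent workflow executions in the default namespace; this is what visibility should return
temporal workflow list --namespace default --address localhost:7233
```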

I can see this in my Elasticsearch database:

[screenshot]

I also have no entries in the “temporal_visibility” tables in Postgres.
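For context, this is roughly how I’m comparing the two stores. The index name, database name, and hosts below are what the default Helm values gave me, so treat them as assumptions:

```
# What does the Elasticsearch visibility index hold?
curl -s "http://elasticsearch:9200/_cat/indices?v" | grep visibility
curl -s "http://elasticsearch:9200/temporal_visibility_v1_dev/_count"

# What do the SQL visibility tables in the temporal_visibility database hold?
psql -h postgres -U temporal -d temporal_visibility \
  -c "SELECT workflow_id, workflow_type_name, status, start_time FROM executions_visibility LIMIT 10;"
```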

My question is: how are these two data sources used by Temporal for workflow visibility? Or, more specifically, is the “temporal_visibility” table needed if Elasticsearch is working?

Should the namespace delete workflow (temporal-sys-delete-namespace-workflow) still be showing as running, even though the namespace is deleted?
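In case it’s useful, this is how I’ve been checking for it. The system workflows run in the internal temporal-system namespace, and the query below assumes the standard WorkflowType search attribute:

```
# Look for the delete-namespace system workflow in the internal temporal-system namespace
temporal workflow list \
  --namespace temporal-system \
  --query "WorkflowType='temporal-sys-delete-namespace-workflow'"
```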

I’m going to play with my Helm chart values to see if I can break my Temporal cluster more or (hopefully) less. Any insights on architecture you can share would be greatly appreciated.

Scott

Interestingly, I have this now in my UI.

[screenshot of the Temporal Web UI]

And I can see this error in my frontend pod:

{"level":"error","ts":"2024-04-11T07:34:23.280Z","msg":"service failures","operation":"OperatorDeleteNamespace","wf-namespace":"zeus-temp","error":"System Workflow with WorkflowId temporal-sys-delete-namespace-workflow and RunId b26b6963-5d24-430e-b216-0c5dc5c3198b returned an error: context deadline exceeded"}

Seems like gRPC isn’t communicating correctly?
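In case it helps anyone looking at this, here’s how I’m inspecting the failing system workflow and the services behind it. The deployment names assume a Helm release called “temporal”:

```
# Describe the system workflow the frontend is complaining about
temporal workflow describe \
  --namespace temporal-system \
  --workflow-id temporal-sys-delete-namespace-workflow \
  --run-id b26b6963-5d24-430e-b216-0c5dc5c3198b

# The worker service hosts the system workflow code and the history service manages its state,
# so their logs are the next place I'm looking
kubectl logs deploy/temporal-worker --tail=100
kubectl logs deploy/temporal-history --tail=100
```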

Scott

In the meantime, I’ve dug deeper into the Helm chart, and it seems the Postgres visibility store still needs to be present, despite Elasticsearch being available (do correct me if I am wrong).
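This is how I’m checking what the release was actually deployed with and what the cluster itself reports; the release name “temporal” is an assumption:

```
# See which visibility-related values the release was deployed with
helm get values temporal | grep -i -A 5 visibility

# Check which search attributes the cluster exposes for the namespace
temporal operator search-attribute list --namespace default
```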

So my question becomes: should anything be stored in the Postgres visibility store when workflows are created, and if yes, what would be stored there, or rather, what is its purpose?

And lastly, is there a known procedure for troubleshooting the “context deadline exceeded” errors?
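For completeness, these are the connectivity checks I’ve run so far. The labels, service names, and ports assume the defaults from the Helm chart, and nc may not be present in the server image:

```
# Are all service pods up and stable?
kubectl get pods -l app.kubernetes.io/name=temporal

# From the frontend pod, can the other services and the datastores be reached?
kubectl exec deploy/temporal-frontend -- nc -zv temporal-history 7234
kubectl exec deploy/temporal-frontend -- nc -zv temporal-matching 7235
kubectl exec deploy/temporal-frontend -- nc -zv postgres 5432
kubectl exec deploy/temporal-frontend -- nc -zv elasticsearch 9200
```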

Scott