Elasticsearch - ListWorkflowExecutions failed - No mapping found for [CloseTime]

We previously deployed a Temporal Cluster using the Helm charts and everything was fine. We are now moving to a phase of work in which we need advanced visibility (listing workflows by custom search attributes).

Today we attempted to add Elasticsearch as the visibility store: we deployed separate pods for Elasticsearch and hooked it all up through the Helm charts. However, nothing we did overcame the error we kept seeing, both via the UI and via tctl: ListWorkflowExecutions failed: elastic: Error 400 (Bad Request): all shards failed [type=search_phase_execution_exception], root causes: No mapping found for [CloseTime] in order to sort on [type=query_shard_exception]

We even nuked the deployment and the main persistence store (an RDS instance of Postgres) in case there was some indexing issue; we have the luxury of doing this as it's just greenfield work for now. It made no difference.

We searched the forum and elsewhere and tried various suggestions. One suggestion that came up was to create the mappings manually ourselves, but that seems like a maintenance headache going forward?

Querying the /_mapping endpoint we got the following:

{
    "temporal_visibility_v1_dev": {
        "mappings": {
            "properties": {
                "ExecutionStatus": {
                    "type": "text",
                    "fields": {
                        "keyword": {
                            "type": "keyword",
                            "ignore_above": 256
                        }
                    }
                },
                "ExecutionTime": {
                    "type": "date"
                },
                "NamespaceId": {
                    "type": "text",
                    "fields": {
                        "keyword": {
                            "type": "keyword",
                            "ignore_above": 256
                        }
                    }
                },
                "RunId": {
                    "type": "text",
                    "fields": {
                        "keyword": {
                            "type": "keyword",
                            "ignore_above": 256
                        }
                    }
                },
                "StartTime": {
                    "type": "date"
                },
                "TaskQueue": {
                    "type": "text",
                    "fields": {
                        "keyword": {
                            "type": "keyword",
                            "ignore_above": 256
                        }
                    }
                },
                "VisibilityTaskKey": {
                    "type": "text",
                    "fields": {
                        "keyword": {
                            "type": "keyword",
                            "ignore_above": 256
                        }
                    }
                },
                "WorkflowId": {
                    "type": "text",
                    "fields": {
                        "keyword": {
                            "type": "keyword",
                            "ignore_above": 256
                        }
                    }
                },
                "WorkflowType": {
                    "type": "text",
                    "fields": {
                        "keyword": {
                            "type": "keyword",
                            "ignore_above": 256
                        }
                    }
                }
            }
        }
    }
}

which seemed to be all the Storage mappings you get when you run tctl get-search-attributes..., but none of the default System search attributes (which is why it immediately complains about CloseTime not being mapped)…
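For anyone comparing their own /_mapping output, here is a minimal Python sketch of the check described above. It only inspects a mapping response that has already been fetched and parsed; the function names are hypothetical, and the list of expected fields is something you supply yourself (CloseTime is the only missing field named in the error). A "text" field with a "keyword" sub-field is what Elasticsearch's default dynamic mapping produces for string values, so seeing that shape on every string field may indicate the index was created by the first document write rather than from a predefined template. That is an assumption about this setup, not a confirmed diagnosis.

```python
from typing import Any, Dict, Iterable, List


def field_properties(mapping: Dict[str, Any], index: str) -> Dict[str, Any]:
    """Return the 'properties' section of a parsed /_mapping response for one index."""
    return mapping[index]["mappings"].get("properties", {})


def missing_fields(mapping: Dict[str, Any], index: str, expected: Iterable[str]) -> List[str]:
    """List the expected field names that have no mapping in the index."""
    props = field_properties(mapping, index)
    return sorted(name for name in expected if name not in props)


def dynamically_mapped_text_fields(mapping: Dict[str, Any], index: str) -> List[str]:
    """List fields mapped as 'text' with a 'keyword' sub-field.

    This is the shape Elasticsearch's default dynamic mapping gives to strings.
    """
    props = field_properties(mapping, index)
    return sorted(
        name
        for name, spec in props.items()
        if spec.get("type") == "text" and "keyword" in spec.get("fields", {})
    )


if __name__ == "__main__":
    # Trimmed copy of the mapping shown above.
    sample = {
        "temporal_visibility_v1_dev": {
            "mappings": {
                "properties": {
                    "ExecutionStatus": {
                        "type": "text",
                        "fields": {"keyword": {"type": "keyword", "ignore_above": 256}},
                    },
                    "StartTime": {"type": "date"},
                }
            }
        }
    }
    index = "temporal_visibility_v1_dev"
    print(missing_fields(sample, index, ["CloseTime", "StartTime"]))
    print(dynamically_mapped_text_fields(sample, index))
```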

We are using v1.21.1 of Temporal and could bump to the latest version if that is worth trying. We are using v8 of Elasticsearch.

Any ideas??

Update: tried a few things from here (some of the links don’t quite point to the right places, but we figured it out) and now the /_mapping is correct.

However, there seems to be some issue with actual persistence of workflows which may or may not be related. Looking into that currently.

So on the persistence layer, it all seems to be fine.

current_executions table:

  • temporal-sys-tq-scanner
  • temporal-sys-history-scanner
  • temporal-sys-add-search-attributes-workflow
  • onboarding-gkAsSqpFUEcVTb7IMiIo6 (I created this)

and then for Elasticsearch, /temporal_visibility_v1_dev/_search seems to return the relevant data too:

{
    "took": 0,
    "timed_out": false,
    "_shards": {
        "total": 1,
        "successful": 1,
        "skipped": 0,
        "failed": 0
    },
    "hits": {
        "total": {
            "value": 14,
            "relation": "eq"
        },
        "max_score": 1.0,
        "hits": [{
            "_index": "temporal_visibility_v1_dev",
            "_id": "temporal-sys-history-scanner~8581ea3a-3c09-4288-9c7d-1651f2879eda",
            "_score": 1.0,
            "_source": {
                "ExecutionStatus": "Running",
                "ExecutionTime": "2023-09-16T00:00:00.673916874Z",
                "NamespaceId": "32049b68-7872-4094-8e63-d0dd59896a83",
                "RunId": "8581ea3a-3c09-4288-9c7d-1651f2879eda",
                "StartTime": "2023-09-15T12:00:01.673916874Z",
                "TaskQueue": "temporal-sys-history-scanner-taskqueue-0",
                "VisibilityTaskKey": "216~5242953",
                "WorkflowId": "temporal-sys-history-scanner",
                "WorkflowType": "temporal-sys-history-scanner-workflow"
            }
        },
       ....
    {
        "_index": "temporal_visibility_v1_dev",
        "_id": "onboarding-gkAsSqpFUEcVTb7IMiIo6~3ab22198-fd20-4aaf-a8e9-89cf4fb00b5a",
        "_score": 1.0,
        "_source": {
            "BuildIds": [
                "unversioned",
                "unversioned:@temporalio/worker@1.8.2+dfc0e48fcf9fef5275a9f0336af1ea3398b7f4246c70877a36520a4013f0861c"
            ],
            "ExecutionStatus": "Running",
            "ExecutionTime": "2023-09-15T13:22:59.829190676Z",
            "FeasibilityCheckId": [
                "feasibility-101"
            ],
            "NamespaceId": "4ce75714-847a-4653-b670-0630b698612c",
            "RunId": "3ab22198-fd20-4aaf-a8e9-89cf4fb00b5a",
            "StartTime": "2023-09-15T13:22:59.829190676Z",
            "TaskQueue": "onboarding-local",
            "VisibilityTaskKey": "47~5242892",
            "WorkflowId": "onboarding-gkAsSqpFUEcVTb7IMiIo6",
            "WorkflowType": "onboardingWorkflow"
        }
    },
    ...

but nothing shows in the UI or via tctl when listing with a filter on WorkflowId, ExecutionStatus or my custom search attribute; the result is just empty.

I checked by completing a workflow I know I had just started via app code - by explicitly getting a handle using the workflow ID and sending a signal etc. All worked fine.

So it's definitely now just the list filter via Elasticsearch that is not working as expected (it returns nothing for any attribute I've tried), whether via tctl or the UI (which I guess uses the List filter via the visibility store).

I'm sure this query is OK, as I used it before introducing Elasticsearch: tctl --address <our-temporal-cluster>:7233 --ns workflow-service-local workflow l -q "ExecutionStatus='Running'"

Tried some things from this thread

  • tctl workflow describe --workflow_id <workflow_id> gives a result
  • tctl workflow list --query "WorkflowId=<workflow_id>" as explained above doesn’t return anything
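One way to narrow this down might be to send the same kind of exact-match filter straight to the /temporal_visibility_v1_dev/_search endpoint and see whether Elasticsearch itself finds the document. The Python sketch below only builds the request body and reads document IDs out of a parsed response; the helper names are hypothetical, and this is not necessarily the query Temporal itself sends.

```python
from typing import Any, Dict, List


def build_term_query(field: str, value: str) -> Dict[str, Any]:
    """Build a _search request body that matches one exact term on one field."""
    return {"query": {"term": {field: value}}}


def hit_ids(response: Dict[str, Any]) -> List[str]:
    """Extract the _id of every hit from a parsed _search response."""
    return [hit["_id"] for hit in response.get("hits", {}).get("hits", [])]


if __name__ == "__main__":
    # Request body to POST to /temporal_visibility_v1_dev/_search.
    print(build_term_query("WorkflowId", "onboarding-gkAsSqpFUEcVTb7IMiIo6"))

    # Trimmed copy of the search response shown earlier in this post.
    response = {
        "hits": {
            "total": {"value": 14, "relation": "eq"},
            "hits": [
                {
                    "_index": "temporal_visibility_v1_dev",
                    "_id": "onboarding-gkAsSqpFUEcVTb7IMiIo6~3ab22198-fd20-4aaf-a8e9-89cf4fb00b5a",
                    "_source": {"WorkflowId": "onboarding-gkAsSqpFUEcVTb7IMiIo6"},
                }
            ],
        }
    }
    print(hit_ids(response))
```

A term query compares against the indexed tokens, so on a field mapped as "text" an exact value may not match even though the document exists, whereas on a "keyword" field it should. If the direct query also comes back empty, that would point at the index mapping rather than at tctl or the UI.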

I may open a new support topic and just link to this one, as the issue is really about List filtering after introducing Elasticsearch and the title I gave this one is not a great one (it's what made sense at the time).

This is a big blocker for us, so we need to get it resolved ASAP.

Opened "List filter after introducing Elasticsearch visibility store not working on Web, binary or Typescript SDK" instead, so this can be archived.