Temporal keyword search attribute does not work

Hi!

I have a list of created search attributes:

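from temporalio.api.enums.v1 import IndexedValueType
from temporalio.api.operatorservice.v1 import AddSearchAttributesRequest
from temporalio.common import SearchAttributeKey

keys = [
    SearchAttributeKey.for_keyword(key)
    for key in ('inquiryId', 'personId', 'flowId', 'reservationId', 'workspaceId', 'segmentId')
]
# Register every key as a custom search attribute in the client's namespace
await client.operator_service.add_search_attributes(
    AddSearchAttributesRequest(
        namespace=client.namespace,
        search_attributes={
            key.name: IndexedValueType.ValueType(key.indexed_value_type)
            for key in keys
        },
    ),
)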

I create workflow execution with values for these attributes:

await self._temporal_client.start_workflow(
    ExecuteFlowWorkflow.run,
    flow,
    id=f"execute-flow-{uuid4()}",
    task_queue=self._temporal_settings.task_queue,
    search_attributes=TypedSearchAttributes([
        SearchAttributePair(SearchAttributeKey.for_text(f"flowId"), str(flow.id)),
        SearchAttributePair(SearchAttributeKey.for_text(f"personId"), str(person_id)),
        SearchAttributePair(SearchAttributeKey.for_text(f"reservationId"), str(reservation_id)),
        SearchAttributePair(SearchAttributeKey.for_text(f"workspaceId"), str(workspace_id)),
    ])
)
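
One thing worth checking here: the attributes were registered through SearchAttributeKey.for_keyword, while the start call builds its keys with SearchAttributeKey.for_text. The sketch below is not SDK code. It uses plain strings for the types and hypothetical names (registered, used_at_start, find_type_mismatches) only to show how such a mismatch could be spotted. Whether the mismatch explains the empty query results is not established in this thread.

```python
# Types declared when the attributes were registered (for_keyword in the first snippet).
registered = {
    name: "Keyword"
    for name in ("inquiryId", "personId", "flowId", "reservationId", "workspaceId", "segmentId")
}

# Types implied by the keys passed to start_workflow (for_text in the second snippet).
used_at_start = {
    name: "Text"
    for name in ("flowId", "personId", "reservationId", "workspaceId")
}


def find_type_mismatches(registered, used):
    """Return {name: (registered_type, used_type)} for every disagreement."""
    return {
        name: (registered.get(name), used_type)
        for name, used_type in used.items()
        if registered.get(name) != used_type
    }


mismatches = find_type_mismatches(registered, used_at_start)
for name, (declared, used) in sorted(mismatches.items()):
    print(f"{name}: registered as {declared}, started as {used}")
```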

If you print out all of the workflow executions, you’ll notice that these attributes have list values, which makes me somewhat suspicious:

async for workflow in temporal_client.list_workflows(query='WorkflowType="ExecuteFlowWorkflow"', page_size=10000):
    if workflow.status != WorkflowExecutionStatus.RUNNING:
        continue
    print(workflow.search_attributes)

# OUTPUT:
{'flowId': ['018fcbab-6642-7b8a-ab59-43ded6b563bb'], 'workspaceId': ['018c1bec-682c-7518-992a-ec875e8efe63'], 'reservationId': ['0190d10c-ba1e-759c-a9a1-590845cb120a'], 'personId': ['0190d10c-b8c7-7ad6-a0ff-cf5cd3d2b0a5']}

And when I try to get workflows with workspaceId = 018c1bec-682c-7518-992a-ec875e8efe63 I get nothing:

async for workflow in temporal_client.list_workflows(query=f'workspaceId="018c1bec-682c-7518-992a-ec875e8efe63"'):
    print(workflow)
# No output

It also does not work in the Temporal UI. What did I do wrong?

This property is deprecated. Use typed_search_attributes which has the search attributes in a temporalio.common.TypedSearchAttributes collection.

Can you clarify this a bit? What do you mean by “does not work”?

I expect that when I use the filter workspaceId="018c1bec-682c-7518-992a-ec875e8efe63", I’ll get a workflow back, but nothing is returned. If you print typed_search_attributes for the workflows that list_workflows returns without that filter, you can see that a workflow with this value exists:

    async for workflow in temporal_client.list_workflows(query='WorkflowType="ExecuteFlowWorkflow"', page_size=10000):
        if workflow.status != WorkflowExecutionStatus.RUNNING:
            continue
        print(workflow.typed_search_attributes)
        # OUTPUT: TypedSearchAttributes(search_attributes=[SearchAttributePair(key=_SearchAttributeKey(_name='BuildIds', _indexed_value_type=<SearchAttributeIndexedValueType.KEYWORD_LIST: 7>, _value_type=typing.Sequence[str]), value=['unversioned', 'unversioned:98eb85ed44ecd36564a464633f4cad02', 'unversioned:001698032a5e2c3a3134b811526ca16a']), SearchAttributePair(key=_SearchAttributeKey(_name='flowId', _indexed_value_type=<SearchAttributeIndexedValueType.KEYWORD: 2>, _value_type=<class 'str'>), value='018fcb83-6364-75c2-96cf-e077851b69dc'), SearchAttributePair(key=_SearchAttributeKey(_name='personId', _indexed_value_type=<SearchAttributeIndexedValueType.KEYWORD: 2>, _value_type=<class 'str'>), value='018fda5f-c5da-71ee-b77c-7bad6364653d'), SearchAttributePair(key=_SearchAttributeKey(_name='reservationId', _indexed_value_type=<SearchAttributeIndexedValueType.KEYWORD: 2>, _value_type=<class 'str'>), value='018fda60-d6f8-7c40-9353-557793e27226'), SearchAttributePair(key=_SearchAttributeKey(_name='workspaceId', _indexed_value_type=<SearchAttributeIndexedValueType.KEYWORD: 2>, _value_type=<class 'str'>), value='018c1bec-682c-7518-992a-ec875e8efe63')])
        # workspaceId is the last pair in the output above

    async for workflow in temporal_client.list_workflows(query=f'workspaceId="018c1bec-682c-7518-992a-ec875e8efe63"'):
        print("x")
        # 'x' is never printed

I also can’t find this workflow in the UI.

This appears no longer related to Python specifically, let me ask general support…

If you look at your execution’s event history, do you see this search attribute defined in the first event (WorkflowExecutionStarted)? Do you get results if you search by the other ones you define, like flowId or personId?

No, searching by the other attributes returns nothing either.

If you look at your execution’s event history, do you see this search attribute defined in the first event (WorkflowExecutionStarted)?

Yes, here’s a snippet of the output of the following code:

    async for workflow in temporal_client.list_workflows(query='WorkflowType="ExecuteFlowWorkflow"', page_size=10000):
        if workflow.status != WorkflowExecutionStatus.RUNNING:
            continue
        # print(workflow.typed_search_attributes)
        print(await temporal_client.get_workflow_handle(workflow.id).fetch_history())
WorkflowHistory(
    workflow_id='execute-flow-a43bb9a9-335e-42af-a9e9-335a2741359f', 
    events=[
        event_id: 1
        event_time {
            seconds: 1721780109
            nanos: 833876433
        }
        event_type: EVENT_TYPE_WORKFLOW_EXECUTION_STARTED
        task_id: 660605408
        workflow_execution_started_event_attributes {
            workflow_type {name: "ExecuteFlowWorkflow"}
            task_queue {name: "execute-flow-task-queue" kind: TASK_QUEUE_KIND_NORMAL}
        }
        input {
            payloads {
                metadata {
                    key: "encoding"
                    value: "json/plain"
                }
                data: <huge_payload>
            }
        }
        workflow_task_timeout {
            seconds: 10
        }
        original_execution_run_id: "77a7e294-2d00-402a-8c9a-b1410f69d92d"
        identity: "9@cdfeda5ed129"
        first_execution_run_id: "77a7e294-2d00-402a-8c9a-b1410f69d92d"
        attempt: 1
        first_workflow_task_backoff {}
        search_attributes {
            indexed_fields {
                key: "workspaceId"
                value {
                    metadata {
                    key: "type"
                    value: "Text"
                    }
                    metadata {
                    key: "encoding"
                    value: "json/plain"
                    }
                    data: ""018c1bec-682c-7518-992a-ec875e8efe63""
                }
            }
            indexed_fields {
                key: "reservationId"
                value {
                    metadata {
                    key: "type"
                    value: "Text"
                    }
                    metadata {
                    key: "encoding"
                    value: "json/plain"
                    }
                    data: ""0190e212-22d2-7816-b79a-6ed3a4d53ca6""
                }
            }
            indexed_fields {
                key: "personId"
                value {
                    metadata {
                    key: "type"
                    value: "Text"
                    }
                    metadata {
                    key: "encoding"
                    value: "json/plain"
                    }
                    data: ""0190e212-1f80-7b7a-bd1e-8d0b30f42314""
                }
            }
            indexed_fields {
                key: "flowId"
                value {
                    metadata {
                    key: "type"
                    value: "Text"
                    }
                    metadata {
                    key: "encoding"
                    value: "json/plain"
                    }
                    data: ""018fcbab-6642-7b8a-ab59-43ded6b563bb""
                }
            }
        }
        workflow_id: "execute-flow-a43bb9a9-335e-42af-a9e9-335a2741359f"
}
....

But the value of each attribute is in double quotes. Can this be the reason?

In my first message I use str(flow.id) as the attribute value. flow.id is a UUID, so I’m 100% sure it can’t contain double quotes.

I also use a custom pydantic data converter. Perhaps the quotes are being added there? Do the search attributes pass through this converter?
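
On the quotes: the payload metadata in the history says the encoding is json/plain, and a JSON-encoded string always carries its own double quotes. The sketch below uses only the standard json module, not the Temporal converter itself, so treat it as an illustration under that assumption. It shows that str(uuid) has no quotes of its own, its JSON form does, and decoding strips them again, which would be consistent with the unquoted values printed by typed_search_attributes.

```python
import json
import uuid

# The workspaceId value from the examples in this thread.
workspace_id = uuid.UUID("018c1bec-682c-7518-992a-ec875e8efe63")
raw = str(workspace_id)  # plain string, no quote characters

# JSON-encoding a string wraps it in double quotes.
encoded = json.dumps(raw).encode("utf-8")

# Decoding removes the quotes and returns the original string.
decoded = json.loads(encoded)

print(raw)
print(encoded)
print(decoded == raw)
```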

I looked at the format in which the data is stored in Elasticsearch.

Query:

GET http://<es_addr>/temporal_visibility_v1_dev/_search
{
	"query": {
		"match_all": {}
	}
}

Response:

{
	"took": 0,
	"timed_out": false,
	"_shards": {
		"total": 1,
		"successful": 1,
		"skipped": 0,
		"failed": 0
	},
	"hits": {
		"total": {
			"value": 2591,
			"relation": "eq"
		},
		"max_score": 1.0,
		"hits": [
			{
				"_index": "temporal_visibility_v1_dev",
				"_type": "_doc",
				"_id": "execute-flow-a43bb9a9-335e-42af-a9e9-335a2741359f~77a7e294-2d00-402a-8c9a-b1410f69d92d",
				"_score": 1.0,
				"_source": {
					"ExecutionStatus": "Running",
					"ExecutionTime": "2024-07-24T00:15:09.833876433Z",
					"NamespaceId": "b2151f68-654f-49a2-a996-4039e78296bc",
					"RunId": "77a7e294-2d00-402a-8c9a-b1410f69d92d",
					"StartTime": "2024-07-24T00:15:09.833876433Z",
					"TaskQueue": "execute-flow-task-queue",
					"VisibilityTaskKey": "1~660605411",
					"WorkflowId": "execute-flow-a43bb9a9-335e-42af-a9e9-335a2741359f",
					"WorkflowType": "ExecuteFlowWorkflow",
					"flowId": "018fcbab-6642-7b8a-ab59-43ded6b563bb",
					"personId": "0190e212-1f80-7b7a-bd1e-8d0b30f42314",
					"reservationId": "0190e212-22d2-7816-b79a-6ed3a4d53ca6",
					"workspaceId": "018c1bec-682c-7518-992a-ec875e8efe63"
				}
			},
			{
				"_index": "temporal_visibility_v1_dev",
				"_type": "_doc",
				"_id": "execute-flow-80815c40-e479-4b3d-875f-952f25913258~ec9d20a9-b993-472d-a97b-932d1a71176a",
				"_score": 1.0,
				"_source": {
					"ExecutionStatus": "Running",
					"ExecutionTime": "2024-07-24T00:15:09.790856572Z",
					"NamespaceId": "b2151f68-654f-49a2-a996-4039e78296bc",
					"RunId": "ec9d20a9-b993-472d-a97b-932d1a71176a",
					"StartTime": "2024-07-24T00:15:09.790856572Z",
					"TaskQueue": "execute-flow-task-queue",
					"VisibilityTaskKey": "1~660605406",
					"WorkflowId": "execute-flow-80815c40-e479-4b3d-875f-952f25913258",
					"WorkflowType": "ExecuteFlowWorkflow",
					"flowId": "018fcb83-6364-75c2-96cf-e077851b69dc",
					"personId": "0190e212-1f80-7b7a-bd1e-8d0b30f42314",
					"reservationId": "0190e212-22d2-7816-b79a-6ed3a4d53ca6",
					"workspaceId": "018c1bec-682c-7518-992a-ec875e8efe63"
				}
			},
...

Here they are without the double quotes. I tried searching for the workflow by workspaceId, but it didn’t come up in the response.
Request:

GET http://<es_addr>/temporal_visibility_v1_dev/_search
{
   "query": {
	 	"match": {
			"workspaceId": "018c1bec-682c-7518-992a-ec875e8efe63"
		}
	 }
}

Response:

{
	"took": 0,
	"timed_out": false,
	"_shards": {
		"total": 1,
		"successful": 1,
		"skipped": 0,
		"failed": 0
	},
	"hits": {
		"total": {
			"value": 0,
			"relation": "eq"
		},
		"max_score": null,
		"hits": []
	}
}

I then looked at the index schema and saw that these fields were not present.
Request:

GET http://<es_addr>/temporal_visibility_v1_dev

Response:

{
	"temporal_visibility_v1_dev": {
		"aliases": {},
		"mappings": {
			"dynamic": "false",
			"properties": {
				"BatcherNamespace": {
					"type": "keyword"
				},
				"BatcherUser": {
					"type": "keyword"
				},
				"BinaryChecksums": {
					"type": "keyword"
				},
				"BuildIds": {
					"type": "keyword"
				},
				"CloseTime": {
					"type": "date_nanos"
				},
				"ExecutionDuration": {
					"type": "long"
				},
				"ExecutionStatus": {
					"type": "keyword"
				},
				"ExecutionTime": {
					"type": "date_nanos"
				},
				"HistoryLength": {
					"type": "long"
				},
				"HistorySizeBytes": {
					"type": "long"
				},
				"NamespaceId": {
					"type": "keyword"
				},
				"RunId": {
					"type": "keyword"
				},
				"StartTime": {
					"type": "date_nanos"
				},
				"StateTransitionCount": {
					"type": "long"
				},
				"TaskQueue": {
					"type": "keyword"
				},
				"TemporalChangeVersion": {
					"type": "keyword"
				},
				"TemporalNamespaceDivision": {
					"type": "keyword"
				},
				"TemporalSchedulePaused": {
					"type": "boolean"
				},
				"TemporalScheduledById": {
					"type": "keyword"
				},
				"TemporalScheduledStartTime": {
					"type": "date_nanos"
				},
				"WorkflowId": {
					"type": "keyword"
				},
				"WorkflowType": {
					"type": "keyword"
				}
			}
		},
		"settings": {
			"index": {
				"routing": {
					"allocation": {
						"include": {
							"_tier_preference": "data_content"
						}
					}
				},
				"search": {
					"idle": {
						"after": "365d"
					}
				},
				"number_of_shards": "1",
				"auto_expand_replicas": "0-2",
				"provided_name": "temporal_visibility_v1_dev",
				"creation_date": "1721204508335",
				"sort": {
					"field": [
						"CloseTime",
						"StartTime",
						"RunId"
					],
					"missing": [
						"_first",
						"_first",
						"_first"
					],
					"order": [
						"desc",
						"desc",
						"desc"
					]
				},
				"number_of_replicas": "0",
				"uuid": "Zah3i9SIRnShpPJOcq2sdQ",
				"version": {
					"created": "7170999"
				}
			}
		}
	}
}
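
The check against the mapping above can be sketched as follows. This is a minimal stdlib example, assuming the index definition has already been fetched and parsed into a dict. The helper missing_fields and the abbreviated index_info literal are illustrative and not part of any Temporal or Elasticsearch client. With "dynamic": "false", Elasticsearch keeps unmapped fields in _source but does not index them, which would be consistent with match_all showing workspaceId while a match query on it returns no hits.

```python
def missing_fields(index_info, index_name, expected):
    """Return the expected field names that have no entry in the index mapping."""
    properties = index_info[index_name]["mappings"].get("properties", {})
    return [name for name in expected if name not in properties]


# Abbreviated copy of the mapping shown above (only a few properties kept).
index_info = {
    "temporal_visibility_v1_dev": {
        "mappings": {
            "dynamic": "false",
            "properties": {
                "WorkflowId": {"type": "keyword"},
                "WorkflowType": {"type": "keyword"},
                "RunId": {"type": "keyword"},
            },
        }
    }
}

custom = ["inquiryId", "personId", "flowId", "reservationId", "workspaceId", "segmentId"]
missing = missing_fields(index_info, "temporal_visibility_v1_dev", custom)
print(missing)
```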

In my first post you can see how I create the search attributes. Do I understand correctly that Temporal itself should have added these fields to the index schema in Elasticsearch?