Indexing workflows for search / filtering

mrsaints · August 18, 2020, 11:03am

This is not quite a support topic. I just wanted to bounce some ideas around, and figured others may have similar questions or ideas.

To search / filter workflows, the current recommendation, if I understand correctly, is to run Temporal with Kafka and Elasticsearch.

This makes sense for high-traffic applications, since both Kafka and Elasticsearch were built for scale.

For simpler use cases and/or lower-traffic applications, though, this feels like overkill. Neither Kafka nor Elasticsearch is readily available on every cloud provider, and both can be quite costly to run (with or without a managed service provider).

There seem to be a few longer-term options:

Possibly removing Kafka (Replace Kafka with internal visibility tasks queue by briancecker · Pull Request #295 · temporalio/temporal · GitHub)
Support searching / filtering (search attributes) with the SQL persistence as well (PostgreSQL can handle this quite well)
Similar to the previous point, supporting more search engines (e.g. MeiliSearch)

And for the short term, the suggestion, from what I have seen, is essentially to index the workflow through activities, i.e. making it a part of your domain or building a dedicated service to capture / index workflows.
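To make the "index through activities" idea concrete, here is a minimal sketch of what the body of such an indexing activity could look like. It is plain Python with SQLite standing in for whatever store you would actually use; the table name `workflow_index` and the functions `open_index`, `index_workflow`, and `find_by_status` are hypothetical and not part of Temporal or any SDK.

```python
import json
import sqlite3


def open_index(path=":memory:"):
    """Open (or create) the hypothetical side index used for searching."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS workflow_index ("
        " workflow_id TEXT PRIMARY KEY,"
        " workflow_type TEXT NOT NULL,"
        " status TEXT NOT NULL,"
        " attributes TEXT NOT NULL)"
    )
    return conn


def index_workflow(conn, workflow_id, workflow_type, status, attributes):
    """Body of a hypothetical indexing activity.

    The write replaces any existing row for the same workflow_id, so
    running it again with the same arguments leaves the index unchanged.
    """
    conn.execute(
        "INSERT OR REPLACE INTO workflow_index"
        " (workflow_id, workflow_type, status, attributes)"
        " VALUES (?, ?, ?, ?)",
        (workflow_id, workflow_type, status, json.dumps(attributes)),
    )
    conn.commit()


def find_by_status(conn, status):
    """Return the ids of all indexed workflows with the given status."""
    rows = conn.execute(
        "SELECT workflow_id FROM workflow_index"
        " WHERE status = ? ORDER BY workflow_id",
        (status,),
    )
    return [row[0] for row in rows]
```

A workflow would call something like `index_workflow` at each state change it wants to be searchable. Keeping the write idempotent matters here because an activity may be retried.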

Baking workflows into a specific domain, however, does not always feel like the right solution. Workflows can cross various domain boundaries, and I feel the beauty of it is that you can focus on the overall / high-level business logic (orchestration logic).
And building a dedicated service to capture workflows feels like the kind of boilerplate that a workflow engine like Temporal aims to reduce.

I’d be interested in knowing what y’all think, and how others are approaching this.

mrsaints · August 18, 2020, 11:03am

In hindsight, this may have been better posted at https://community.temporal.io/c/discuss/7

maxim · August 18, 2020, 4:52pm

Moved to discuss category.

We do plan to implement the following feature requests at some point:

Remove Kafka
Support advanced search attributes with MySQL/PostgreSQL.

The open question is what we should do in the future with workflow indexing.
The current approach is that Temporal has its own SQL-like predicate parser and converts queries to a technology-specific format. Initially, it was implemented because Elasticsearch didn’t provide open source SQL parser support.
I’m not sure if we should stay with this approach and extend the parser every time we get a new storage engine. This would allow keeping predicates technology independent. At the same time, it is kind of limiting, as it would reduce all the existing indexing engines to the lowest common denominator.
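As an illustration of the trade-off, here is a minimal sketch of a technology-independent predicate being converted to two backend formats. This is not Temporal's parser: the `Compare` and `And` types and both translator functions are hypothetical, and only `=`, `>`, `<` and `AND` are supported. The Elasticsearch side uses the standard `bool`/`filter`, `term`, and `range` query shapes.

```python
from dataclasses import dataclass
from typing import Any, Union


@dataclass(frozen=True)
class Compare:
    field: str
    op: str  # one of "=", ">", "<"
    value: Any


@dataclass(frozen=True)
class And:
    left: "Predicate"
    right: "Predicate"


Predicate = Union[Compare, And]

_ES_RANGE_OPS = {">": "gt", "<": "lt"}


def _check(pred):
    # Field names are interpolated into SQL text, so accept identifiers only.
    if not pred.field.isidentifier():
        raise ValueError(f"invalid field name: {pred.field!r}")
    if pred.op not in ("=", ">", "<"):
        raise ValueError(f"unsupported operator: {pred.op!r}")


def to_sql(pred):
    """Return (where_clause, params) with ? placeholders for values."""
    if isinstance(pred, And):
        left_sql, left_params = to_sql(pred.left)
        right_sql, right_params = to_sql(pred.right)
        return f"({left_sql} AND {right_sql})", left_params + right_params
    _check(pred)
    return f"{pred.field} {pred.op} ?", [pred.value]


def to_elasticsearch(pred):
    """Return an Elasticsearch query DSL body for the same predicate."""
    if isinstance(pred, And):
        return {
            "bool": {
                "filter": [
                    to_elasticsearch(pred.left),
                    to_elasticsearch(pred.right),
                ]
            }
        }
    _check(pred)
    if pred.op == "=":
        return {"term": {pred.field: pred.value}}
    return {"range": {pred.field: {_ES_RANGE_OPS[pred.op]: pred.value}}}
```

Every operator added to the shared predicate language needs a translation in each backend function, which is where the lowest-common-denominator pressure comes from.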

Another option is to perform only the ingestion part and let everyone query the data store directly. But it has all sorts of complexities, especially around multi-tenant clusters. For example, the predicate parser adds the namespace to every query predicate automatically.
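To show why the automatic namespace filter is hard to give up, here is a minimal sketch of that scoping step. The function `scope_to_namespace` and the column name `namespace_id` are hypothetical, and the user predicate is assumed to be the already-validated output of a parser, not raw user input.

```python
def scope_to_namespace(user_predicate, namespace_id):
    """Return (where_clause, params) that always filters by namespace.

    user_predicate is assumed to be an already-validated SQL-like
    expression. It is wrapped in parentheses so that an OR inside it
    cannot escape the namespace condition.
    """
    clause = "namespace_id = ?"
    if user_predicate.strip():
        clause += f" AND ({user_predicate})"
    return clause, [namespace_id]
```

Without the parentheses, a predicate such as `status = 'Running' OR status = 'Failed'` would match failed workflows from every namespace. With direct access to the data store, each caller would have to get this right on their own.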

Victor_Zhou · February 13, 2025, 7:25pm

Thanks @maxim. Is there any update on this feature?

maxim · February 13, 2025, 7:59pm

So we removed Kafka and added support for SQL backends.

We didn’t make any major changes to the user-facing features.
