Kafka's role in Temporal

  1. Had another query. What exactly is the role of Kafka when setting up Temporal? From what I understand, it is used to publish messages to Elasticsearch. Is that flow being deprecated in favor of sending data directly to ES? Is there any other Kafka use case? Can I run with Elasticsearch enabled and Kafka disabled using helm?
    Secondly, if I specify an external Elasticsearch in my values.yaml while using helm, it still creates 3 ES pods. Is this expected behavior?
    Finally, I'm a little confused regarding config and wanted some guidance there.
    I want to use our internally hosted MySQL, ES and Kafka to start Temporal. Currently my values.yaml looks like this:
server:
  kafka:
    host: <kafka broker ip>
  config:
    persistence:
      default:
        driver: "sql"

        sql:
          driver: "mysql"
          host: _HOST_
          port: 3306
          database: temporal
          user: _USERNAME_
          password: _PASSWORD_

      visibility:
        driver: "sql"

        sql:
          driver: "mysql"
          host: _HOST_
          port: 3306
          database: temporal_visibility
          user: _USERNAME_
          password: _PASSWORD_

cassandra:
  enabled: false
kafka:
  enabled: true
mysql:
  enabled: true
elasticsearch:
  enabled: true
  external: true
  host: <host>
  scheme: http
  port: 9200
schema:
  setup:
    enabled: false
  update:
    enabled: false

Is the structure of this YAML correct for what I intend? How do I specify Kafka topic names? I understand there are 2 Kafka topics? One is probably for pushing JSON data to Elasticsearch; what is the other one for?
Thanks


Hi Dhruva, thank you for the question!


Had another query. What exactly is the role of Kafka when setting up Temporal? From what I understand, it is used to publish messages to Elasticsearch. Is that flow being deprecated in favor of sending data directly to ES? Is there any other Kafka use case? Can I run with Elasticsearch enabled and Kafka disabled using helm?

My understanding is that Kafka is only used for communicating with Elasticsearch (there is some talk of using it for cross-DC functionality, but that is still under consideration/experimental). And yes, we are looking into deprecating the use of Kafka for interfacing with Elasticsearch at some point in the future.

I don’t believe that Elasticsearch enabled + Kafka disabled is a supported scenario. You could probably “induce” that configuration by manipulating helm install parameters (--set kafka.enabled=false and --set elasticsearch.enabled=true), but I don’t expect this to actually work, and it is not a configuration we test.
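
For reference, the same two overrides could also be kept in a values file rather than passed as --set flags. This is only a sketch of the untested combination mentioned above, so it may well not work; the file name values.es-no-kafka.yaml is hypothetical.

```yaml
# values.es-no-kafka.yaml (hypothetical file name)
# Equivalent to: --set kafka.enabled=false --set elasticsearch.enabled=true
# This combination is not a tested configuration.
kafka:
  enabled: false
elasticsearch:
  enabled: true
```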


Secondly, if I specify an external Elasticsearch in my values.yaml while using helm, it still creates 3 ES pods. Is this expected behavior?

No, this is not what I would expect.

In one of our test pipelines, we install Kafka, Elasticsearch, and Cassandra separately, and then run this command to install Temporal:

helm install temporaltest \
  --set server.replicaCount=5 \
  --set server.kafka.host=kafkat-headless:9092 \
  --set grafana.enabled=false \
  --set kafka.enabled=false \
  --set prometheus.enabled=false \
  -f values/values.cassandra.yaml \
  -f values/values.elasticsearch.yaml \
  -f values/values.dynamic_config.yaml \
  . \
  --timeout 15m \
  --wait

(this is based on how we describe “bring your own X” scenarios in our readme).
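
If it is easier to keep overrides in a file, the --set flags from that command map onto values.yaml keys roughly as sketched here. The keys and values are taken directly from the flags; the file name values.byo.yaml is hypothetical, and the comments reflect my reading of what each flag does.

```yaml
# values.byo.yaml (hypothetical file name)
server:
  replicaCount: 5
  kafka:
    host: "kafkat-headless:9092"  # the separately installed Kafka
grafana:
  enabled: false
kafka:
  enabled: false                  # the chart's own Kafka is turned off
prometheus:
  enabled: false
```

Such a file would be passed with an extra -f values.byo.yaml alongside the other -f files in the command.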

In this configuration, I am not seeing the behavior you described. If you are still seeing the problem, can you share the helm install command line that causes it? I could then try to repro the problem in my setup.


I want to use our internally hosted MySQL, ES and Kafka to start Temporal. Currently my values.yaml looks like this:
…
Is the structure of this YAML correct for what I intend? How do I specify Kafka topic names? I understand there are 2 Kafka topics? One is probably for pushing JSON data to Elasticsearch; what is the other one for?

As far as I can tell, one of the two topics is for ES integration, and the other is for future cross-DC functionality, although I am not 100% sure about the details here. I don’t think there is currently a way to configure topic names via the Temporal helm chart. The command line I included above is what we use for running and testing Temporal with separately hosted Cassandra, ES and Kafka.

(In that specific pipeline, we install Kafka via helm install kafkat bitnami/kafka --wait --version 7.2.9, and Elasticsearch with helm install --version=7.6.2 --set persistence.enabled=false --set imageTag=6.8.8 elasticsearch elastic/elasticsearch --wait.)

And here is the command line that one of our test pipelines uses to point to a running MySQL instance:

helm install \
  --set elasticsearch.enabled=false \
  --set server.replicaCount=5 \
  -f values/values.mysql.yaml \
  --set server.config.persistence.default.sql.user=markmark \
  --set server.config.persistence.default.sql.password=Password123jk \
  --set server.config.persistence.visibility.sql.user=markmark \
  --set server.config.persistence.visibility.sql.password=Password123jk \
  --set server.config.persistence.default.sql.host=mysqldb.temporal.io \
  --set server.config.persistence.visibility.sql.host=mysqldb.temporal.io \
  --set server.config.persistence.default.sql.database=temporal_test1 \
  --set server.config.persistence.visibility.sql.database=temporal_visibility_test1 \
  . \
  --timeout 15m
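
The same MySQL overrides can be read as a values.yaml fragment, which may help when checking the structure of a hand-written file. This is a sketch derived only from the --set paths in the command; _USERNAME_ and _PASSWORD_ are placeholders, and the settings that come from values/values.mysql.yaml are not repeated here.

```yaml
elasticsearch:
  enabled: false
server:
  replicaCount: 5
  config:
    persistence:
      default:
        sql:
          host: mysqldb.temporal.io
          database: temporal_test1
          user: _USERNAME_        # placeholder
          password: _PASSWORD_    # placeholder
      visibility:
        sql:
          host: mysqldb.temporal.io
          database: temporal_visibility_test1
          user: _USERNAME_        # placeholder
          password: _PASSWORD_    # placeholder
```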

I hope this helps!

Thank you,
Mark.


Kafka is helpful as a flow control mechanism as well as for replay, so it would be nice to still have it available as an optional part of the flow to Elasticsearch, even if it is not required in the future (which would be a plus for simpler lower environments).

We are not planning to keep Kafka integration for ES ingestion. We believe that we can provide better availability and flow control without it.