How to manage a number of different workers running on different task queues? Is a CRD a suitable solution?

Background

  1. To control concurrency, lots of algorithm services each need to run on a separate task queue
  2. I’ve already implemented our own DSL to support the fusion of algorithmic flows
  3. In the go-sdk, a worker can only run on one task queue
  4. All services run on a k8s cluster
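To make point 3 concrete: since a Go worker polls exactly one task queue, isolating N algorithm services means starting N workers. Below is a minimal stdlib model of that fan-out; the `Worker` type and queue names are illustrative stand-ins for what `go.temporal.io/sdk/worker` provides, not the SDK itself.

```go
package main

import "fmt"

// Worker is a stand-in for an SDK worker: it polls exactly one task queue.
type Worker struct {
	TaskQueue string
	Handler   func(input string) string
}

// startWorkers builds one worker per task queue — the fan-out that
// otherwise means one configuration and deployment manifest per service.
func startWorkers(queues []string) map[string]*Worker {
	workers := map[string]*Worker{}
	for _, q := range queues {
		q := q // capture loop variable for the closure
		workers[q] = &Worker{
			TaskQueue: q,
			Handler:   func(in string) string { return q + " handled " + in },
		}
	}
	return workers
}

func main() {
	workers := startWorkers([]string{"algo0-queue", "algo1-queue"})
	fmt.Println(workers["algo0-queue"].Handler("req-1")) // algo0-queue handled req-1
	fmt.Println(len(workers), "workers running")         // 2 workers running
}
```

Each entry in the map corresponds to a worker process you would otherwise have to describe in its own Kubernetes manifest, which is exactly the management burden described below.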

Difficulty

  1. One algorithm service per task queue means starting lots of different workers. It’s too much trouble to manage a dozen configurations and deployment manifests
  2. Similar to the above point, many predefined DSL workflows also need to run on their own unique task queues

My design

Implement our own CRD controller in k8s.
Combine the workflow-related configuration and the DSL into a CRD.
Now, after we submit a DSL or activity CRD, the controller automatically creates a worker deployment running on the assigned task queue.

CRD template:

name: TestWorkflow
workflowOptions:
  taskqueue: test
  concurrency: 3
  replicas: 3
inputs:
  - name: userid
template:
  sequence:
    - activity:
        name: algo0
    - activity:
        name: algo1
        arguments:
          - userid
        result: output1
    - parallel:
        - activity:
            name: algo1
            arguments:
              - output1
            result: output2
        - activity:
            expect:
              op: and
              val:
                - key: output1
                  value: a
                - key: output2
                  value: b
            name: algo1
            arguments:
              - output1
              - output2

Question

  1. Does anyone have a similar problem? How did you solve it?
  2. Is a CRD a suitable solution? Do you have any suggestions?
  1. To control concurrency, lots of algorithm services each need to run on a separate task queue

Can you give more info on your use case and concurrency needs? What is the rate of execution and what are you trying to limit? A single worker can handle many concurrent workflow executions. Just trying to understand your needs/limitations better.

Most of the algorithm services run on GPUs. Take one for example:
a request can take anywhere from 100 milliseconds to 1 minute.
In the old message queue system, the concurrency was limited to 5.
If all services shared the same task queue, I couldn’t limit the concurrency of each service separately.

If you need to guarantee concurrency then use a single task queue.

What was the original reason for using multiple task queues in your case?

The original reason is to maximize the use of these GPU algorithm services.

I don’t know how to get there using a single task queue.

Would you provide more context? I cannot make any recommendation without understanding what “maximize the use of these GPU algorithm services” means.