Activity response streaming

I am using Temporal to implement an LLM-based agent. I have an activity which makes an LLM call. Within the activity, I can stream the text response from the LLM.

My goal is to show this streaming response in a custom UI.

My question is: how can I stream this response out of the activity? Either into the workflow state, so it can be returned via a Query (which I could call from a websocket handler outside Temporal), or directly to somewhere else so it can be shown in my UI (for example, could I connect a websocket directly to the activity)?

I’m new to Temporal, so I’m hoping someone can help me brainstorm the right architecture for this!

While workflow code needs to be written with various restrictions so that the workflow execution is deterministic, activities are free to make full use of all of your programming language features and libraries.

Thus you can have an activity which runs for the duration of the LLM call, and streams the response directly to your UI.
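To make that concrete, here is a minimal sketch in plain Python of what the body of such an activity might look like. Everything named here is an assumption for illustration: `fake_llm_stream` stands in for your LLM client's streaming call, and `CHANNELS` is an in-process stand-in for whatever actually reaches your UI (a websocket gateway, a pub/sub topic, and so on). In a real worker the function would be registered as an activity (for example with the Python SDK's `@activity.defn` decorator), and it would receive a serializable identifier such as `session_id` rather than a live connection.

```python
import asyncio
from typing import AsyncIterator, Dict, List, Tuple

# Hypothetical side channel keyed by session id. In a real deployment this
# would be a pub/sub topic or websocket gateway, not an in-process dict.
CHANNELS: Dict[str, asyncio.Queue] = {}


async def fake_llm_stream(prompt: str) -> AsyncIterator[str]:
    """Stand-in for an LLM client's streaming API (assumption)."""
    for chunk in ["Hello", ", ", "world", "!"]:
        await asyncio.sleep(0)  # yield control, as a network read would
        yield chunk


async def call_llm_and_stream(session_id: str, prompt: str) -> str:
    """Activity body: forward each chunk to the UI channel as it arrives,
    then return the complete text as the activity result."""
    channel = CHANNELS[session_id]
    parts: List[str] = []
    async for chunk in fake_llm_stream(prompt):
        await channel.put(chunk)  # goes straight to the UI side, not the workflow
        parts.append(chunk)
    return "".join(parts)


async def demo() -> Tuple[str, List[str]]:
    """Run the activity body once and collect what the UI side would see."""
    CHANNELS["session-1"] = asyncio.Queue()
    result = await call_llm_and_stream("session-1", "Say hello")
    seen: List[str] = []
    while not CHANNELS["session-1"].empty():
        seen.append(CHANNELS["session-1"].get_nowait())
    return result, seen


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

The point of the design is that the chunks never pass through the workflow: the workflow only starts the activity and receives its final return value.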

Conceptually, one way to think about workflows is that they’re good at making decisions and ensuring that activities happen in the presence of failures. “What do we need to do next? We need to make the LLM call and stream the response to the client.”

While Temporal supports millions of concurrently executing workflows, each individual workflow execution can only handle a limited number of events per second. Thus in general you don’t want to route voluminous data through the workflow if you don’t need to. (If there’s some portion of that data which the workflow needs to make its next decision, the activity can extract it for the workflow.)
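As a rough illustration of that last point, the activity could return a small summary to the workflow instead of the full streamed text. This is only a sketch under assumptions: the `LLMOutcome` type and the `TOOL:` line convention are hypothetical, invented here to show the shape of the idea, and your agent will have its own way of signalling what should happen next.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class LLMOutcome:
    """Small, serializable result for the workflow to decide on (hypothetical)."""
    requested_tool: Optional[str]
    char_count: int


def extract_outcome(chunks: Iterable[str]) -> LLMOutcome:
    """Reduce the streamed chunks to just what the workflow needs.

    Assumes, purely for illustration, that the model requests a tool by
    emitting a line of the form 'TOOL: <name>'.
    """
    text = "".join(chunks)  # join first, since a marker may span two chunks
    tool: Optional[str] = None
    for line in text.splitlines():
        if line.startswith("TOOL:"):
            tool = line[len("TOOL:"):].strip()
            break
    return LLMOutcome(requested_tool=tool, char_count=len(text))
```

The workflow then branches on something like `outcome.requested_tool`, while the full text has already gone to the UI through the side channel.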