The Lord of the Loop: in search of the best abstraction for LLM-driven Temporal applications with the simplest possible DX

asuworks · November 3, 2025, 11:33pm

All happy families are alike; each unhappy family is unhappy in its own way.
By changing your point of view, you can completely reduce all this complexity.
Code the happy path.

― Leo Tolstoy, Anna Karenina
― Maxim Fateev, Temporal NYC October meetup
― Temporal’s commandments, The definitive guide to Durable Execution, Temporal

Why am I posting this?

I have a cognitive itch wrapping my head around the usage of 3rd party Agent classes within Temporal context.
I think there is a dangerous mixing of concerns between current integrations of 3rd party Agents into Temporal’s workflow context and Temporal’s core abstractions (Workflow/Activity).

I would like to start a discussion to collectively figure out the best possible abstraction for LLM-driven Temporal applications with the simplest possible developer experience. And to publicly scratch my itch.

TL;DR

Since Temporal workflows must be deterministic, 3rd party Agent SDKs (calling external APIs) must be wrapped in an activity. This forces the developers to write code in which 3rd party Agents are running within a single Temporal activity. This goes against the broader purpose of the Agent abstraction, which is meant to support multiple tools, handoffs, interrupts, guardrails, and other complex behaviors.

In Temporal we already have very strong abstractions: workflow and activity.
In a hunt for the simplest DX, my suggestion is that Agent and Tools should be Temporal workflows.

Below, I provide some examples to support my claim.

Outline

Here’s how we’ll procede:

observe a great agent - jules.google (I am not affiliated with google)
propose some definitions for Agent and Tool based on the observations of jules and Temporal’s core abstractions (Workflow/Activity) in order to define our goal and to avoid “Agent vs Workflow” controversy prevalent in the ai dev scene.
briefly discuss existing attempts to marry Temporal and 3rd party agent SDKs:
1. Temporal’s OpenAIPlugin and community examples using it:
  1. that are not Agents in the sense of the definition proposed in step 2, but workflows that use LLMs.
  2. that dangerously use openai’s Agent abstraction as a thin wrapper for an LLM call (thus creating my cognitive itch)
2. PydanticAIPlugin - “adding durability” to the PydanticAI’s Agent abstraction
3. LangGraph Agent within a Temporal activity
state my problem with existing attempts
provide community examples of Temporal-native Agents
mention alternative paths for building agentic orchestrators
finish with some closing thoughts and open questions

Before we dive in, it’s worth noting that no abstraction is universally best. Abstractions are just reference points; some improve developer experience, while others make it harder. Here is a great image @maxim used in his talk Temporal NYC October meetup

Image sources:

Self-pity-excuse disclaimer:
I do not have much experience with OpenAI agents sdk or PydanticAI or any other agentic framework, and would highly appreciate your thoughts on how you use them (or don’t) in combination with Temporal.

1. Observation of a great agent

I like the interactive experience with Jules (I am not affiliated with google)

Jules is long-running
Jules is stateful
- Knowledge (memories)
- Environment
- Order of execution of tools is maintained (plan)
Jules is interactive and responsive:
- asks clarification questions when it gets stuck
- allows to inject new messages into running tools: while Jules is working you can send adjustments to it and it will self-correct.

What is great about Jules? (my fantasy about the implementation of Jules)

Agent state is durable
Agent tool execution is durable
Agent tools have arbitrary execution time → Agent loop has arbitrary execution time (multiple hours at least)
Agent tools can recieve new information during execution
There is HITL (maybe not at tool level, but at Agent level at least)

2. Some definitions: ReACT Loop, Agent, Tools, Workflow

ReACT (Reason and Act) Loop

# General logic, no LLMs here
while not goal_reached:
    observation = observe_environment(state)
    action = reason_and_decide(observation)
    result = act(action)
    update_state(result)
    observation = observe_environment(state)
	goal_reached = is_goal_reached(observation)

# with LLMs - same logic, just with LLM specific implementation:
while not goal_reached:
    observation = observe_environment(state) # might call an LLM
	action = reason_and_decide(observation) # calls LLM, returns action (list of tool calls) 
    result = act(action) # exectutes tool calls
    update_state(result) # use the result to make something useful (beautiful)
    observation = observe_environment(state) # might call an LLM
	goal_reached = is_goal_reached(observation) # might call an LLM

Tools are computer programs that do something useful (hopefully), or at least beautiful.

Agent is a stateful computer program with access to Tools (at least 1) and autonomy (exercised by an LLM call) to decide which tools to use or whether to use them at all when given a task.

Agent's state:
- instructions
- message history from current conversation/thread
- … anything else your Agent might need to do it’s thing: observe, reason_and_decide, act

Agents can receive input in form of a message (user’s query).

Agent's job is to address/solve/act upon the message by using it’s tools.

The decision which tools to use is offloaded to an LLM (autonomy) by making a simple completion call like this or this or this to an LLM with a prompt composed of:
- instructions
- previous tool’s output
- [optional] some form of agent’s state (context engineering)

An Agent with no tools or a single tool and tool_choice: "required" is not an Agent, but an LLM call.

Workflow - deterministic computer program = Temporal workflow (we’re in a Temporal forum…)

Activity - non-deterministic computer program = Temporal activity (just for completeness)

In this post, when I say workflow or activity without qualification, I mean Temporal workflow and Temporal activity (most of the time…)

3. Attempts to marry 3-rd party Agents into Temporal’s workflow context (“add durability”)

OpenAIPlugin

allows to use openai’s Agent class and execute temporal activities as tools or execute temporal workflows as MCP tools

python-samples/openai_agents using OpenAIPlugin

OpenAI Agent class is used mostly:
- as a thin wrapper for simple LLM call: Agent(tools=[only_one_tool], model_settings=ModelSettings(tool_choice="required")) (example)
- to call OpenAI-hosted tools like WebsearchTool(), etc..

other community “Agents”

Josh’s RepairAgentWorkflow
Steve’s interactive research agent

are not really Agents (according to definition above), but Workflows (that use LLMs) because:

“Agents” have exactly one tool (there is an exception though, but this is not the main point)
examples use ModelSettings(tool_choice="required")
there is no ReACT loop in any of them (technically, there is, but it is inside the OpenAI Agents SDK - this is the main point)

PydanticAIPlugin

import uuid

from temporalio import workflow
from temporalio.client import Client
from temporalio.worker import Worker

from pydantic_ai import Agent
from pydantic_ai.durable_exec.temporal import (
    AgentPlugin,
    PydanticAIPlugin,
    TemporalAgent,
)

agent = Agent(
    'gpt-5',
    instructions="You're an expert in geography.",
    name='geography',  
)

temporal_agent = TemporalAgent(agent)  


@workflow.defn
class GeographyWorkflow:  
    @workflow.run
    async def run(self, prompt: str) -> str:
        result = await temporal_agent.run(prompt)  
        return result.output


async def main():
    client = await Client.connect(  
        'localhost:7233',  
        plugins=[PydanticAIPlugin()],  
    )

    async with Worker(  
        client,
        task_queue='geography',
        workflows=[GeographyWorkflow],
        plugins=[AgentPlugin(temporal_agent)],  
    ):
        output = await client.execute_workflow(  
            GeographyWorkflow.run,
            args=['What is the capital of Mexico?'],
            id=f'geography-{uuid.uuid4()}',
            task_queue='geography',
        )
        print(output)
        #> Mexico City (Ciudad de México, CDMX)

Inheritance chain in Pydantic AI: TemporalAgent → WrapperAgent → AbstractAgent
Again, the Agent Loop is inside pydantic ai SDK.

LangGraph Agent within a Temporal Activity

Michael Toscano created a beautiful example of using the LangGraph Agent (Agent Loop) within a Temporal activity inside a Temporal workflow - AgentWorkflow
And again, the Agent Loop is inside LangGraph library.

4. Problem statement: Who is the Lord of the Loop?

PydanticAIPlugin, OpenAIPlugin allow to execute temporal activities as tools or temporal workflows as MCP servers, but the control of the Agent Loop lies within the libraries: pydantic-ai, agents (openai sdk).

Let’s take a step back and look again at Jules:

Observation:
Agent - is a process that is stateful and long-running in nature.

Un-confirmed claim (would be really nice to have):
Tools - are long-running, stateful processes (would be nice for a Tool to be able to accept new messages while it is running and decide on it’s own if it should cancel running tasks or append the new task to it’s internal todo list). Basically, what I mean by stateful Tools is Agent as Tool.

Example: a Jarvis-like Agent with a book_flight tool which would be able to workflow.sleep(flight.day-1) and then call check_in(flight). The Agent should be able to keep track of it’s running tools and forward messages to them.

For all Temporal users this should ring a bell:
Both Agent (Agent Loop) and Tools should be Temporal workflows.

Main postulate of a Temporal Workflow is that it is deterministic.
Is the Agent Loop deterministic?

The Agent Loop (ReACT Loop):

while not goal_reached:
    observation = observe_environment(state) # might call an LLM
	action = reason_and_decide(observation) # calls LLM, returns action (list of tool calls) 
    result = act(action) # exectutes tool calls
    update_state(result) # use the result to make something useful (beautiful)
    observation = observe_environment(state) # might call an LLM
	goal_reached = is_goal_reached(observation) # might call an LLM

As long as all methods in the loop are Temporal activities or Temporal workflows (child workflows), the Agent Loop is deterministic.

At this point, I would like to mention (again) Mike Toscano’s great example of using a LangGraph Agent as a Temporal activity inside a Temporal workflow - AgentWorkflow

The problem: When using 3rd party agents within a Temporal activity, developers are forced to make those agents extremely narrow in scope—with only a few tools available. In practice, this means using them mostly as thin wrappers around a simple LLM call. This goes against the broader purpose of the Agent abstraction, which is meant to support multiple tools, handoffs, interrupts, guardrails, and other complex behaviors.

While “keeping agents focused” sounds like a good design principle, it prevents more general “Agent-as-Tool” scenarios. To support those, Tools themselves should be implemented as Temporal workflows.

Possible solution:
If we want our Tools to be Temporal workflows, it would make sense to maintain parent-child relationship (between the Agent workflow and the Tool workflow).

If this is so, the Agent Loop must be inside the Agent workflow (within Temporal workflow, not in a 3rd party library).

Conclusion
Agent should be a Temporal workflow → AgentWorkflow.
Tools should be Temporal workflows (generally) → ToolWorkflow.
=> If we want functionality and flexibility (like Jules), Temporal should be the Lord of the Loop.

5. Temporal-native agents: steps into the Road

Examples of Agent Loop within Temporal workflow:

6. Alternative agent orchestrations

It’s probably worthwhile to look into existing agentic frameworks to learn patterns of multi-agent orchestration.
These patterns could be then used to create an AgentOrchestratorWorkflow - Temporal workflow (like rojak does it) that would coordinate the individual AgentWorkflows (Temporal workflows)

https://www.confluent.io/blog/event-driven-multi-agent-systems/

Here is an example of the Blackboard Pattern used by flock (they ditched Temporal in v0.5.0):

https://github.com/whiteducksoftware/flock/releases/tag/0.5.0

Probably, it would be nice to cite some ideas from the Anthropic’s engineering blog or some other smart places, but I hope someone can chime in.

7. Closing Thoughts

Instead of creating plugins for all possible agent frameworks out there, I think the Temporal AI team should honor their roots (core abstractions: workflow and activity) and proudly focus on creating their own agent library (naming suggestion: temporalio.contrib.lordoftheloop :)).

The library would abstract LLM calls (like the OpenAIPlugin does, but with litellm or any-llm) and maybe add some base classes or decorators to enhance DX for event emission, using event types that are emerging as standards (like ag-ui or something else) and would allow for deeply nested agents like deepagents. MCP and A2A wrappers would be also great.

Can you, better developers than me, think of something good, please?

Thanks for being witnesses of my public itch scratching.

Topic		Replies	Views
Q&A from Deep-Dive: AI Agent Code Walkthrough webinar Developer Corner	1	544	May 6, 2025
Agent Builder layer over Temporal Community Support	0	24	January 15, 2026
Q&A from Learn How to Build AI Agents with Temporal webinar Developer Corner ai	1	850	April 7, 2025
Webinar: Deep-Dive: AI Agent Code Walkthrough with Temporal Announcements webinar	3	231	May 6, 2025
Webinar: Learn How to Build AI Agents with Temporal Announcements webinar , ai	3	317	April 7, 2025