All happy families are alike; each unhappy family is unhappy in its own way.
― Leo Tolstoy, Anna Karenina
By changing your point of view, you can completely reduce all this complexity.
― Maxim Fateev, Temporal NYC October meetup
Code the happy path.
― Temporal’s commandments, The definitive guide to Durable Execution, Temporal
Why am I posting this?
I have a cognitive itch wrapping my head around the usage of 3rd party Agent classes within Temporal context.
I think there is a dangerous mixing of concerns between current integrations of 3rd party Agents into Temporal’s workflow context and Temporal’s core abstractions (Workflow/Activity).
I would like to start a discussion to collectively figure out the best possible abstraction for LLM-driven Temporal applications with the simplest possible developer experience. And to publicly scratch my itch.
TL;DR
Since Temporal workflows must be deterministic, 3rd party Agent SDKs (calling external APIs) must be wrapped in an activity. This forces the developers to write code in which 3rd party Agents are running within a single Temporal activity. This goes against the broader purpose of the Agent abstraction, which is meant to support multiple tools, handoffs, interrupts, guardrails, and other complex behaviors.
In Temporal we already have very strong abstractions: workflow and activity.
In a hunt for the simplest DX, my suggestion is that Agent and Tools should be Temporal workflows.
Below, I provide some examples to support my claim.
Outline
Here’s how we’ll proceed:
- observe a great agent - jules.google (I am not affiliated with google)
- propose some definitions for Agent and Tool based on the observations of Jules and Temporal’s core abstractions (Workflow/Activity), in order to define our goal and to avoid the “Agent vs Workflow” controversy prevalent in the AI dev scene.
- briefly discuss existing attempts to marry Temporal and 3rd party agent SDKs:
  - Temporal’s OpenAIPlugin and community examples using it:
    1. that are not Agents in the sense of the definition proposed in step 2, but workflows that use LLMs.
    2. that dangerously use OpenAI’s Agent abstraction as a thin wrapper for an LLM call (thus creating my cognitive itch)
  - PydanticAIPlugin - “adding durability” to PydanticAI’s Agent abstraction
  - LangGraph Agent within a Temporal activity
- state my problem with existing attempts
- provide community examples of Temporal-native Agents
- mention alternative paths for building agentic orchestrators
- finish with some closing thoughts and open questions
Before we dive in, it’s worth noting that no abstraction is universally best. Abstractions are just reference points; some improve developer experience, while others make it harder. Here is a great image @maxim used in his talk at the Temporal NYC October meetup.
Self-pity-excuse disclaimer:
I do not have much experience with OpenAI agents sdk or PydanticAI or any other agentic framework, and would highly appreciate your thoughts on how you use them (or don’t) in combination with Temporal.
1. Observation of a great agent
I like the interactive experience with Jules (I am not affiliated with google)
- Jules is long-running
- Jules is stateful
  - Knowledge (memories)
  - Environment
  - Order of execution of tools is maintained (plan)
- Jules is interactive and responsive:
  - asks clarification questions when it gets stuck
  - lets you inject new messages into running tools: while Jules is working, you can send it adjustments and it will self-correct.
What is great about Jules? (my fantasy about the implementation of Jules)
- Agent state is durable
- Agent tool execution is durable
- Agent tools have arbitrary execution time → Agent loop has arbitrary execution time (multiple hours at least)
- Agent tools can receive new information during execution
- There is HITL (maybe not at tool level, but at Agent level at least)
2. Some definitions: ReACT Loop, Agent, Tools, Workflow
ReACT (Reason and Act) Loop
# General logic, no LLMs here
while not goal_reached:
    observation = observe_environment(state)
    action = reason_and_decide(observation)
    result = act(action)
    update_state(result)
    observation = observe_environment(state)
    goal_reached = is_goal_reached(observation)

# with LLMs - same logic, just with LLM specific implementation:
while not goal_reached:
    observation = observe_environment(state)     # might call an LLM
    action = reason_and_decide(observation)      # calls LLM, returns action (list of tool calls)
    result = act(action)                         # executes tool calls
    update_state(result)                         # use the result to make something useful (beautiful)
    observation = observe_environment(state)     # might call an LLM
    goal_reached = is_goal_reached(observation)  # might call an LLM
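To make the loop above concrete, here is a minimal runnable sketch with no LLM and no Temporal involved. Everything in it is an assumption made for illustration: the `State` class, the two toy tools (`add_one`, `double`), and the rule-based `reason_and_decide`, which stands in for the LLM call. The function names follow the pseudocode; `max_steps` is an added safety cap.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class State:
    target: int                                   # the goal: reach this number
    value: int = 0                                # the "environment"
    log: list[str] = field(default_factory=list)  # executed tool calls, in order


# Hypothetical tools: plain functions from int to int.
TOOLS: dict[str, Callable[[int], int]] = {
    "add_one": lambda v: v + 1,
    "double": lambda v: v * 2,
}


def observe_environment(state: State) -> dict:
    return {"value": state.value, "target": state.target}


def reason_and_decide(observation: dict) -> list[str]:
    # Stand-in for the LLM call: returns a list of tool calls.
    if observation["value"] > 0 and observation["value"] * 2 <= observation["target"]:
        return ["double"]
    return ["add_one"]


def act(action: list[str], state: State) -> int:
    value = state.value
    for tool_name in action:
        value = TOOLS[tool_name](value)
    return value


def update_state(state: State, action: list[str], result: int) -> None:
    state.value = result
    state.log.extend(action)


def is_goal_reached(observation: dict) -> bool:
    return observation["value"] >= observation["target"]


def run_agent(target: int, max_steps: int = 50) -> State:
    state = State(target=target)
    observation = observe_environment(state)
    goal_reached = is_goal_reached(observation)
    steps = 0
    while not goal_reached and steps < max_steps:
        action = reason_and_decide(observation)
        result = act(action, state)
        update_state(state, action, result)
        observation = observe_environment(state)
        goal_reached = is_goal_reached(observation)
        steps += 1
    return state
```

Swapping the body of `reason_and_decide` for a real completion call would not change the shape of the loop, which is the point of the pseudocode.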
Tools are computer programs that do something useful (hopefully), or at least beautiful.
Agent is a stateful computer program with access to Tools (at least 1) and autonomy (exercised by an LLM call) to decide which tools to use, or whether to use them at all, when given a task.
Agent’s state:
- instructions
- message history from current conversation/thread
- … anything else your Agent might need to do its thing: observe, reason_and_decide, act
Agents can receive input in the form of a message (user’s query). The Agent’s job is to address/solve/act upon the message by using its tools.
The decision which tools to use is offloaded to an LLM (autonomy) by making a simple completion call to an LLM with a prompt composed of:
- instructions
- previous tool’s output
- [optional] some form of agent’s state (context engineering)
An Agent with no tools, or with a single tool and tool_choice: "required", is not an Agent, but an LLM call.
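The definition above can be written down as a small data structure plus a predicate. This is only a sketch of the rule as stated in this post: `AgentSpec` and `has_autonomy` are hypothetical names, and the `tool_choice` values mirror the ones commonly found in chat-completion APIs.

```python
from dataclasses import dataclass, field


@dataclass
class AgentSpec:
    # The Agent's state, as listed above.
    instructions: str
    tools: list[str] = field(default_factory=list)
    tool_choice: str = "auto"  # e.g. "auto" or "required"
    messages: list[dict] = field(default_factory=list)  # conversation history


def has_autonomy(spec: AgentSpec) -> bool:
    """True if the LLM actually has something to decide about tool usage."""
    if not spec.tools:
        return False  # no tools: just an LLM call
    if len(spec.tools) == 1 and spec.tool_choice == "required":
        return False  # the only possible decision is forced
    return True
```

Under this rule, a single-tool spec with `tool_choice="required"` is classified as an LLM call, while the same spec with `tool_choice="auto"` counts as an Agent because the model may still decide not to call the tool.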
Workflow - deterministic computer program = Temporal workflow (we’re in a Temporal forum…)
Activity - non-deterministic computer program = Temporal activity (just for completeness)
In this post, when I say workflow or activity without qualification, I mean Temporal workflow and Temporal activity (most of the time…)
3. Attempts to marry 3rd party Agents into Temporal’s workflow context (“add durability”)
OpenAIPlugin
- lets you use OpenAI’s Agent class and execute Temporal activities as tools or Temporal workflows as MCP tools
python-samples/openai_agents using OpenAIPlugin
- OpenAI Agent class is used mostly:
  - as a thin wrapper for a simple LLM call: Agent(tools=[only_one_tool], model_settings=ModelSettings(tool_choice="required"))
  - to call OpenAI-hosted tools like WebSearchTool(), etc.
other community “Agents”
- Josh’s RepairAgentWorkflow
- Steve’s interactive research agent
are not really Agents (according to definition above), but Workflows (that use LLMs) because:
- “Agents” have exactly one tool (there is an exception though, but this is not the main point)
- examples use ModelSettings(tool_choice="required")
- there is no ReACT loop in any of them (technically, there is, but it is inside the OpenAI Agents SDK - this is the main point)
PydanticAIPlugin
import uuid

from temporalio import workflow
from temporalio.client import Client
from temporalio.worker import Worker

from pydantic_ai import Agent
from pydantic_ai.durable_exec.temporal import (
    AgentPlugin,
    PydanticAIPlugin,
    TemporalAgent,
)

agent = Agent(
    'gpt-5',
    instructions="You're an expert in geography.",
    name='geography',
)

temporal_agent = TemporalAgent(agent)


@workflow.defn
class GeographyWorkflow:
    @workflow.run
    async def run(self, prompt: str) -> str:
        result = await temporal_agent.run(prompt)
        return result.output


async def main():
    client = await Client.connect(
        'localhost:7233',
        plugins=[PydanticAIPlugin()],
    )

    async with Worker(
        client,
        task_queue='geography',
        workflows=[GeographyWorkflow],
        plugins=[AgentPlugin(temporal_agent)],
    ):
        output = await client.execute_workflow(
            GeographyWorkflow.run,
            args=['What is the capital of Mexico?'],
            id=f'geography-{uuid.uuid4()}',
            task_queue='geography',
        )
        print(output)
        #> Mexico City (Ciudad de México, CDMX)
Inheritance chain in Pydantic AI: TemporalAgent → WrapperAgent → AbstractAgent
Again, the Agent Loop is inside the PydanticAI SDK.
LangGraph Agent within a Temporal Activity
- Michael Toscano created a beautiful example of using the LangGraph Agent (Agent Loop) within a Temporal activity inside a Temporal workflow - AgentWorkflow
And again, the Agent Loop is inside the LangGraph library.
4. Problem statement: Who is the Lord of the Loop?
PydanticAIPlugin and OpenAIPlugin let you execute Temporal activities as tools or Temporal workflows as MCP servers, but control of the Agent Loop lies within the libraries: pydantic-ai and agents (the OpenAI SDK).
Let’s take a step back and look again at Jules:
Observation:
Agent - is a process that is stateful and long-running in nature.
Un-confirmed claim (would be really nice to have):
Tools - are long-running, stateful processes (it would be nice for a Tool to be able to accept new messages while it is running and decide on its own whether to cancel running tasks or append the new task to its internal todo list). Basically, what I mean by stateful Tools is Agent as Tool.
Example: a Jarvis-like Agent with a book_flight tool which would be able to workflow.sleep(flight.day-1) and then call check_in(flight). The Agent should be able to keep track of its running tools and forward messages to them.
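The "Agent keeps track of its running tools and forwards messages to them" idea can be modelled in plain asyncio, without Temporal, just to show the shape. This is a toy sketch under loud assumptions: `ToolProcess`, `Agent`, `forward`, and `join` are hypothetical names, nothing here is durable, and a message that arrives after the tool has emptied its todo list is simply lost. In a Temporal implementation the inbox would roughly correspond to a signal handler on the Tool workflow, and `asyncio.sleep(0)` to real work or a durable timer.

```python
import asyncio


class ToolProcess:
    """Stand-in for a Tool implemented as a long-running, stateful process."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.inbox: asyncio.Queue[str] = asyncio.Queue()
        self.todo: list[str] = []
        self.done: list[str] = []

    async def run(self, first_task: str) -> list[str]:
        self.todo.append(first_task)
        while self.todo:
            # Pick up messages that arrived while the tool was busy
            # and append them to the internal todo list.
            while not self.inbox.empty():
                self.todo.append(self.inbox.get_nowait())
            task = self.todo.pop(0)
            await asyncio.sleep(0)  # placeholder for the actual work
            self.done.append(task)
        return self.done


class Agent:
    """Keeps track of running tools and forwards messages to them."""

    def __init__(self) -> None:
        self.running: dict[str, tuple[ToolProcess, asyncio.Task]] = {}

    def start_tool(self, tool: ToolProcess, first_task: str) -> None:
        self.running[tool.name] = (tool, asyncio.create_task(tool.run(first_task)))

    def forward(self, tool_name: str, message: str) -> None:
        tool, _ = self.running[tool_name]
        tool.inbox.put_nowait(message)

    async def join(self, tool_name: str) -> list[str]:
        _, task = self.running.pop(tool_name)
        return await task


async def demo() -> list[str]:
    agent = Agent()
    agent.start_tool(ToolProcess("book_flight"), "book flight")
    agent.forward("book_flight", "check in one day before departure")
    return await agent.join("book_flight")
```

Running `asyncio.run(demo())` returns both tasks in the order they were received, which is the behaviour wished for above: the tool accepted a new message while already running and added it to its own todo list.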
For all Temporal users this should ring a bell:
Both Agent (Agent Loop) and Tools should be Temporal workflows.
The main postulate of a Temporal Workflow is that it must be deterministic.
Is the Agent Loop deterministic?
The Agent Loop (ReACT Loop):
while not goal_reached:
    observation = observe_environment(state)     # might call an LLM
    action = reason_and_decide(observation)      # calls LLM, returns action (list of tool calls)
    result = act(action)                         # executes tool calls
    update_state(result)                         # use the result to make something useful (beautiful)
    observation = observe_environment(state)     # might call an LLM
    goal_reached = is_goal_reached(observation)  # might call an LLM
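As long as every call in the loop is a Temporal activity or a Temporal workflow (child workflow), the Agent Loop is deterministic: the non-deterministic results (LLM responses, tool outputs) are recorded in the workflow’s event history, so on replay the loop sees the same values and takes the same branches.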
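Here is a toy model of that argument, in plain Python and deliberately not using the Temporal SDK. The `History` class is a hypothetical stand-in for a workflow's event history: on the first run it executes each step and records the result, and on replay it returns the recorded result instead of executing again. `flaky_llm_decide` is intentionally random to play the role of an LLM.

```python
import random
from typing import Any, Callable, Optional


class History:
    """Toy event history: results of side-effecting steps, in call order."""

    def __init__(self, recorded: Optional[list] = None) -> None:
        self.events: list[Any] = list(recorded or [])
        self.cursor = 0
        self.live_calls = 0  # how many steps were actually executed

    def step(self, fn: Callable[..., Any], *args: Any) -> Any:
        if self.cursor < len(self.events):
            result = self.events[self.cursor]  # replay: reuse recorded result
        else:
            result = fn(*args)                 # first run: execute and record
            self.live_calls += 1
            self.events.append(result)
        self.cursor += 1
        return result


def flaky_llm_decide(value: int) -> str:
    # Non-deterministic on purpose, like an LLM call.
    return random.choice(["add_one", "add_two"])


def apply_tool(name: str, value: int) -> int:
    return value + (1 if name == "add_one" else 2)


def agent_loop(history: History, target: int) -> int:
    # The loop itself contains no randomness, I/O, or clocks;
    # every non-deterministic step goes through history.step().
    value = 0
    while value < target:
        action = history.step(flaky_llm_decide, value)
        value = history.step(apply_tool, action, value)
    return value
```

Running `agent_loop` once with a fresh `History`, and then again with a `History` seeded from the first run's `events`, yields the same result with zero live calls on the second run, even though the "LLM" is random. The real mechanism in Temporal is more involved, but this is the property the claim above relies on.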
At this point, I would like to mention (again) Mike Toscano’s great example of using a LangGraph Agent as a Temporal activity inside a Temporal workflow - AgentWorkflow
The problem: When using 3rd party agents within a Temporal activity, developers are forced to make those agents extremely narrow in scope, with only a few tools available. In practice, this means using them mostly as thin wrappers around a simple LLM call. This goes against the broader purpose of the Agent abstraction, which is meant to support multiple tools, handoffs, interrupts, guardrails, and other complex behaviors.
While “keeping agents focused” sounds like a good design principle, it prevents more general “Agent-as-Tool” scenarios. To support those, Tools themselves should be implemented as Temporal workflows.
Possible solution:
If we want our Tools to be Temporal workflows, it would make sense to maintain parent-child relationship (between the Agent workflow and the Tool workflow).
If this is so, the Agent Loop must be inside the Agent workflow (within Temporal workflow, not in a 3rd party library).
Conclusion
Agent should be a Temporal workflow → AgentWorkflow.
Tools should be Temporal workflows (generally) → ToolWorkflow.
=> If we want functionality and flexibility (like Jules), Temporal should be the Lord of the Loop.
5. Temporal-native agents: steps into the Road
Examples of Agent Loop within Temporal workflow:
- GitHub - Frederic-Zhou/temporal_agent_workflow
- GitHub - StreetLamb/rojak: Python library for building durable and scalable multi-agent orchestrations.
- GitHub - temporal-community/temporal-ai-agent: This demo shows a multi-turn conversation with an AI agent running inside a Temporal workflow.
6. Alternative agent orchestrations
It’s probably worthwhile to look into existing agentic frameworks to learn patterns of multi-agent orchestration.
These patterns could then be used to create an AgentOrchestratorWorkflow - a Temporal workflow (like rojak does it) that would coordinate the individual AgentWorkflows (Temporal workflows).
https://www.confluent.io/blog/event-driven-multi-agent-systems/
Here is an example of the Blackboard Pattern used by flock (they ditched Temporal in v0.5.0):
Probably, it would be nice to cite some ideas from the Anthropic’s engineering blog or some other smart places, but I hope someone can chime in.
7. Closing Thoughts
Instead of creating plugins for all possible agent frameworks out there, I think the Temporal AI team should honor their roots (core abstractions: workflow and activity) and proudly focus on creating their own agent library (naming suggestion: temporalio.contrib.lordoftheloop :)).
The library would abstract LLM calls (like the OpenAIPlugin does, but with litellm or any-llm) and maybe add some base classes or decorators to enhance DX for event emission, using event types that are emerging as standards (like ag-ui or something else). It would also allow for deeply nested agents like deepagents. MCP and A2A wrappers would also be great.
Can you, better developers than me, think of something good, please?
Thanks for being witnesses of my public itch scratching.
