I have a workflow that pauses while waiting for an asynchronous downstream operation to complete. The downstream system publishes an event to Kafka once the operation is done, and my service listens to this event and triggers a webhook to send a signal to resume the Temporal workflow.
However, there are cases where these downstream events might be lost or not delivered, causing the workflow to remain paused indefinitely and eventually time out.
To handle this, I’m exploring the idea of adding a polling fallback mechanism — so that when the workflow is reset or retried, it can poll the downstream system to verify if the operation has already completed and then resume accordingly.
I wanted to check:
Does Temporal provide any built-in support or recommended pattern for handling such scenarios where external signals might be lost?
If not, what would be the best practice to implement this kind of recovery or polling mechanism within a Temporal workflow?
@maxim I’m planning to implement periodic polling that runs every hour. My idea is to create a child workflow responsible for polling and notifying the parent workflow once it completes (as provided in samples-java repo), based on the downstream state. Do you see any potential concerns or drawbacks with this approach?
The issue: childWorkflow.exec(...) blocks the parent until the child completes, so the parent can’t concurrently react to signals while the child is polling.
Question: what are the recommended patterns in Temporal to run the polling child and wait for a signal in parallel? For example, should I invoke the child asynchronously, have the child send a signal back to the parent on completion, convert the poller into an activity, use promises/async APIs, or something else? Any example snippets, pitfalls, or best practices would be really helpful.
Thanks @antonio.perez. I’m implementing periodic polling as a child workflow, which will be triggered by the parent workflow.
In the official Temporal Java samples, the polling logic is implemented directly inside the workflow code. However, that is not good practice; see “Polling in workflow vs. Activity? - #2 by maxim”.
Since in my design the polling runs inside a child workflow, and the parent workflow only records the start and end events, am I correct in assuming that this approach won’t negatively impact the parent workflow’s history size or performance?
In the official Temporal Java samples, the polling logic is implemented directly inside the workflow code
Could you show me where? Note that this is relying on activity retries, not calling an activity in a loop.
am I correct in assuming that this approach won’t negatively impact the parent workflow’s history size or performance?
It will add three events to the workflow history (StartChildWorkflowExecutionInitiated/Failed.., ChildWorkflowExecutionStarted, ChildWorkflowExecutionCompleted/Failed/Cancelled).
Also, let’s say my parent workflow runs for 2-3 days: will the child workflow’s history affect it if I’m using the child to poll every hour?
Sorry, does this answer your question or is this a different one?
It will add three events to the workflow history (StartChildWorkflowExecutionInitiated/Failed.., ChildWorkflowExecutionStarted, ChildWorkflowExecutionCompleted/Failed/Cancelled).
I have the feeling that you don’t need a child workflow; infrequent polling should do what you need.
@antonio.perez For infrequent polling, I’ll need to start a new workflow as well, right? I was thinking of using a child workflow for this.
Can this be achieved without creating a new workflow? Are you suggesting that I should call an activity directly from my parent workflow instead? IMO this will also add events like ActivityTaskScheduled/Completed per retry.
Promise<Boolean> result = Async.function(pollHandlerActivity::poll, req);
Workflow.await(() -> isPaused || result.get());
As I understand it, the activity will continue executing asynchronously in the background (including any retries, as per its retry policy). Meanwhile, the workflow will remain paused at the Workflow.await() line until either the activity completes (result.get() == true) or the isPaused flag becomes true.
As I see it, because the predicate calls result.get(), evaluating it will block until the activity completes. Is there a better way to do this?
I think result.get() is not what you want here, because get() is a blocking call: it will wait until the activity has completed. I think what you want here is result.isCompleted(), which will tell you whether the activity has completed or not.
@awwx When I use result.isCompleted(), if it evaluates to true, that could also mean the activity completed with an exception, not just successfully. In that case, my Workflow.await() condition would still evaluate to true, even though the activity actually failed — which isn’t what I want. So this approach wouldn’t work correctly for my use case.
Now the activity will only complete if the poll determines that the operation has completed, and the activity returns normally without throwing an exception.
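A minimal sketch of an activity that behaves this way, written as plain Java so it compiles without the Temporal SDK. The names `DownstreamStatus`, `OperationNotCompleteException`, `PollHandlerActivityImpl` and the `statusLookup` supplier are hypothetical stand-ins, not part of any Temporal API; the assumption is that throwing from the activity lets its retry policy schedule the next poll.

```java
import java.util.function.Supplier;

// Hypothetical status values reported by the downstream system.
enum DownstreamStatus { PENDING, DONE }

// Hypothetical exception meaning "not finished yet, poll again later".
class OperationNotCompleteException extends RuntimeException {
    OperationNotCompleteException(String message) {
        super(message);
    }
}

// Plain-Java stand-in for the activity implementation. In a real worker this
// class would implement an interface annotated as a Temporal activity.
class PollHandlerActivityImpl {
    // Hypothetical downstream lookup; a real one would call the remote system.
    private final Supplier<DownstreamStatus> statusLookup;

    PollHandlerActivityImpl(Supplier<DownstreamStatus> statusLookup) {
        this.statusLookup = statusLookup;
    }

    // Returns normally only when the operation is done. Any other status is
    // reported by throwing, so the caller's retry mechanism polls again.
    boolean poll(String requestId) {
        DownstreamStatus status = statusLookup.get();
        if (status != DownstreamStatus.DONE) {
            throw new OperationNotCompleteException(
                "operation " + requestId + " is still " + status);
        }
        return true;
    }
}
```

With this shape, the polling interval would come from the activity's retry options (for example a one-hour initial interval with a backoff coefficient of 1); that configuration is an assumption and is not shown in this thread. As far as I know, retry attempts of a single activity are tracked by the server and are not each written as separate ActivityTaskScheduled/Completed events in the workflow history.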
If you don’t want to use infinite retries and so need to check whether the activity completed normally or with an exception, I think you’d want to do something like wait for isCompleted() and then check the result; I think just calling get() is going to block your await.
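The "wait for isCompleted(), then check the result" idea can be sketched as two small helpers. To keep the example compilable without the Temporal SDK, `PromiseView` is a hypothetical stand-in that mirrors only the `isCompleted()`, `getFailure()` and `get()` methods of Temporal's `Promise`; `ResumeCondition` and `Outcome` are likewise illustrative names.

```java
// Hypothetical stand-in for the three Promise methods discussed above.
interface PromiseView<T> {
    boolean isCompleted();          // true once finished, successfully or not
    RuntimeException getFailure();  // after completion: the failure, or null
    T get();                        // blocks until finished; rethrows a failure
}

// What happened by the time the wait ended.
enum Outcome { SIGNAL_RECEIVED, OPERATION_DONE, POLL_FAILED }

final class ResumeCondition {
    private ResumeCondition() {}

    // Non-blocking wait condition: never calls get(), so it is safe to
    // evaluate repeatedly while the poll is still running.
    static boolean ready(boolean isPaused, PromiseView<Boolean> result) {
        return isPaused || result.isCompleted();
    }

    // Call only after ready(...) has returned true. Distinguishes a signal,
    // a successful poll, and a poll that ended with an exception.
    static Outcome classify(boolean isPaused, PromiseView<Boolean> result) {
        if (isPaused) {
            return Outcome.SIGNAL_RECEIVED;
        }
        if (result.getFailure() != null) {
            return Outcome.POLL_FAILED;
        }
        // The promise has completed without failure, so get() returns at once.
        return Boolean.TRUE.equals(result.get())
            ? Outcome.OPERATION_DONE
            : Outcome.POLL_FAILED;
    }
}
```

In workflow code the equivalent would presumably be `Workflow.await(() -> isPaused || result.isCompleted());` followed by a check of `result.getFailure()` before calling `result.get()`. Treating a `false` result as `POLL_FAILED` is a choice made for this sketch, not something stated in the thread.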