Don't retry Activity after Start-To-Close Timeout

doug · October 16, 2023, 9:03pm

Is there a way to prevent Start-To-Close Timeout failures from being retried?

We have a resource-intensive Activity and we don’t want it to run indefinitely. We’ve configured a Start-To-Close Timeout, and also coded our Activity Worker to halt if the Timeout is exceeded. This works as expected; the Activity is halted, but it’s also queued for retry. These retries are undesirable as they tend to produce the same outcome (i.e. they also time out, which leads to retry after retry), wasting resources. Although we want retries in general, we would like to avoid retrying in this particular case.

I’m aware that Retry Policies can have Non-Retryable Errors, but that doesn’t appear to work for Start-To-Close Timeouts, unless I’m missing something?

Are there other solutions we could consider?

Thank you!

maxim · October 17, 2023, 4:50pm

Set activity RetryOptions.maximumAttempts to 1 to disable retries.
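
To make the effect of that setting concrete, here is a minimal pure-Python model of an attempt limit. It is a sketch, not Temporal code: `run_with_attempts` and `always_times_out` are hypothetical names invented for illustration. In the Python SDK, the corresponding setting would be `RetryPolicy(maximum_attempts=1)` passed as `retry_policy` to `workflow.execute_activity`.

```python
from typing import Callable


def run_with_attempts(task: Callable[[], str], maximum_attempts: int) -> str:
    """Call task until it succeeds or maximum_attempts calls have been made.

    Hypothetical helper: a simplified model of an attempt limit, not SDK code.
    """
    if maximum_attempts < 1:
        raise ValueError("this sketch requires maximum_attempts >= 1")
    for attempt in range(1, maximum_attempts + 1):
        try:
            return task()
        except Exception:
            # Give up on the last allowed attempt; otherwise try again.
            if attempt == maximum_attempts:
                raise
    raise AssertionError("unreachable")


calls = []


def always_times_out() -> str:
    """Simulate an activity that exceeds its time limit on every attempt."""
    calls.append("attempt")
    raise TimeoutError("simulated start-to-close timeout")


try:
    run_with_attempts(always_times_out, maximum_attempts=1)
except TimeoutError:
    pass

print(len(calls))  # prints 1: the task ran once and was not retried
```

With a limit of 1, the first failure is final. The trade-off is that this disables retries for every kind of failure, not only timeouts.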

Suresh · June 6, 2024, 1:49pm

@maxim I'm wondering whether there is a way to retry on certain exceptions and not on others?

Example: my activity (Python, async) does long polling, talking to external services periodically. I do want to retry on network failures in those external calls, but the start-to-close timeout should be honored and the activity should be stopped.

What’s the best way to handle this scenario?

maxim · June 6, 2024, 6:01pm

You can specify which exceptions should not be retried through RetryPolicy.non_retryable_error_types. Or you can throw a non-retryable ApplicationFailure from the activity code (in the Python SDK, ApplicationError with non_retryable=True). A StartToClose timeout is always retried, up to maxAttempts or the ScheduleToClose timeout. I don’t understand the use case for disabling retries for the StartToClose timeout.
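
One possible way to combine these points, sketched under assumptions: the activity tracks its own time budget (set somewhat below the start-to-close timeout) and raises a non-retryable error when the budget runs out, while other errors propagate unchanged and stay retryable. The sketch below is pure Python so it runs without a Temporal server. `NonRetryableError`, `poll_until_ready`, and the parameter names are hypothetical; in a real activity the raise would be `ApplicationError(..., non_retryable=True)` from `temporalio.exceptions`, and the loop would still call `activity.heartbeat()`.

```python
import asyncio
import time
from typing import Awaitable, Callable


class NonRetryableError(Exception):
    """Hypothetical stand-in for ApplicationError(..., non_retryable=True)."""


async def poll_until_ready(
    check: Callable[[], Awaitable[bool]],
    budget_seconds: float,
    interval_seconds: float,
) -> dict:
    """Poll check() until it returns True or the time budget is used up.

    The budget is only checked between polls, so a single check() call that
    hangs is not interrupted by this sketch.
    """
    deadline = time.monotonic() + budget_seconds
    while True:
        # Errors raised by check() (e.g. ConnectionError) propagate unchanged,
        # so they remain eligible for retry.
        if await check():
            return {"is_ready": True}
        # Stop if the next sleep would take us past the deadline.
        if time.monotonic() + interval_seconds > deadline:
            raise NonRetryableError("time budget exhausted; do not retry")
        await asyncio.sleep(interval_seconds)
```

Because the activity fails itself before the server-side timeout fires, the failure is an application error rather than a StartToClose timeout, which is what lets the non-retryable flag take effect.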

Suresh · June 7, 2024, 5:03am

I probably didn’t make things clear. I have something like the following:

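import asyncio
from datetime import timedelta

from temporalio import activity, workflow

# Inside the workflow's run method:
await workflow.execute_activity(
    long_poll_for_some_external_resource,
    "1234",
    schedule_to_close_timeout=timedelta(minutes=120),
    start_to_close_timeout=timedelta(minutes=120),
    heartbeat_timeout=timedelta(seconds=30),
)


@activity.defn
async def long_poll_for_some_external_resource(obj_id: str) -> dict:
    while True:
        # Heartbeat on every iteration so a stalled worker is noticed
        # within heartbeat_timeout.
        activity.heartbeat()
        ready = await check_in_external_service_readiness_for(obj_id)
        if ready:
            return {"is_ready": True}
        # Poll every 5 seconds; the loop only exits by returning or raising.
        await asyncio.sleep(5)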

The intention is that the long poll can take anywhere between 5 minutes and 2 hours, so I want to wait a maximum of 2 hours, polling every 5 seconds. I want the activity to be restarted automatically in case of network failures (there are a few network calls to a few other services, which I removed above for brevity), but it should honor the schedule_to_close_timeout etc.

Does the above example look okay w.r.t. the requirements outlined?

I think I may have to set start_to_close_timeout lower so that worker failures etc. are detected quickly and the activity gets restarted?

maxim · June 7, 2024, 5:34pm

Your code looks fine. Activity will be retried automatically on any failure.

Suresh: "I think I may have to set start_to_close_timeout lower so that worker failures etc. are detected quickly and the activity gets restarted?"

The failure will be detected after 30 seconds due to the heartbeat timeout. BTW, if start_to_close is not specified, it defaults to schedule_to_close, so you don’t need to specify it in this case.
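
The two rules in this reply can be written down as a small model, using the values from the example earlier in the thread. This is only an illustration of the reply's statements, not SDK code: `effective_start_to_close` and `worst_case_detection` are hypothetical helper names.

```python
from datetime import timedelta
from typing import Optional


def effective_start_to_close(
    schedule_to_close: timedelta,
    start_to_close: Optional[timedelta] = None,
) -> timedelta:
    """Model the default: an unset start_to_close falls back to schedule_to_close."""
    return start_to_close if start_to_close is not None else schedule_to_close


def worst_case_detection(
    start_to_close: timedelta,
    heartbeat_timeout: Optional[timedelta] = None,
) -> timedelta:
    """Model how long a stalled attempt can go unnoticed.

    With heartbeating, a missing heartbeat is noticed after heartbeat_timeout;
    without it, only the start-to-close timeout applies.
    """
    if heartbeat_timeout is None:
        return start_to_close
    return min(start_to_close, heartbeat_timeout)


# Values from the example in this thread.
stc = effective_start_to_close(schedule_to_close=timedelta(minutes=120))
print(stc)  # 2:00:00
print(worst_case_detection(stc, heartbeat_timeout=timedelta(seconds=30)))  # 0:00:30
```

In this model, lowering start_to_close_timeout does not speed up detection as long as the heartbeat timeout is the smaller of the two values.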
