I ran into an issue where my workflow scheduled a Temporal activity with a high number of retries (intended to keep retrying over the course of a day), and the activity continuously exceeded its start-to-close timeout. The timed-out attempts eventually filled up the available worker threads, so other activity and workflow tasks were no longer being assigned to workers. In the end I had to roll out a new deployment that tuned the start-to-close timeout, and then reset the workflows to the stage before the activity so that it would be rescheduled with that same timeout. I have the following questions on how to better remediate this kind of issue in the future:
- For new schedules of the activity, could I set a different startToClose timeout through a dynamic config? Would this break any determinism constraints for Temporal workflows?
activityOptionsBuilder.setStartToCloseTimeout(dynamicConf.getTemporalActivityStartToCloseTimeout())
- For an already scheduled activity, is it possible to modify the start-to-close timeout, so that I can avoid resetting the activity just to modify the retry policy?