History size exceeds warn limit - scheduled workflow

I started receiving the “history size exceeds warn limit.” warning from a Temporal scheduled workflow:

{"level":"warn","ts":"2023-12-19T17:59:00.458Z","msg":"history size exceeds warn limit.","service":"history","shard-id":1280,"address":"10.0.0.121:7234","wf-namespace-id":"79084a7e-3106-4464-9ce9-64132a3a0168","wf-id":"temporal-sys-scheduler:agent_heartbeat","wf-run-id":"2f6f18ed-8de7-490a-b73d-81e6db37f7ed","wf-history-size":4480999,"wf-event-count":14300,"logging-call-at":"context.go:880"}

It’s my understanding that ContinueAsNew is used automatically after 500 actions for scheduled workflows. Even though I can see from the Temporal logs that ContinueAsNew is being used, I still ended up with this warning.

Is there a recommended approach for handling this?

  1. Increase the warning and max thresholds?
  2. Reduce the number of actions before a scheduled workflow switches to ContinueAsNew?
  3. Something else?

Also, is there a Prometheus stat where the history size is tracked?

Hi @davidn

You need to invoke ContinueAsNew explicitly from your workflow code, ideally before the workflow history reaches 10k events.
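For a regular (non-Schedules) workflow, the explicit ContinueAsNew pattern in the Java SDK looks roughly like the sketch below. The `HeartbeatWorkflow` interface and the iteration cap are hypothetical; `Workflow.continueAsNew` and the `isContinueAsNewSuggested()` hint are real SDK APIs, but the hint requires a recent SDK and server version, so verify availability before relying on it.

```java
import io.temporal.workflow.Workflow;

import java.time.Duration;

public class HeartbeatWorkflowImpl implements HeartbeatWorkflow {

    // Hypothetical cap; the point is to restart well before ~10k history events.
    private static final int MAX_ITERATIONS = 1000;

    @Override
    public void run() {
        int iteration = 0;
        // Loop until our own cap is hit or the server suggests continuing as new.
        while (iteration < MAX_ITERATIONS
                && !Workflow.getInfo().isContinueAsNewSuggested()) {
            // ... one unit of work per iteration, e.g. an activity call ...
            Workflow.sleep(Duration.ofMinutes(1));
            iteration++;
        }
        // Restart the workflow with a fresh, empty history.
        // State to carry over would be passed as arguments here.
        Workflow.continueAsNew();
    }
}
```

This does not apply directly to the Schedules system workflow, which Temporal manages itself, but it is the standard approach for user-authored long-running workflows.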

Can I ask which SDK you use?

The warning is emitted here: temporal/service/history/workflowExecutionContext.go at v1.7.0 · temporalio/temporal · GitHub. You can check the other limits here: Self-hosted Temporal Cluster defaults | Temporal Documentation.
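If you do decide to raise the thresholds, the history size and event-count limits are dynamic config values. A sketch of the relevant entries is below; the key names and defaults shown here are my best understanding and should be verified against the defaults page for your server version before use.

```yaml
# Dynamic config sketch (verify key names against your server version).
# Sizes are in bytes; defaults are believed to be 10 MB warn / 50 MB error.
limit.historySize.warn:
  - value: 10485760
limit.historySize.error:
  - value: 52428800
# Event counts; defaults are believed to be 10240 warn / 51200 error.
limit.historyCount.warn:
  - value: 10240
limit.historyCount.error:
  - value: 51200
```

Raising the limits only silences the symptom, though; keeping histories small via ContinueAsNew is generally preferable.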

Regards,
Antonio

Hi Antonio,
Thanks for the reply. This is a Temporal Schedules workflow, so I do not have control over how the workflow is invoked.

I’m using the Java SDK, if that helps, but I am thinking the solution will be within Temporal itself, since Temporal Schedules is the one invoking the workflow.

You can just ignore the warning. Newer versions of the server use the continue-as-new-suggested hint, so it shouldn’t reach the warn limit.

Although it’s unusual to get that high… are you doing lots of UpdateSchedule or PatchSchedule calls?

Thanks!

Do you know in which version continue-as-new-suggested was added? I’m running Temporal server 1.22.2.

This schedule runs every minute; I’m using it to execute a heartbeat. The schedule itself is never updated after creation (no calls to either UpdateSchedule or PatchSchedule).

It’s in server 1.22.3 (and will be in 1.23.0).

If you’re willing to share the history, I wouldn’t mind taking a look to see if there’s a bug. You can send it in a DM on slack (Slack)