Best practice for Batch Iterator over MSSQL with error detection at high scale

Hi,
I have a use case that we want to use Temporal for:
we have 2B records in MSSQL that need to be processed.

I used the Batch Iterator sample from here: samples-java/core/src/main/java/io/temporal/samples/batch/iterator at main · temporalio/samples-java · GitHub

So I have a BatchWorkflowImpl that uses an activity, RecordLoaderImpl, to fetch 200 records at a time from MSSQL. Each record is then handed to a child workflow created with Workflow.newChildWorkflowStub and started with Async.function(processor::processRecord, record).

What is the best practice for handling a failure when one of the 200 parallel record processors fails?

Key requirements I want to maintain:

  1. Even if one record fails, I still want the batch iterator to keep processing (up to some failure threshold).
  2. I want to be able to retry/replay the failed records in bulk, not one by one.
  3. I want clear monitoring that shows how many records have failed.
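
To make these three requirements concrete, here is a minimal sketch of the behavior I am after. It is plain Java, not Temporal code: CompletableFuture stands in for the Promise returned by Async.function, and BatchFailureSketch, Result, run, and maxFailures are hypothetical names used only for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.function.Consumer;

// Hypothetical illustration only; not part of the Temporal batch iterator sample.
class BatchFailureSketch {

    /** Outcome of a run: success count, IDs of failed records, and whether the threshold stopped it. */
    static final class Result {
        final int succeeded;
        final List<Long> failedIds;
        final boolean stoppedEarly;

        Result(int succeeded, List<Long> failedIds, boolean stoppedEarly) {
            this.succeeded = succeeded;
            this.failedIds = failedIds;
            this.stoppedEarly = stoppedEarly;
        }
    }

    /**
     * Processes pages of record IDs. Records within a page run in parallel.
     * A failed record is recorded instead of failing the whole page, and the
     * run stops once the total number of failures exceeds maxFailures.
     */
    static Result run(List<List<Long>> pages, Consumer<Long> processor, int maxFailures) {
        int succeeded = 0;
        List<Long> failedIds = new ArrayList<>();
        for (List<Long> page : pages) {
            // Fan out: one asynchronous task per record in the page.
            List<CompletableFuture<Void>> futures = new ArrayList<>();
            for (Long id : page) {
                futures.add(CompletableFuture.runAsync(() -> processor.accept(id)));
            }
            // Wait for every task; collect failures rather than rethrowing them.
            for (int i = 0; i < page.size(); i++) {
                try {
                    futures.get(i).join();
                    succeeded++;
                } catch (CompletionException e) {
                    failedIds.add(page.get(i));
                }
            }
            // Requirement 1: keep iterating until the failure threshold is exceeded.
            if (failedIds.size() > maxFailures) {
                return new Result(succeeded, failedIds, true);
            }
        }
        return new Result(succeeded, failedIds, false);
    }

    public static void main(String[] args) {
        // Simulated processor that fails for every third record.
        Consumer<Long> flaky = id -> {
            if (id % 3 == 0) throw new IllegalStateException("failed record " + id);
        };
        Result first = run(List.of(List.of(1L, 2L, 3L), List.of(4L, 5L, 6L)), flaky, 10);
        // Requirement 3: the failure count is available for monitoring.
        System.out.println("succeeded=" + first.succeeded + " failed=" + first.failedIds.size());
        // Requirement 2: retry all failed records together as one new page.
        Result retry = run(List.of(first.failedIds), id -> { }, 10);
        System.out.println("retry succeeded=" + retry.succeeded);
    }
}
```

What I am unsure about is how to get the same behavior inside the workflow itself: where the list of failed record IDs should live across pages, and how to expose the failure count for monitoring.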

Thanks