Causes and solutions for DATA_LOSS errors

Ken · July 11, 2025, 11:05pm

Version: v1.13.0

After rolling back the database in the production environment, the following two types of errors began to occur frequently on the client side.

io.grpc.StatusRuntimeException: DATA_LOSS: Incomplete history: expected events [1-4] but got events [1-3] of length 3: isFirstPage=true,isLastPage=true,pageSize=256 at io.grpc.stub.ClientCalls.toStatusRuntimeException
io.grpc.StatusRuntimeException: DATA_LOSS: corrupted history event batch, eventID is not contiguous at io.grpc.stub.ClientCalls.toStatusRuntimeException

Why are these errors occurring, and what steps should be taken to resolve them?

I attempted to reproduce the issue in the development environment but was unable to do so.

At the same time, the CPU usage of the database has increased, causing further issues.

maxim · July 12, 2025, 7:45am

Temporal relies on the DB being fully consistent. It looks like the DB rollback left it in an inconsistent state. At this point the best option is to recreate the DB. If that is not possible, you can try deleting the specific workflows that broke.
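
To illustrate what both error messages are complaining about: a workflow's history is expected to be a gap-free run of event IDs, and a rollback can leave it truncated or with holes. Below is a minimal sketch of that invariant in plain Java. It is not Temporal's actual server code; the class `HistoryContiguityCheck` and the method `firstGap` are hypothetical names made up for this example.

```java
import java.util.List;
import java.util.OptionalLong;

// Hypothetical illustration only; not part of Temporal or its SDKs.
public class HistoryContiguityCheck {

    /**
     * Returns the first expected event ID that is missing or out of order,
     * or empty if the IDs cover [firstExpected, lastExpected] without gaps.
     */
    public static OptionalLong firstGap(List<Long> eventIds, long firstExpected, long lastExpected) {
        long expected = firstExpected;
        for (long id : eventIds) {
            if (id != expected) {
                // A hole or reordering: "eventID is not contiguous".
                return OptionalLong.of(expected);
            }
            expected++;
        }
        // The list ended before the last expected ID: "Incomplete history".
        return expected <= lastExpected ? OptionalLong.of(expected) : OptionalLong.empty();
    }

    public static void main(String[] args) {
        // Mirrors "expected events [1-4] but got events [1-3]": event 4 is missing.
        System.out.println(firstGap(List.of(1L, 2L, 3L), 1, 4));
        // A hole in the middle of the batch: event 3 is missing.
        System.out.println(firstGap(List.of(1L, 2L, 4L), 1, 4));
        // A consistent history: no gap.
        System.out.println(firstGap(List.of(1L, 2L, 3L, 4L), 1, 4));
    }
}
```

Under this reading, any workflow whose stored events fail a check like this after the rollback is one of the "broken" ones that would need to be deleted.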
