Got error: corrupted history event batch, eventID is not continuous

Sergey_Samoylenko · October 20, 2020, 7:52am

We have Temporal 1.0.0 installed in k8s from the official Helm chart. Temporal works fine until, at some point, workflow processing gets stuck and we see numerous errors in the clients’ logs:

"Failure in thread Workflow Poller taskQueue=“indexing-orchestrator”, namespace=“default”: 2
io.grpc.StatusRuntimeException: INTERNAL: corrupted history event batch, eventID is not continuous
at io.grpc.stub.ClientCalls.toStatusRuntimeException(ClientCalls.java:262)
at io.grpc.stub.ClientCalls.getUnchecked(ClientCalls.java:243)
at io.grpc.stub.ClientCalls.blockingUnaryCall(ClientCalls.java:156)
at io.temporal.api.workflowservice.v1.WorkflowServiceGrpc$WorkflowServiceBlockingStub.pollWorkflowTaskQueue(WorkflowServiceGrpc.java:2658)
at io.temporal.internal.worker.WorkflowPollTask.poll(WorkflowPollTask.java:77)

We have had this error since 0.28 and attributed it either to instability of the alpha version or to failures caused by Cassandra node rotation.

But now it is happening on 1.0.0 with a 6-node Cassandra cluster, and none of the nodes had restarted, so that cannot be the cause.

derek · October 20, 2020, 12:01pm

How is your Cassandra deployed? Can you provide some details on your configuration? Are you using a replication factor of 3 for Temporal’s keyspace? How many racks are in your Cassandra cluster?

Other details of your configuration could help here as well.
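
For anyone reading along: the message suggests that the event IDs inside a stored history batch did not increase by exactly one from one event to the next. Below is a minimal sketch of that kind of check, to illustrate what "not continuous" means. It is not Temporal's server code; the class name `EventBatchCheck` and the method `firstGap` are hypothetical and exist only for this illustration.

```java
import java.util.List;

// Hypothetical illustration only; not taken from the Temporal code base.
public class EventBatchCheck {

    // Returns the index of the first event whose ID is not (previous ID + 1),
    // or -1 if the whole batch is continuous.
    public static int firstGap(List<Long> eventIds) {
        for (int i = 1; i < eventIds.size(); i++) {
            long previous = eventIds.get(i - 1);
            long current = eventIds.get(i);
            if (current != previous + 1) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // A continuous batch: 5, 6, 7
        System.out.println(firstGap(List.of(5L, 6L, 7L)));
        // A batch with a missing event between 6 and 8
        System.out.println(firstGap(List.of(5L, 6L, 8L)));
    }
}
```

A batch that fails a check like this points at missing or out-of-order rows in the persistence layer, which is why the Cassandra replication and topology details matter here.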
