To set up an SLO for error rate, which metric would be best?

Himanshu_Garg · February 4, 2026, 9:24am

To set up an SLO for error rate, which metric would be best: service_errors, or service_error_with_type with client error types excluded?

tihomir · March 1, 2026, 3:04am

service_error_with_type, excluding client error types.

Which operations are you including in your SLO check?

Himanshu_Garg · March 2, 2026, 7:31am

Just for context: we have a self-hosted Temporal cluster. We want to set up alerts that fire when there is an actual server-side problem.
As per the Temporal OSS code, all expected and client-side errors are excluded from the service_errors metric:

func isExpectedErrorByType(err error) bool {
	// This is not a full list of service errors.
	// Only errors with status code that fails the isExpectedErrorByStatusCode() check
	// but are actually expected need to be explicitly handled here.
	//
	// Some of the errors listed below does not failed the isExpectedErrorByStatusCode() check
	// but are listed nonetheless.
	switch err := err.(type) {
	case *serviceerror.ResourceExhausted:
		return err.Scope == enumspb.RESOURCE_EXHAUSTED_SCOPE_NAMESPACE
	case *serviceerror.Canceled,
		*serviceerror.AlreadyExists,
		*serviceerror.CancellationAlreadyRequested,
		*serviceerror.FailedPrecondition,
		*serviceerror.NamespaceInvalidState,
		*serviceerror.NamespaceNotActive,
		*serviceerror.NamespaceNotFound,
		*serviceerror.NamespaceAlreadyExists,
		*serviceerror.InvalidArgument,
		*serviceerror.WorkflowExecutionAlreadyStarted,
		*serviceerror.WorkflowNotReady,
		*serviceerror.NotFound,
		*serviceerror.QueryFailed,
		*serviceerror.ClientVersionNotSupported,
		*serviceerror.ServerVersionNotSupported,
		*serviceerror.PermissionDenied,
		*serviceerror.NewerBuildExists,
		*serviceerrors.StickyWorkerUnavailable,
		*serviceerrors.TaskAlreadyStarted,
		*serviceerrors.RetryReplication,
		*serviceerrors.SyncState:
		return true
	default:
		return false
	}
}

In that case, wouldn't it be better to use service_errors rather than service_error_with_type? With service_error_with_type we would take on the overhead of maintaining our own list of error types to exclude.
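
To make the trade-off concrete, here is a minimal sketch of the two calculations, using only the Go standard library. It assumes the request total comes from a counter such as service_requests, and that per-type error counts have already been scraped into a map. The error type strings, the exclusion list, and all function names are hypothetical and only illustrate the idea; they are not taken from the Temporal code base.

```go
package main

import "fmt"

// clientErrorTypes is a hypothetical exclusion list needed for the
// service_error_with_type approach. It must be kept up to date by hand.
var clientErrorTypes = map[string]bool{
	"InvalidArgument":  true,
	"NotFound":         true,
	"AlreadyExists":    true,
	"PermissionDenied": true,
}

// serverErrors sums per-type error counts, skipping excluded types.
func serverErrors(byType map[string]float64, excluded map[string]bool) float64 {
	var total float64
	for errType, count := range byType {
		if !excluded[errType] {
			total += count
		}
	}
	return total
}

// errorRate returns errors/requests, or 0 when there were no requests.
func errorRate(errors, requests float64) float64 {
	if requests <= 0 {
		return 0
	}
	return errors / requests
}

// meetsSLO reports whether the error rate fits the error budget (1 - target).
func meetsSLO(rate, target float64) bool {
	return rate <= 1-target
}

func main() {
	const requests = 10000.0 // assumed total over the SLO window
	const target = 0.999     // assumed availability target

	// Approach 1: service_errors already excludes expected errors.
	rate1 := errorRate(5, requests)

	// Approach 2: filter service_error_with_type counts ourselves.
	byType := map[string]float64{"InvalidArgument": 40, "Unavailable": 3, "Internal": 2}
	rate2 := errorRate(serverErrors(byType, clientErrorTypes), requests)

	fmt.Printf("service_errors: rate=%.4f ok=%v\n", rate1, meetsSLO(rate1, target))
	fmt.Printf("service_error_with_type: rate=%.4f ok=%v\n", rate2, meetsSLO(rate2, target))
}
```

The second approach only matches the first as long as the exclusion list stays in sync with what the server treats as expected, which is the maintenance overhead in question.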
