Inconsistent/Elevated SDK Metrics (Python SDK)

emcc · November 12, 2025, 6:47pm

Hi there,

My team is working with Temporal via the Python SDK and is piping SDK metrics into Datadog. We’re trying to set up monitoring for workflow failures via the temporal_workflow_failed metric, but are seeing strange/unexpected behavior. For ~70 failed workflows in the past 12 hours, we are seeing the metric register a count in the neighborhood of 7000.

The workflow itself is fairly simple - 1 activity that retries once, no retries of the workflow itself. The metric is being filtered to this exact workflow_type and operating environment. The SDK documentation reads to me like it should only be registering 1 increment for a workflow reaching failed state. Any ideas that might help us chase down the source of this issue or similar experiences are appreciated!

antonio.perez · November 13, 2025, 12:54pm

Hi,

Not sure if this is the issue, but with Datadog you need to configure the OpenTelemetry metrics export to use DELTA temporality instead of CUMULATIVE:

metrics=OpenTelemetryConfig(....
    metric_temporality=OpenTelemetryMetricTemporality.DELTA
),
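
For anyone wondering why the temporality setting can matter this much, here is a minimal sketch in plain Python. It does not use the Temporal SDK or Datadog. The function names and numbers are made up for illustration, and it assumes a backend that adds up every reported point as if it were a delta, which may not be exactly what happened in this case.

```python
# Illustrative only: why cumulative points read as deltas can inflate a count.
# The numbers are invented and this does not model Datadog's real ingestion logic.

def cumulative_points(increments):
    """Running total at each export, as CUMULATIVE temporality reports it."""
    total = 0
    points = []
    for inc in increments:
        total += inc
        points.append(total)
    return points


def delta_points(increments):
    """Change since the previous export, as DELTA temporality reports it."""
    return list(increments)


# One failure in each of the first three export intervals, then none.
increments = [1, 1, 1, 0, 0, 0]

cumulative = cumulative_points(increments)
delta = delta_points(increments)

true_failures = sum(increments)
summed_delta = sum(delta)            # summing deltas recovers the true count
summed_cumulative = sum(cumulative)  # summing running totals overcounts

print(cumulative)                                      # [1, 2, 3, 3, 3, 3]
print(true_failures, summed_delta, summed_cumulative)  # 3 3 15
```

The overcount keeps growing with every export interval, even after failures stop, because the running total is reported again each time. That would be consistent with a count far above the real number of failed workflows.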

emcc · November 13, 2025, 4:10pm

Thanks for this! I’ll try this out and see if it helps!

emcc · December 2, 2025, 7:42pm

Just following up to confirm that this shift to DELTA temporality cleared up our metrics issues here! Thanks so much for the assistance!
