Fix Workflow level OTLP tracing with Python workers

Hi,

I am having an issue with OpenTelemetry tracing in Python workers. The only traces being captured are of type RunActivity. There are no RunWorkflow traces acting as parents of the RunActivity or StartActivity traces.

I followed these instructions to set up instrumentation - Instrumentation | OpenTelemetry

And this to configure Temporal to emit OTLP traces using the above instrumentation - Observability - Python SDK | Temporal Platform Documentation

Here is my tracer instrumentation logic

from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter

provider = TracerProvider()
provider.add_span_processor(BatchSpanProcessor(OTLPSpanExporter()))
trace.set_tracer_provider(provider)

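And here is my Temporal config to intercept and emit traces

from temporalio.client import Client
from temporalio.contrib.opentelemetry import TracingInterceptor

# Client.connect is a coroutine, so this runs inside an async function
client = await Client.connect(
    config.temporal_address,
    namespace="default",
    tls=False,
    interceptors=[TracingInterceptor()],
)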

The result is emitted traces that look like this. Note that they are all RunActivity traces; there are no RunWorkflow or StartActivity traces.

I have also configured this for Java workers. The configuration was very similar. The same things were created and set, just using Java libraries and paradigms.

The result is Java traces that include RunWorkflow, RunActivity, and StartActivity traces. The RunWorkflow traces act as parents of StartActivity. This is the behaviour that I want from my Python workers.

How can I get my Python workers to emit RunWorkflow and StartActivity level tracing in the same way that Java does? Is this possible at all, or is there some limitation for Python workers?

I’ve looked through the Temporal observability documentation and have not found any other relevant configuration or information about this.

Here’s a screenshot showing the Java trace output, including RunWorkflow and StartActivity traces.

And here’s a screenshot of my Go worker trace output. Even better than the Java output, all traces are under the RunWorkflow parent.

By default, the Python SDK, unlike the Java SDK, does not emit spans for workflows if the client did not start the workflow with a span. The client span would be StartWorkflow, and it is not emitted if the workflow was started from the CLI, a schedule, or a client without that interceptor. We do this because there is a risk of orphaned spans on replay (there is no stable span created client side to act as the overarching parent).
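
To make the default rule concrete, here is a simplified model of the decision in plain Python. It is only a sketch and not the SDK's actual code: the function name and the header key are hypothetical, and the always_create_workflow_spans argument stands in for the interceptor option of the same name.

```python
from typing import Mapping

# Hypothetical header key for this sketch; the real SDK's key may differ.
TRACER_HEADER = "tracer-context"


def should_emit_workflow_span(
    headers: Mapping[str, bytes],
    always_create_workflow_spans: bool = False,
) -> bool:
    """Simplified model: emit a workflow-side span only if the start
    request carried a client span context, unless the override is set."""
    has_client_span = TRACER_HEADER in headers
    return has_client_span or always_create_workflow_spans


# Started by a client that used the tracing interceptor: context is present.
started_by_traced_client = should_emit_workflow_span({TRACER_HEADER: b"ctx"})

# Started from the CLI or a schedule: no context, so no workflow spans.
started_from_cli = should_emit_workflow_span({})

# Same start path, but with the override enabled.
started_from_cli_with_override = should_emit_workflow_span(
    {}, always_create_workflow_spans=True
)
```

In this model, only the CLI or schedule start path without the override yields no workflow spans, which matches a trace output that contains only RunActivity.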

As part of this PR, we added an option to that interceptor called always_create_workflow_spans, which you can set to True to override this behavior and always create spans workflow side even if there is no span client side. This was released as part of v1.11.0 last week.
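
For reference, a minimal sketch of enabling the option, assuming temporalio v1.11.0 or later and a tracer provider already registered as in the original post. The connect helper and its address parameter are placeholders for illustration; the namespace and tls values are copied from the question.

```python
from temporalio.client import Client
from temporalio.contrib.opentelemetry import TracingInterceptor


async def connect(address: str) -> Client:
    """Hypothetical helper: connect with workflow spans always enabled."""
    # Create workflow-side spans even when no client-side
    # StartWorkflow span exists (CLI or schedule starts, for example).
    interceptor = TracingInterceptor(always_create_workflow_spans=True)
    return await Client.connect(
        address,
        namespace="default",
        tls=False,
        interceptors=[interceptor],
    )
```

Workers built from the returned client should pick up the same interceptor, so it does not need to be passed to the Worker a second time.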

Thank you very much for the quick reply. I have enabled that interceptor parameter and will test it out.

Tested, that helped. Thanks.