Datadog Tracing

We use Datadog for tracing most of our application today. They happen to have good coverage for our libraries here: dd-trace-go/contrib at v1 · DataDog/dd-trace-go · GitHub

But while they claim to be OpenTracing compatible, they aren't in practice: Datadog spans aren't linkable to OpenTracing spans and vice versa. This significantly reduces the benefit of tracing for us. I left a comment here to confirm/clarify: question: opentracing.SpanFromContext(request.Context()) returns nil inside a datadog span · Issue #813 · DataDog/dd-trace-go · GitHub
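
For illustration, a minimal sketch of the mismatch (assuming dd-trace-go v1's tracer package and opentracing-go; the operation name is made up):

package main

import (
	"context"
	"fmt"

	"github.com/opentracing/opentracing-go"
	"gopkg.in/DataDog/dd-trace-go.v1/ddtrace/tracer"
)

func main() {
	tracer.Start()
	defer tracer.Stop()

	// Start a Datadog span; dd-trace-go stores it under its own context key.
	span, ctx := tracer.StartSpanFromContext(context.Background(), "example.operation")
	defer span.Finish()

	// The OpenTracing helper looks under a different context key, so it finds nothing.
	fmt.Println(opentracing.SpanFromContext(ctx) == nil) // prints true
}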

One possible alternative is to use the ContextPropagator to propagate the span context separately from the opentracing one, but I do not know how/where to create the spans. I also don’t see a place to finish the spans.
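
To make the trade-off concrete, here is a rough sketch of what such a propagator could look like. The names (NewDDContextPropagator, ddHeaderKey, ddSpanCtxKey) are hypothetical; it assumes the SDK's workflow.ContextPropagator interface, the default data converter, and dd-trace-go's TextMapCarrier. It carries the span context across, but note there is still no hook in it to start or finish spans:

package temporal

import (
	"context"

	commonpb "go.temporal.io/api/common/v1"
	"go.temporal.io/sdk/converter"
	"go.temporal.io/sdk/workflow"
	"gopkg.in/DataDog/dd-trace-go.v1/ddtrace/tracer"
)

// ddHeaderKey is a hypothetical header name for the propagated span context.
const ddHeaderKey = "_dd-trace-context"

type ddSpanCtxKey struct{}

type ddContextPropagator struct{}

func NewDDContextPropagator() workflow.ContextPropagator { return &ddContextPropagator{} }

// Inject writes the current Datadog span context (if any) into the Temporal header.
func (p *ddContextPropagator) Inject(ctx context.Context, writer workflow.HeaderWriter) error {
	span, ok := tracer.SpanFromContext(ctx)
	if !ok {
		return nil
	}
	carrier := tracer.TextMapCarrier{}
	if err := tracer.Inject(span.Context(), carrier); err != nil {
		return err
	}
	payload, err := converter.GetDefaultDataConverter().ToPayload(map[string]string(carrier))
	if err != nil {
		return err
	}
	writer.Set(ddHeaderKey, payload)
	return nil
}

// InjectFromWorkflow has no Datadog span to read on the workflow side, so it is a no-op here.
func (p *ddContextPropagator) InjectFromWorkflow(ctx workflow.Context, writer workflow.HeaderWriter) error {
	return nil
}

// Extract restores the propagated span context into the activity's Go context.
func (p *ddContextPropagator) Extract(ctx context.Context, reader workflow.HeaderReader) (context.Context, error) {
	err := reader.ForEachKey(func(key string, payload *commonpb.Payload) error {
		if key != ddHeaderKey {
			return nil
		}
		carrier := map[string]string{}
		if err := converter.GetDefaultDataConverter().FromPayload(payload, &carrier); err != nil {
			return err
		}
		spanCtx, err := tracer.Extract(tracer.TextMapCarrier(carrier))
		if err != nil {
			return err
		}
		// The parent span context is available here, but the propagator gives no
		// hook to start a child span around the activity or to finish it.
		ctx = context.WithValue(ctx, ddSpanCtxKey{}, spanCtx)
		return nil
	})
	return ctx, err
}

func (p *ddContextPropagator) ExtractToWorkflow(ctx workflow.Context, reader workflow.HeaderReader) (workflow.Context, error) {
	return ctx, nil
}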

Maybe workflow interceptors? But there’s no activity interceptor for golang yet.

Any thoughts on the “easiest” way to proceed?

I think the correct answer is to fix the Golang interceptors and add the activity interceptors. Then adding support for custom tracing would be trivial.

hey @maxim

Trying to understand the work that needs to be done here. If I implemented a version of this for Datadog, would I miss activity traces? sdk-go/tracing_interceptor.go at b1c3a91d252ed4177afaba9f6550dea40c434180 · temporalio/sdk-go · GitHub

I believe we fixed the Golang interceptors already. Let me check.

Yes, I think they have all the needed features.

We support both OpenTracing and OpenTelemetry, and if neither of those works for you, AFAIK you can provide your own OpenTelemetry context propagator (not to be confused with Temporal context propagators) to connect the Datadog spans to OTel spans and back.

Yeah, Datadog specifically has some weirdness in their implementation. I've personally updated my codebase to use OpenTracing, and that addressed my problems even before the interceptors fully worked.
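
For reference, one way to wire that up: a sketch assuming dd-trace-go's ddtrace/opentracer bridge and the NewInterceptor from go.temporal.io/sdk/contrib/opentracing; the service name and newTracedClient are placeholders.

package main

import (
	"github.com/opentracing/opentracing-go"
	"go.temporal.io/sdk/client"
	temporalot "go.temporal.io/sdk/contrib/opentracing"
	"go.temporal.io/sdk/interceptor"
	"gopkg.in/DataDog/dd-trace-go.v1/ddtrace/opentracer"
	"gopkg.in/DataDog/dd-trace-go.v1/ddtrace/tracer"
)

func newTracedClient() (client.Client, error) {
	// opentracer.New returns an opentracing.Tracer backed by Datadog and starts
	// the underlying dd-trace tracer.
	ddTracer := opentracer.New(tracer.WithServiceName("my-service"))
	opentracing.SetGlobalTracer(ddTracer)

	// Wrap it with Temporal's OpenTracing interceptor.
	tracingInterceptor, err := temporalot.NewInterceptor(temporalot.TracerOptions{Tracer: ddTracer})
	if err != nil {
		return nil, err
	}

	return client.NewClient(client.Options{
		Interceptors: []interceptor.ClientInterceptor{tracingInterceptor},
	})
}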

Got it working; it seems to work well with Datadog. Here's a Datadog interceptor implementation if someone needs it.

package temporal

import (
	"context"
	"fmt"

	"go.temporal.io/sdk/interceptor"
	"go.temporal.io/sdk/log"
	"gopkg.in/DataDog/dd-trace-go.v1/ddtrace"
	"gopkg.in/DataDog/dd-trace-go.v1/ddtrace/tracer"
)

type TextMap struct {
	Map map[string]string
}

var _ tracer.TextMapReader = new(TextMap)
var _ tracer.TextMapWriter = new(TextMap)

func newTextMap() *TextMap {
	return &TextMap{Map: map[string]string{}}
}

func (tm *TextMap) Set(key, val string) {
	tm.Map[key] = val
}
func (tm *TextMap) ForeachKey(handler func(key, val string) error) error {
	for k, v := range tm.Map {
		if err := handler(k, v); err != nil {
			return err
		}
	}
	return nil
}

type spanContextKey struct{}

const defaultHeaderKey = "_tracer-data"

type ddTracer struct {
	interceptor.BaseTracer
	options TracerOptions
}

func NewTracer(options TracerOptions) interceptor.Tracer {
	return &ddTracer{options: options}
}

// NewTracingInterceptor creates an interceptor, for setting on client options,
// that implements Datadog tracing for workflows and activities.
func NewTracingInterceptor(options TracerOptions) interceptor.Interceptor {
	t := NewTracer(options)
	return interceptor.NewTracingInterceptor(t)
}

// TracerOptions are options provided to NewTracingInterceptor or NewTracer.
type TracerOptions struct {
	// DisableSignalTracing can be set to disable signal tracing.
	DisableSignalTracing bool

	// DisableQueryTracing can be set to disable query tracing.
	DisableQueryTracing bool
}

func (t *ddTracer) Options() interceptor.TracerOptions {
	return interceptor.TracerOptions{
		DisableSignalTracing: t.options.DisableSignalTracing,
		DisableQueryTracing:  t.options.DisableQueryTracing,
		SpanContextKey:       spanContextKey{},
		HeaderKey:            defaultHeaderKey,
	}
}

func (t *ddTracer) UnmarshalSpan(m map[string]string) (interceptor.TracerSpanRef, error) {
	textMap := &TextMap{Map: m}
	spanCtx, err := tracer.Extract(textMap)
	if err != nil {
		return nil, err
	}
	return &tracerSpanRef{SpanContext: spanCtx}, nil
}

func (t *ddTracer) MarshalSpan(span interceptor.TracerSpan) (map[string]string, error) {
	textMap := newTextMap()
	if err := tracer.Inject(span.(*tracerSpan).Span.Context(), textMap); err != nil {
		return nil, err
	}
	return textMap.Map, nil
}

func (t *ddTracer) SpanFromContext(ctx context.Context) interceptor.TracerSpan {
	span, found := tracer.SpanFromContext(ctx)
	if !found {
		return nil
	}
	return &tracerSpan{Span: span}
}

func (t *ddTracer) ContextWithSpan(ctx context.Context, span interceptor.TracerSpan) context.Context {
	return tracer.ContextWithSpan(ctx, span.(*tracerSpan).Span)
}

func (t *ddTracer) StartSpan(opts *interceptor.TracerStartSpanOptions) (interceptor.TracerSpan, error) {
	// Determine the parent span context, if any
	var parent ddtrace.SpanContext
	switch p := opts.Parent.(type) {
	case nil:
		// No parent; start a root span
	case *tracerSpan:
		parent = p.Span.Context()
	case *tracerSpanRef:
		parent = p.SpanContext
	default:
		return nil, fmt.Errorf("unrecognized parent type %T", p)
	}

	span := tracer.StartSpan(opts.Operation+":"+opts.Name, tracer.ChildOf(parent), tracer.StartTime(opts.Time))

	// Set tags
	for k, v := range opts.Tags {
		span.SetTag(k, v)
	}

	return &tracerSpan{Span: span}, nil
}

func (t *ddTracer) GetLogger(logger log.Logger, ref interceptor.TracerSpanRef) log.Logger {
	var spanCtx ddtrace.SpanContext
	switch p := ref.(type) {
	case *tracerSpan:
		spanCtx = p.Span.Context()
	case *tracerSpanRef:
		spanCtx = p.SpanContext
	default:
		return logger
	}
	return log.With(logger, "dd.trace_id", spanCtx.TraceID(), "dd.span_id", spanCtx.SpanID())
}

type tracerSpanRef struct{ ddtrace.SpanContext }

type tracerSpan struct{ ddtrace.Span }

func (t *tracerSpan) Finish(opts *interceptor.TracerFinishSpanOptions) {
	// Will ignore if error is nil
	t.Span.Finish(tracer.WithError(opts.Error))
}

Use like:

	return client.NewClient(client.Options{
		Interceptors: []interceptor.ClientInterceptor{
			NewTracingInterceptor(TracerOptions{}),
		},
	})

When there are errors with a workflow (e.g. a determinism error), I do get strange traces that fill up the trace view. It seems like a retry of a workflow invocation creates strange traces with odd start and end times.

Maybe when the workflow is invoked again it uses the workflow's original start time as the trace start time, even though it's actually starting much later.
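
If that's the suspicion, one hypothetical way to check is to tweak the StartSpan shown above to tag each span with the start time the SDK hands the interceptor, so retried invocations that reuse an earlier timestamp stand out (requires "time" in the imports; the tag name is made up):

	span := tracer.StartSpan(opts.Operation+":"+opts.Name, tracer.ChildOf(parent), tracer.StartTime(opts.Time))
	// Record the SDK-supplied start time for comparison against the span's wall-clock start.
	span.SetTag("temporal.interceptor_start_time", opts.Time.Format(time.RFC3339Nano))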

Is there support for tracing in the Python SDK yet?