Should PayloadConverter/PayloadCodec be deterministic or not?

In the official TypeScript SDK, Workflow code is required to be deterministic. I wonder whether this rule also applies to PayloadConverter and PayloadCodec.

When I import the crypto module, I get errors from the SDK.

I want to generate a session key (encrypted with a public key) in the payload converter and place it in the metadata, so that it can later be used to decrypt and decode the payload.

import { METADATA_ENCODING_KEY, Payload, PayloadConverterWithEncoding } from '@temporalio/common';
import { encode } from '@temporalio/common/lib/encoding';

class SensitiveDataPayloadConverter implements PayloadConverterWithEncoding {
    public encodingType = 'binary/encrypted';

    // Encrypts the data with a fresh session key and returns the ciphertext
    // together with the session key wrapped by the public key.
    private encrypt(data: Buffer): [Buffer, string] {
        throw new Error('to be implemented');
    }

    public toPayload(value: any): Payload | undefined {
        if (!(value instanceof SensitiveData)) {
            return undefined; // let the other converters handle this value
        }
        if (typeof (value as SensitiveData<any>).value !== 'string') {
            throw new Error('unsupported');
        }
        const [data, sessionKey] = this.encrypt(Buffer.from(value.value, 'utf-8'));

        return {
            metadata: {
                [METADATA_ENCODING_KEY]: encode('binary/encrypted'),
                privateKeyId: encode('to-be-implemented'),
                sessionKey: encode(sessionKey),
                originalEncoding: encode('plain/text'),
            },
            data,
        };
    }

    public fromPayload<T>(payload: Payload): T {
        // to be implemented: unwrap the session key and decrypt the data
        throw new Error('to be implemented');
    }
}
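For reference, this is roughly what encrypt is meant to do: a sketch that assumes Node's crypto module is usable (that import is exactly what triggers the SDK errors), with a hypothetical recipientPublicKeyPem argument:

// Sketch only: what `encrypt` is intended to do, using Node's crypto module.
import { createCipheriv, publicEncrypt, randomBytes } from 'crypto';

// `recipientPublicKeyPem` is a hypothetical PEM-encoded RSA public key.
function hybridEncrypt(data: Buffer, recipientPublicKeyPem: string): [Buffer, string] {
    // Fresh AES-256-GCM session key and IV for every payload (non-deterministic by design).
    const sessionKey = randomBytes(32);
    const iv = randomBytes(12);
    const cipher = createCipheriv('aes-256-gcm', sessionKey, iv);
    const ciphertext = Buffer.concat([cipher.update(data), cipher.final()]);
    const authTag = cipher.getAuthTag();
    // Wrap the session key with the recipient's public key so that only the
    // private-key holder can recover it.
    const wrappedSessionKey = publicEncrypt(recipientPublicKeyPem, sessionKey);
    // Ship IV + auth tag + ciphertext as the payload data, and the wrapped
    // session key (base64) in the payload metadata.
    return [Buffer.concat([iv, authTag, ciphertext]), wrappedSessionKey.toString('base64')];
}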

I implemented SensitiveDataPayloadConverter because I want to let developers decide which values get encrypted.

async function my_activity() {
  return new SensitiveData(db_token);
}

Or, should I encrypt and decrypt messages in activities if I want to do this?

PayloadConverters run inside the Workflow sandbox, and are therefore subject to the same restrictions as Workflow code. PayloadConverters are meant to convert your data (e.g., your JavaScript objects) into Protobuf Payload objects; they are essentially serializers/deserializers. The default PayloadConverter simply serializes objects to JSON. With that in mind, it should be clear that PayloadConverters are fundamentally deterministic.
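For instance, the default converter turns a plain JavaScript object into a json/plain payload, and the same input always produces the same payload. A small sketch (the example object is arbitrary, and exact exports may vary slightly by SDK version):

import { defaultPayloadConverter } from '@temporalio/common';
import { decode } from '@temporalio/common/lib/encoding';

// Same input always yields the same payload: the converter is a pure function.
const payload = defaultPayloadConverter.toPayload({ orderId: 42 });
console.log(decode(payload.metadata!['encoding'])); // 'json/plain'
console.log(decode(payload.data!));                 // '{"orderId":42}'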

PayloadCodecs run outside of the Workflow sandbox, and are therefore not subject to determinism constraints. They are meant to transform Protobuf payload objects before they go to the wire / get written to the Workflow History. Common use cases for that include encryption, compression, moving large payloads to an object store, etc.
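A codec is registered on both the Client and the Worker through the dataConverter option, roughly like this (a sketch; MyCodec stands in for your own PayloadCodec implementation):

import type { PayloadCodec } from '@temporalio/common';
import { Worker } from '@temporalio/worker';

declare const MyCodec: { new (): PayloadCodec }; // stand-in for your own codec class

async function run() {
  const worker = await Worker.create({
    workflowsPath: require.resolve('./workflows'),
    taskQueue: 'my-task-queue',
    dataConverter: {
      // Codecs run outside the sandbox, so they are free to use crypto, network calls, etc.
      // Pass the same dataConverter configuration when creating the Client, so both ends agree.
      payloadCodecs: [new MyCodec()],
    },
  });
  await worker.run();
}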

Back to your use case, you can easily implement payload encryption in a PayloadCodec. If you want to make encryption optional (e.g., only encrypt a payload if the object is an instance of SensitiveData), then you may use both a PayloadConverter and a PayloadCodec: the converter serializes the object and adds a specific metadata field indicating that the payload needs to be encrypted; the codec then looks for that metadata and, if it is present, performs the encryption.
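Concretely, that combination could look roughly like this (a sketch; the encrypt-me metadata key and the encryptPayload/decryptPayload helpers are made up for illustration and are not part of the SDK):

import { METADATA_ENCODING_KEY, Payload, PayloadCodec } from '@temporalio/common';
import { decode } from '@temporalio/common/lib/encoding';

// Hypothetical helpers; in practice these wrap your actual encryption code.
declare function encryptPayload(p: Payload): Promise<Payload>;
declare function decryptPayload(p: Payload): Promise<Payload>;

// Only encrypt the payloads that the converter flagged through metadata.
class SelectiveEncryptionCodec implements PayloadCodec {
  async encode(payloads: Payload[]): Promise<Payload[]> {
    return Promise.all(
      payloads.map(async (p) => (p.metadata?.['encrypt-me'] ? encryptPayload(p) : p))
    );
  }

  async decode(payloads: Payload[]): Promise<Payload[]> {
    return Promise.all(
      payloads.map(async (p) => {
        const encoding = p.metadata?.[METADATA_ENCODING_KEY];
        if (!encoding || decode(encoding) !== 'binary/encrypted') return p;
        return decryptPayload(p);
      })
    );
  }
}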

Thanks a lot. I came up with the same solution.

@jwatkins
What’s your opinion on using OpenPGP for data protection instead of reinventing the wheel with raw crypto operations? It would be easier to integrate with existing tools if we use OpenPGP.

However, OpenPGP has its own metadata structure, so I feel weird (personally) about putting it in the payload data field.

I have no strong opinion on this, though I have to admit that this sounds like a curious direction to take. Can you explain a bit how you would use OpenPGP in this context, and why you think it will be easier? In particular, which identities would you be using in your system? The very same payload may need to be simultaneously readable from both a Workflow Worker and a Client, or from a Workflow Worker and an Activity Worker… And PayloadCodec has no way to know where a specific payload is headed to… Now, if you end up having only one key that is shared across all clients and workers, then how is OpenPGP easier than using a symmetric encryption algorithm?

Now, you say that an OpenPGP solution would be easier to integrate with existing tools, which may be a perfectly sufficient reason. I can hardly argue with that specific point.

By the way, if you haven’t done so yet, you should really take a look at our encryption sample.

Leaving OpenPGP’s metadata in the payload data would be perfectly acceptable.

Can you explain a bit how you would use OpenPGP in this context, and why you think it will be easier?

OpenPGP supports various crypto algorithms and hash functions; it is quite flexible.
One big reason I chose OpenPGP is key management: it has its own key file format, so people can use existing tools (gpg) to generate keys. If the codec server is down, engineers can still use gpg to debug. gpg has its own keyring, so engineers don’t need to change any arguments when they decrypt messages; they simply install the desired key first.

And PayloadCodec has no way to know where a specific payload is headed to

We have dedicated Temporal workers for each team and one API server to trigger workflows (from Slack, or somewhere else). We will put workflows and activities on dedicated queues. Those workers do not share keys, which is very different from the official encryption sample.
With the SensitiveData structure I mentioned, we can have a hint about which key to use. I think this approach also introduces some pitfalls; I am still thinking about whether it is good to have a different key for each team, since they will use the same Temporal UI but with different visibility of sensitive data. I admit this is a complex design.
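Roughly, the idea is that the codec would pick the key from a metadata hint that the converter sets when it serializes a SensitiveData value. A sketch (the keyId metadata field, the key registry, and the decryption helper are hypothetical):

import { Payload } from '@temporalio/common';
import { decode } from '@temporalio/common/lib/encoding';

// Hypothetical per-team key registry and decryption helper; each team's worker
// would only be configured with its own keys.
declare function getTeamKey(keyId: string): Promise<Buffer | undefined>;
declare function decryptWithKey(key: Buffer, p: Payload): Promise<Payload>;

// Sketch of the codec's decode path: use the `keyId` metadata hint
// (set by the converter when it saw a SensitiveData value) to pick the right key.
async function decodeOne(p: Payload): Promise<Payload> {
  const keyIdBytes = p.metadata?.['keyId'];
  if (!keyIdBytes) return p; // not flagged as sensitive, pass through untouched
  const keyId = decode(keyIdBytes);
  const key = await getTeamKey(keyId);
  if (!key) throw new Error(`no key available for ${keyId}`);
  return decryptWithKey(key, p);
}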

how is OpenPGP easier than using a symmetric encryption algorithm?

We will use asymmetric encryption to encrypt a symmetric session key, which is exactly what OpenPGP already provides out of the box.
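With openpgp.js (the npm openpgp package, v5 API), that hybrid step is a single call. A sketch, assuming the codec has access to an armored public key:

import * as openpgp from 'openpgp';

// Sketch: encrypt raw payload bytes for one recipient. OpenPGP generates the
// symmetric session key and wraps it with the recipient's public key internally.
async function pgpEncrypt(data: Uint8Array, armoredPublicKey: string): Promise<Uint8Array> {
  const encryptionKeys = await openpgp.readKey({ armoredKey: armoredPublicKey });
  const message = await openpgp.createMessage({ binary: data });
  return openpgp.encrypt({ message, encryptionKeys, format: 'binary' });
}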

Sounds good.

I just wanted to make sure that you correctly understood the implications of properly using OpenPGP, rather than simply assuming that it would be easier. With proper key management, what you describe can indeed result in a very powerful and flexible solution, but it does add complexity.
