How does workflow resumption work?

Hello,

I am curious, how does the state of the workflow Java process get saved? Including local variables?

I assumed reflection is used to read the fields so that they can be serialized. But how do you serialize local variables?

Is this a feature of Java and Go?

It is not really possible to serialize workflow state while it is blocked synchronously on an API call. In some cases it could also be prohibitively expensive to ship the whole state on every update. Another problem with serialization is that in Go, for example, the majority of data structures are not serializable if they contain private fields.

So serialization is not used to recover state. Instead, event sourcing is used. All results of external API calls (which must be made only through APIs provided by the Temporal SDK) are recorded in the event history. Then, on recovery, the workflow code is replayed from the beginning and all API call results are supplied from the recorded events. Assuming that the workflow code is deterministic, it ends up in exactly the same state as it was in before. This way only the input and output arguments of activities, child workflows, and other Temporal SDK API calls have to be serializable.
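To make the replay idea concrete, here is a minimal toy sketch in Go. It does not use the Temporal SDK; `History`, `Call`, `Workflow`, and `Replay` are hypothetical names invented for illustration. It only shows the principle: results of external calls are recorded, and re-running deterministic code against those records rebuilds the same local variables without repeating the calls.

```go
package main

import "fmt"

// History is a hypothetical stand-in for the recorded event history:
// the results of external calls, in the order they completed.
type History struct {
	results []string // recorded call results
	next    int      // index of the next call to resolve
	calls   int      // how many real external calls were executed
}

// Call returns the recorded result when one exists (replay). Otherwise it
// executes the real call and records its result for future replays.
func (h *History) Call(name string, real func() string) string {
	if h.next < len(h.results) {
		r := h.results[h.next]
		h.next++
		return r
	}
	h.calls++
	r := real()
	h.results = append(h.results, r)
	h.next++
	return r
}

// Workflow is deterministic: its local variables depend only on the input
// and on the results handed back by Call.
func Workflow(h *History, input string) string {
	greeting := h.Call("greet", func() string { return "hello " + input })
	total := len(greeting) // a local variable that is never serialized
	suffix := h.Call("suffix", func() string { return fmt.Sprintf("(%d)", total) })
	return greeting + " " + suffix
}

// Replay re-runs the workflow from the beginning against a copy of the
// recorded results and reports how many real calls were still needed.
func Replay(recorded []string, input string) (string, int) {
	h := &History{results: append([]string(nil), recorded...)}
	out := Workflow(h, input)
	return out, h.calls
}

func main() {
	h := &History{}
	first := Workflow(h, "world") // first execution records two results

	// Simulated recovery: a fresh run fed only with the recorded results.
	replayed, realCalls := Replay(h.results, "world")
	fmt.Println(first == replayed, realCalls)
}
```

Replaying with only part of the history (for example `Replay(h.results[:1], "world")`) resolves the first call from the record and executes only the remaining one, which is roughly how a workflow that crashed midway would continue.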

This presentation contains a step-by-step explanation of the process, starting at 14:55.

Thanks for that, Maxim. That cleared it up.

How is blocking implemented? In Java I assume you can do Object.wait(). How do you do this in Golang?

Golang blocks on a channel read.
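As a rough illustration of what blocking on a channel read looks like, here is a small standalone sketch. It is not the SDK's actual implementation; `Future`, `NewFuture`, and `RunWorkflow` are hypothetical names. The workflow goroutine parks on a receive until some other part of the program delivers the result.

```go
package main

import "fmt"

// Future is a hypothetical one-shot result holder backed by a channel.
type Future struct{ ch chan string }

// NewFuture creates a Future with room for a single result.
func NewFuture() *Future { return &Future{ch: make(chan string, 1)} }

// Get blocks the calling goroutine on a channel read until Complete is called.
func (f *Future) Get() string { return <-f.ch }

// Complete delivers the result, which unblocks a pending Get.
func (f *Future) Complete(v string) { f.ch <- v }

// RunWorkflow runs the workflow code in its own goroutine and returns a
// channel that yields the final result once the workflow finishes.
func RunWorkflow(f *Future) <-chan string {
	done := make(chan string, 1)
	go func() {
		result := f.Get() // the goroutine is parked here until a result arrives
		done <- "workflow saw: " + result
	}()
	return done
}

func main() {
	f := NewFuture()
	done := RunWorkflow(f)

	// Something outside the workflow (here, main) supplies the result later.
	f.Complete("activity result")
	fmt.Println(<-done)
}
```

A blocked goroutine costs only its stack, so parking workflow code this way is cheap compared with holding an OS thread.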

Sorry to resurrect an old thread. Is it possible to point me to some places in the code where event sourcing is implemented? I am very curious about the timing of everything and how the source of truth is generated and updated. Take the current workflow state, for example. Is the order:

  • external API call
  • event published to queue
  • a separate process handles the event, which 1) appends a record (a row used for replay) and 2) updates a separate row (the current application state)?

I would recommend watching this Data@Scale 2017 talk that explains the recovery mechanism. It would be hard to point to a specific place in code as the whole platform works in unison to deliver the seamless recovery experience.

FWIW, here is a more recent talk that explains the recovery mechanism using Temporal terminology (instead of Cadence): https://youtu.be/6T6zVZHU7_Q?si=UZE--6qAkLs2Smng&t=1558