Async child workflow is being called twice

Nitesh_Agarwal · September 23, 2020, 11:47am

I am using demoAsyncChildRun in the cadence java examples. The method gets called twice.
There is a case in which child promise waits indefinitely, and if you stop execution and try creating a new workflow with a new child workflow, the recent changes do not show up. Please find the steps below:
1. Run the HelloChild.java main
2. Thread.sleep for 10 seconds before executing childPromise.get() or use a breakpoint.
3. If the thread waits on childPromise.get() for a long time, stop program execution
4. Update the return value of the demoAsyncChildRun method, e.g. to “Hello R”.
5. Run the HelloChild.java main again.
6. The older value is still returned.
Also, ‘java.lang.IllegalArgumentException’ is always thrown when creating child workflow stub:
Method threw ‘java.lang.IllegalArgumentException’ exception. Cannot evaluate com.sun.proxy.$Proxy162.toString()

On executing the HelloChild again, a child workflow is created with the same workflow id; if we do not provide options with a new workflow id, it fails.

 @Override
 public String getGreeting(String name) {
   return demoAsyncChildRun(name);
 }

 // This example shows how a parent workflow can return right after starting a child workflow
 // and let the child run on its own.
 private String demoAsyncChildRun(String name) {
   GreetingChild child = Workflow.newChildWorkflowStub(GreetingChild.class);
   // Non-blocking call that initiates the child workflow.
   Async.function(child::composeGreeting, "Hello", name);
   // Instead of using greeting.get() to block until the child completes,
   // sometimes we just want to return from the parent immediately and keep the child running.
   Promise<WorkflowExecution> childPromise = Workflow.getWorkflowExecution(child);
   childPromise.get(); // Block until the child has started;
   // otherwise the child may not start because the parent completes first.
   return "Hello Q";
 }

maxim · September 23, 2020, 3:48pm

I am using demoAsyncChildRun in the cadence java examples. The method gets called twice.

Cadence and Temporal replay workflow code on recovery. So it is OK to see the workflow body code executed multiple times when watching it in a debugger or through print statements. Use Workflow.getLogger to get a logger that dedupes log statements.
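To illustrate why print statements appear twice while a replay-aware logger does not, here is a toy model in plain Java. It does not use the Cadence or Temporal SDK; the class, the method names and the `replaying` flag are hypothetical stand-ins, and it assumes that a deduping logger simply suppresses output while the workflow code is being replayed.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of replay; this is NOT the Cadence or Temporal SDK.
final class ReplayLoggingSketch {
    // Lines a plain logger (or a print statement) would emit.
    static final List<String> plainLog = new ArrayList<>();
    // Lines a replay-aware logger would emit.
    static final List<String> dedupedLog = new ArrayList<>();

    // Hypothetical workflow body; `replaying` stands in for the SDK's replay flag.
    static String workflowBody(String name, boolean replaying) {
        String line = "starting child workflow for " + name;
        plainLog.add(line);           // emitted on every execution of the body
        if (!replaying) {
            dedupedLog.add(line);     // emitted only on the first execution
        }
        return "Hello " + name;
    }

    // First execution followed by one replay (for example after a workflow task timeout).
    static void runWithOneReplay() {
        plainLog.clear();
        dedupedLog.clear();
        workflowBody("World", false);
        workflowBody("World", true);
    }
}
```

After `runWithOneReplay()`, the plain log holds the same line twice, while the deduped log holds it once, which matches what you see with print statements versus a logger obtained from Workflow.getLogger.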

2…

The workflow task timeout defaults to 10 seconds. So if you block workflow code execution through a sleep (in real code you should always use Workflow.sleep to avoid such issues) or a breakpoint, the workflow task times out and is retried. My guess is that the workflow completes in a different thread by the time you change its code.

The best way to validate this is to look at the workflow execution history through UI or CLI.

I’m not sure what you are trying to achieve here. Is it just for learning how the SDK works?

Also, ‘java.lang.IllegalArgumentException’ is always thrown when creating child workflow stub:
Method threw ‘java.lang.IllegalArgumentException’ exception. Cannot evaluate com.sun.proxy.$Proxy162.toString()

I’ve seen this when evaluating the child stub in a debugger as it doesn’t support toString out of the box. But I don’t understand what you mean by “is always thrown”. Do you have a reproduction?

On executing the HelloChild again, a child workflow is created with the same workflow id; if we do not provide options with a new workflow id, it fails.

Can you reproduce this problem using Temporal? Here are the Temporal Java SDK samples.

Nitesh_Agarwal · September 24, 2020, 7:47am

For the second point: I understood the part where the workflow task times out and the workflow completes in a different thread. But when I stop the execution and try again with another return string in the demoAsyncChildRun method, the old value is returned. The same thing happens for all subsequent runs without a sleep or breakpoint.
Yes, you can observe that in all executions of the HelloChild example.
I will try to check with Temporal and respond here.

Nitesh_Agarwal · September 24, 2020, 10:38am

No issues with the Temporal examples. Thank you. I see you have mentioned that Temporal is a fork of Cadence. Are there some fixes in Temporal that are not in Cadence? If yes, are they major, and will Temporal completely diverge from Cadence?

maxim · September 24, 2020, 3:44pm

Yes, there are major fixes in Temporal that are not in Cadence. The team behind it spent almost a year on the improvements.

Here is a very incomplete list of the differences:

temporal.io is the fork of the Cadence project by the original founders and tech leads of the Cadence project Maxim Fateev and Samar Abbas. We started Temporal Technologies and received VC funding as we believe that the programming model that we pioneered through AWS Simple Workflow, Durable Task Framework and the Cadence project has potential which goes far beyond a single company. Having a commercial entity to drive the project forward is essential for the longevity of the project.

The temporal.io fork has all the features of Cadence as it constantly merges from it. It also implemented multiple new features.

Here are some of the technical differences between Cadence and Temporal as of the initial release of the Temporal fork (expected to reach production status in 05/2020):

All thrift structures are replaced by protobuf ones

All public APIs of Cadence rely on Thrift. Thrift objects are also stored in the DB in serialized form.

Temporal converted all these structures to Protocol Buffers. This includes objects stored in the DB.

Communication protocol switched from TChannel to gRPC

Cadence relies on TChannel, a TCP-based multiplexing protocol developed at Uber. TChannel has a lot of limitations, like not supporting any security and having a very limited number of language bindings. It is essentially deprecated even at Uber.

Temporal uses gRPC for all interprocess communication.

TLS Support

Cadence doesn’t support any communication security as it is a limitation of TChannel.

Temporal has support for mutual TLS and is going to support more advanced authentication and authorization features in the future.

Simplified configuration

Temporal has reworked the service configuration. Some of the most confusing parts of it are removed. For example, the need to configure membership seeds is eliminated. In Temporal, each host registers itself with the database upon startup and uses the list from the database as the seed list.

Release pipelines

Cadence doesn’t test any publicly released artifacts including docker images as its internal release pipeline is ensuring the quality of the internally built artifacts only. It also doesn’t perform any release testing for dependencies that are not used within Uber. For example, MySQL integration is not tested beyond rather incomplete unit tests. The same applies to the CLI and other components.

Temporal is making a heavy investment in the release process. All the artifacts, including a full supported matrix of dependencies, are going to be subjected to a full release pipeline, which is going to include multi-day stress runs.

The other important part of the release process is the ability to generate patches for production issues. The ability to ensure the quality of such patches and produce all the necessary artifacts in a timely manner is important for anyone running Temporal in production.

Payload Metadata

Cadence stores activity inputs and outputs and other payloads as binary blobs without any associated metadata.

Temporal allows associating metadata with every payload. It enables features like dynamically pluggable serialization mechanisms, seamless compression, and encryption.
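As a rough illustration of how per-payload metadata makes pluggable serialization and compression possible, here is a minimal sketch in plain Java. It is not the Temporal SDK: the `Payload` class, the `encoding` key and its values are assumptions made up for this example. The point is only that the decoder picks its strategy from the metadata instead of guessing from raw bytes.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Toy model of a payload with metadata; NOT the Temporal SDK.
final class PayloadSketch {
    static final class Payload {
        final Map<String, String> metadata; // hypothetical metadata keys, e.g. "encoding"
        final byte[] data;
        Payload(Map<String, String> metadata, byte[] data) {
            this.metadata = metadata;
            this.data = data;
        }
    }

    // Encodes a string, optionally compressing it, and records how in the metadata.
    static Payload encode(String value, boolean compress) {
        byte[] raw = value.getBytes(StandardCharsets.UTF_8);
        Map<String, String> meta = new HashMap<>();
        if (!compress) {
            meta.put("encoding", "text/plain");
            return new Payload(meta, raw);
        }
        Deflater deflater = new Deflater();
        deflater.setInput(raw);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[256];
        while (!deflater.finished()) {
            out.write(buf, 0, deflater.deflate(buf));
        }
        deflater.end();
        meta.put("encoding", "text/plain+deflate");
        return new Payload(meta, out.toByteArray());
    }

    // Chooses the decoding strategy from the metadata alone.
    static String decode(Payload p) {
        String encoding = p.metadata.get("encoding");
        if ("text/plain".equals(encoding)) {
            return new String(p.data, StandardCharsets.UTF_8);
        }
        if (!"text/plain+deflate".equals(encoding)) {
            throw new IllegalArgumentException("unknown encoding: " + encoding);
        }
        Inflater inflater = new Inflater();
        inflater.setInput(p.data);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[256];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(buf);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new IllegalArgumentException("truncated or unsupported payload");
                }
                out.write(buf, 0, n);
            }
        } catch (DataFormatException e) {
            throw new IllegalArgumentException("corrupt payload", e);
        } finally {
            inflater.end();
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }
}
```

With bare binary blobs, as described for Cadence above, the reader would have no way to tell the compressed form from the plain one without out-of-band knowledge.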

Failure Propagation

In Cadence, activity and workflow failures are modeled as a single binary payload and a string reason field. Only the Java client supports chaining exceptions across workflow and activity boundaries. But this chaining relies on fragile GSON serialization and doesn’t work with other languages.

Temporal activity and workflow failures are modeled as protobufs and can be chained across components implemented in different SDKs. For example, a single failure trace can contain a chain that is caused by an exception that originates in an activity written in Python, is propagated through a Go child workflow up to a Java workflow, and later reaches the client.
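The chaining described above can be pictured with a small language-neutral structure. The sketch below is plain Java, not the Temporal protobuf definition; the `Failure` class, its fields and the source labels are hypothetical and chosen only to mirror the Python activity, Go child workflow and Java workflow example.

```java
// Toy model of a language-neutral failure chain; NOT the Temporal protobuf type.
final class FailureChainSketch {
    static final class Failure {
        final String message;
        final String source;  // hypothetical label for the SDK that produced the failure
        final Failure cause;  // null at the root of the chain
        Failure(String message, String source, Failure cause) {
            this.message = message;
            this.source = source;
            this.cause = cause;
        }
    }

    // Builds the chain from the example: Python activity -> Go child workflow -> Java workflow.
    static Failure example() {
        Failure activity = new Failure("division by zero", "PythonSDK", null);
        Failure child = new Failure("activity failed", "GoSDK", activity);
        return new Failure("child workflow failed", "JavaSDK", child);
    }

    // Renders the chain from the outermost failure down to the root cause.
    static String describe(Failure failure) {
        StringBuilder sb = new StringBuilder();
        for (Failure cur = failure; cur != null; cur = cur.cause) {
            if (sb.length() > 0) {
                sb.append(" <- ");
            }
            sb.append(cur.source).append(": ").append(cur.message);
        }
        return sb.toString();
    }
}
```

Because each link is plain data rather than a serialized Java exception, any SDK can append to the chain or read it, which is the property the GSON-based approach lacks.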

Go SDK

Temporal implemented the following improvements over Cadence Go client:

Protobuf & gRPC
No global registration of activity and workflow types
Ability to register activity structure instance with the worker. It greatly simplifies passing external dependencies to the activities.
Workflow and activity interceptors which allow implementing features like configuring timeouts through external config files.
Activity and workflow type names do not include package names. This makes code refactoring without breaking changes much simpler.
Most of the timeouts which were required by Cadence are optional now.

Java SDK

Temporal implemented the following improvements over Cadence Java client:

Workflow and activity annotations that allow activity and workflow implementation objects to also implement interfaces that are neither workflow nor activity interfaces. This is important to play nice with AOP frameworks like Spring.
Polymorphic workflow and activity interfaces. This allows having a common interface among multiple activity and workflow types.
Dynamic registration of signal and query handlers.
Workflow and activity interceptors which allow implementing features like configuring timeouts through external config files.
Activity and workflow type name generation improved
