Hi there, I have an asynchronous Python worker up and running: it calls a very simple PyTorch Python script and runs it.
However, I’ve also written a synchronous worker. I eventually got the synchronous worker working by bypassing the sandbox validation on the import of my Python script, which depends on PyTorch.
While setting up the synchronous worker, I noticed two things:
- It seems that in the case of a synchronous worker, where you provide “activity_executor” (AKA runner), the sandbox validation step is triggered. Is it intended that this validation only occurs for synchronous workers that provide an activity_executor? I was a bit surprised that I didn’t run into this issue in either of the asynchronous workers I created (which import the same Python script with a dependency on PyTorch).
- The sandbox validation code seems to fail on circular dependencies when it encounters the same instance of a docstring. I presume the validation code just needs to deal with cycles in dependencies, or is this “as designed”?
I’m not certain why this error is happening (possibly the result of some kind of circular dependency in PyTorch), but, as I say, I’m also not certain why it only crops up in the synchronous worker. It appears to be due to the following validation step:
class _WorkflowWorker:
    def __init__(
        self,
        [SNIP]
    ) -> None:
        [SNIP]
        # Prepare the workflow with the runner (this will error in the
        # sandbox if an import fails somehow)
        try:
            if defn.sandboxed:
                workflow_runner.prepare_workflow(defn)
            else:
                unsandboxed_workflow_runner.prepare_workflow(defn)
        except Exception as err:
            raise RuntimeError(f"Failed validating workflow {defn.name}") from err
        self._workflows[defn.name] = defn
As I say, I’m able to execute the same Python script from within my asynchronous Temporal worker; the difference appears to be that the synchronous worker needs to provide an “activity_executor” (AKA runner). I did try to reproduce the same error in my asynchronous worker by unnecessarily providing an activity_executor, but I wasn’t able to. The call stack that points to the above code is here:
Traceback (most recent call last):
File "/workspaces/go-run-ml/python/run_ml_worker_manager2.py", line 126, in <module>
asyncio.run(main2())
File "/usr/lib/python3.9/asyncio/runners.py", line 44, in run
return loop.run_until_complete(main)
File "/usr/lib/python3.9/asyncio/base_events.py", line 642, in run_until_complete
return future.result()
File "/workspaces/go-run-ml/python/run_ml_worker_manager2.py", line 98, in main2
worker = Worker(
File "/usr/local/lib/python3.9/dist-packages/temporalio/worker/_worker.py", line 263, in __init__
self._workflow_worker = _WorkflowWorker(
File "/usr/local/lib/python3.9/dist-packages/temporalio/worker/_workflow.py", line 112, in __init__
raise RuntimeError(f"Failed validating workflow {defn.name}") from err
RuntimeError: Failed validating workflow MachineLearningWorkflow
The above validation call stack (I presume the code is traversing all imports to perform its validation) results in the following error call stack:
Traceback (most recent call last):
File "/usr/local/lib/python3.9/dist-packages/temporalio/worker/_workflow.py", line 108, in __init__
workflow_runner.prepare_workflow(defn)
File "/usr/local/lib/python3.9/dist-packages/temporalio/worker/workflow_sandbox/_runner.py", line 53, in prepare_workflow
self.create_instance(
File "/usr/local/lib/python3.9/dist-packages/temporalio/worker/workflow_sandbox/_runner.py", line 87, in create_instance
return _Instance(det, self._runner_class, self._restrictions)
File "/usr/local/lib/python3.9/dist-packages/temporalio/worker/workflow_sandbox/_runner.py", line 107, in __init__
self._create_instance()
File "/usr/local/lib/python3.9/dist-packages/temporalio/worker/workflow_sandbox/_runner.py", line 118, in _create_instance
self._run_code(
File "/usr/local/lib/python3.9/dist-packages/temporalio/worker/workflow_sandbox/_runner.py", line 160, in _run_code
exec(code, self.globals_and_locals, self.globals_and_locals)
File "<string>", line 2, in <module>
File "/usr/local/lib/python3.9/dist-packages/temporalio/worker/workflow_sandbox/_importer.py", line 441, in __call__
return self.current(*args, **kwargs)
File "/usr/local/lib/python3.9/dist-packages/temporalio/worker/workflow_sandbox/_importer.py", line 232, in _import
new_spec.loader.exec_module(new_mod)
File "<frozen importlib._bootstrap_external>", line 790, in exec_module
File "<frozen importlib._bootstrap>", line 228, in _call_with_frames_removed
File "/workspaces/go-run-ml/python/run_ml_worker_manager2.py", line 22, in <module>
from ml_activity import MachineLearningActivity
File "/usr/local/lib/python3.9/dist-packages/temporalio/worker/workflow_sandbox/_importer.py", line 441, in __call__
return self.current(*args, **kwargs)
File "/usr/local/lib/python3.9/dist-packages/temporalio/worker/workflow_sandbox/_importer.py", line 234, in _import
mod = importlib.__import__(name, globals, locals, fromlist, level)
File "<frozen importlib._bootstrap>", line 1109, in __import__
File "<frozen importlib._bootstrap>", line 1030, in _gcd_import
File "<frozen importlib._bootstrap>", line 1007, in _find_and_load
File "<frozen importlib._bootstrap>", line 986, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 680, in _load_unlocked
File "<frozen importlib._bootstrap_external>", line 790, in exec_module
File "<frozen importlib._bootstrap>", line 228, in _call_with_frames_removed
File "/workspaces/go-run-ml/python/ml_activity.py", line 4, in <module>
import ml.pytorch.char_rnn2.CharRnnTrain as CharRnnTrain
File "/usr/local/lib/python3.9/dist-packages/temporalio/worker/workflow_sandbox/_importer.py", line 441, in __call__
return self.current(*args, **kwargs)
File "/usr/local/lib/python3.9/dist-packages/temporalio/worker/workflow_sandbox/_importer.py", line 234, in _import
mod = importlib.__import__(name, globals, locals, fromlist, level)
File "<frozen importlib._bootstrap>", line 1109, in __import__
File "<frozen importlib._bootstrap>", line 1030, in _gcd_import
File "<frozen importlib._bootstrap>", line 1007, in _find_and_load
File "<frozen importlib._bootstrap>", line 986, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 680, in _load_unlocked
File "<frozen importlib._bootstrap_external>", line 790, in exec_module
File "<frozen importlib._bootstrap>", line 228, in _call_with_frames_removed
File "/workspaces/go-run-ml/python/ml/pytorch/char_rnn2/CharRnnTrain.py", line 4, in <module>
import torch
File "/usr/local/lib/python3.9/dist-packages/temporalio/worker/workflow_sandbox/_importer.py", line 441, in __call__
return self.current(*args, **kwargs)
File "/usr/local/lib/python3.9/dist-packages/temporalio/worker/workflow_sandbox/_importer.py", line 234, in _import
mod = importlib.__import__(name, globals, locals, fromlist, level)
File "<frozen importlib._bootstrap>", line 1109, in __import__
File "<frozen importlib._bootstrap>", line 1030, in _gcd_import
File "<frozen importlib._bootstrap>", line 1007, in _find_and_load
File "<frozen importlib._bootstrap>", line 986, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 680, in _load_unlocked
File "<frozen importlib._bootstrap_external>", line 790, in exec_module
File "<frozen importlib._bootstrap>", line 228, in _call_with_frames_removed
File "/usr/local/lib/python3.9/dist-packages/torch/__init__.py", line 675, in <module>
from ._tensor import Tensor
File "/usr/local/lib/python3.9/dist-packages/temporalio/worker/workflow_sandbox/_importer.py", line 441, in __call__
return self.current(*args, **kwargs)
File "/usr/local/lib/python3.9/dist-packages/temporalio/worker/workflow_sandbox/_importer.py", line 234, in _import
mod = importlib.__import__(name, globals, locals, fromlist, level)
File "<frozen importlib._bootstrap>", line 1113, in __import__
File "<frozen importlib._bootstrap>", line 1030, in _gcd_import
File "<frozen importlib._bootstrap>", line 1007, in _find_and_load
File "<frozen importlib._bootstrap>", line 986, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 680, in _load_unlocked
File "<frozen importlib._bootstrap_external>", line 790, in exec_module
File "<frozen importlib._bootstrap>", line 228, in _call_with_frames_removed
File "/usr/local/lib/python3.9/dist-packages/torch/_tensor.py", line 21, in <module>
from torch.overrides import (
File "/usr/local/lib/python3.9/dist-packages/temporalio/worker/workflow_sandbox/_importer.py", line 441, in __call__
return self.current(*args, **kwargs)
File "/usr/local/lib/python3.9/dist-packages/temporalio/worker/workflow_sandbox/_importer.py", line 234, in _import
mod = importlib.__import__(name, globals, locals, fromlist, level)
File "<frozen importlib._bootstrap>", line 1109, in __import__
File "<frozen importlib._bootstrap>", line 1030, in _gcd_import
File "<frozen importlib._bootstrap>", line 1007, in _find_and_load
File "<frozen importlib._bootstrap>", line 986, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 680, in _load_unlocked
File "<frozen importlib._bootstrap_external>", line 790, in exec_module
File "<frozen importlib._bootstrap>", line 228, in _call_with_frames_removed
File "/usr/local/lib/python3.9/dist-packages/torch/overrides.py", line 1548, in <module>
has_torch_function = _add_docstr(
RuntimeError: function '_has_torch_function' already has a docstring
I found a way to manage this when I read through the documentation here:
I initially tried dealing with this by using this argument to the Worker:
workflow_runner=SandboxedWorkflowRunner(
    restrictions=SandboxRestrictions.default.with_passthrough_modules("torch")
),
But this causes the error:
Cannot access pathlib.Path.mkdir.call from inside a workflow. If this is code from a module not used in a workflow or known to only be used deterministically from a workflow, mark the import as pass through.
So I just imported my Python script that has a dependency on PyTorch like this:
with workflow.unsafe.imports_passed_through():
    import ml.pytorch.char_rnn2.CharRnnTrain as CharRnnTrain
and it seems to be working now.
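To make the "pass through" idea concrete, here is a toy sketch. It assumes that passing a module through means the sandbox hands back the module object already loaded in the host process instead of executing its body again. It uses only the standard library; `toy_sandbox_import`, `HOST_MODULES`, `load_fresh`, and `heavy_lib` are hypothetical names, and this is not how the Temporal SDK is actually implemented.

```python
import types

EXECUTIONS = []  # one entry for every time the module body actually runs

# Hypothetical module body: records that it ran, then defines a value.
HEAVY_LIB_SOURCE = "EXECUTIONS.append(__name__)\nVALUE = 42\n"


def load_fresh(name):
    """Execute the module body in a brand-new module object."""
    module = types.ModuleType(name)
    module.__dict__["EXECUTIONS"] = EXECUTIONS
    exec(HEAVY_LIB_SOURCE, module.__dict__)
    return module


# Modules as the host process sees them (imported once at startup).
HOST_MODULES = {"heavy_lib": load_fresh("heavy_lib")}


def toy_sandbox_import(name, passthrough):
    """Return the host's module if passed through, otherwise re-execute it."""
    if name in passthrough:
        return HOST_MODULES[name]
    return load_fresh(name)


# Without pass-through the body runs again and a distinct module comes back.
reimported = toy_sandbox_import("heavy_lib", passthrough=frozenset())

# With pass-through the existing module is reused and the body does not rerun.
shared = toy_sandbox_import("heavy_lib", passthrough=frozenset({"heavy_lib"}))

print(len(EXECUTIONS))                          # 2
print(shared is HOST_MODULES["heavy_lib"])      # True
print(reimported is HOST_MODULES["heavy_lib"])  # False
```

In the real SDK the switch is the `with workflow.unsafe.imports_passed_through():` block around the import. Because that block covers everything imported while it is active, it would also explain why wrapping my own script worked where listing only "torch" did not.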