ImportError: No module named tensorflow_transform.beam
When submitting a Dataflow job to GCP, I get this error:
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/dataflow_worker/batchworker.py", line 766, in run
    self._load_main_session(self.local_staging_directory)
  File "/usr/local/lib/python2.7/dist-packages/dataflow_worker/batchworker.py", line 482, in _load_main_session
    pickler.load_session(session_file)
  File "/usr/local/lib/python2.7/dist-packages/apache_beam/internal/pickler.py", line 266, in load_session
    return dill.load_session(file_path)
  File "/usr/local/lib/python2.7/dist-packages/dill/_dill.py", line 402, in load_session
    module = unpickler.load()
  File "/usr/lib/python2.7/pickle.py", line 864, in load
    dispatch[key](self)
  File "/usr/lib/python2.7/pickle.py", line 1139, in load_reduce
    value = func(*args)
  File "/usr/local/lib/python2.7/dist-packages/dill/_dill.py", line 818, in _import_module
    return __import__(import_name)
ImportError: No module named tensorflow_transform
My assumption was that requirements such as tensorflow-transform and apache-beam are pre-installed, and this used to work a few months ago.
Here is the solution, posted for anyone facing the same issue.
You need a setup.py file in the same directory as the file you are running (assuming that file contains all the Beam steps), so that the Dataflow workers install the dependencies it lists:
    import setuptools

    setuptools.setup(
        name='whatever-name',
        version='0.0.1',
        # Packages to install on the workers; pin the versions you use locally.
        install_requires=[
            'apache-beam==2.10.0',
            'tensorflow-transform==0.12.0',
        ],
        packages=setuptools.find_packages(),
    )
In the Python file I had:
options = PipelineOptions()
which had to be changed to:
options = PipelineOptions(setup_file="./setup.py")
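One caveat with "./setup.py": the path is relative to the directory you launch the job from, not to the pipeline file. A minimal sketch of one way to avoid that, assuming setup.py sits next to the pipeline script as described above. The helper name resolve_setup_file is hypothetical, not part of Beam, and it uses only the standard library:

```python
import os


def resolve_setup_file(script_path, filename="setup.py"):
    """Return the absolute path of `filename` located next to `script_path`.

    Raises IOError if the file is missing, so the problem shows up
    before the job is submitted rather than on the workers.
    (`resolve_setup_file` is a hypothetical helper, not a Beam API.)
    """
    script_dir = os.path.dirname(os.path.abspath(script_path))
    candidate = os.path.join(script_dir, filename)
    if not os.path.isfile(candidate):
        raise IOError("expected a setup file at " + candidate)
    return candidate
```

In the pipeline file this could then be used as options = PipelineOptions(setup_file=resolve_setup_file(__file__)), which should behave the same as the line above but no longer depend on the current working directory.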