Python apache beam ImportError: No module named *** on dataflow worker
Summary: some local packages work, and some do not.
Structure of my Beam application:
    -setup.py
    -app/__init__.py
    -app/main.py
    -package1/__init__.py
    -package1/one.py
    -package2/__init__.py
    -package2/two.py
    -package3/__init__.py
    -package3/three.py
In main.py:
    from package1 import one
    from package2 import two
    from package3 import three
In setup.py:
    import setuptools

    setuptools.setup(
        name='beam',
        version='1.0',
        install_requires=[
            'apache-beam[gcp]',
            'google-cloud==0.34.0',
            'google-cloud-bigquery==0.25.0',
            'requests==2.19.1',
            'google-cloud-storage==1.12.0',
        ],
        packages=setuptools.find_packages(),
    )
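Since setup.py relies on `setuptools.find_packages()`, one quick sanity check is to list which top-level directories contain an `__init__.py`, which is roughly what that call looks for. The sketch below uses only the standard library and rebuilds the layout from the question in a temporary directory; the helper names `discover_top_level_packages` and `build_demo_layout` are hypothetical, and this is an approximation of the discovery rule, not setuptools itself.

```python
import os
import tempfile


def discover_top_level_packages(root):
    """Return sorted names of directories under root that contain __init__.py."""
    found = []
    for name in sorted(os.listdir(root)):
        path = os.path.join(root, name)
        has_init = os.path.isfile(os.path.join(path, "__init__.py"))
        if os.path.isdir(path) and has_init:
            found.append(name)
    return found


def build_demo_layout(root):
    """Recreate the directory layout described in the question (empty files)."""
    open(os.path.join(root, "setup.py"), "w").close()
    for package in ("app", "package1", "package2", "package3"):
        os.makedirs(os.path.join(root, package))
        open(os.path.join(root, package, "__init__.py"), "w").close()


with tempfile.TemporaryDirectory() as demo_root:
    build_demo_layout(demo_root)
    # Plain files such as setup.py are ignored; only package directories count.
    discovered = discover_top_level_packages(demo_root)
    print(discovered)
```

Running the same helper against the real project root would show whether `package3` is picked up like the others; if it is, the cause is likely elsewhere, for example in how the package is staged to the workers.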
At run time, with the DirectRunner (running locally), there is no problem.
With the DataflowRunner (submitting to Google Dataflow),
I get this error:
    apache_beam.runners.dataflow.dataflow_runner.DataflowRuntimeException: Dataflow pipeline failed. State: FAILED, Error:
    Traceback (most recent call last):
      File "/usr/local/lib/python2.7/dist-packages/dataflow_worker/batchworker.py", line 642, in do_work
        work_executor.execute()
      File "/usr/local/lib/python2.7/dist-packages/dataflow_worker/executor.py", line 156, in execute
        op.start()
      File "apache_beam/runners/worker/operations.py", line 344, in apache_beam.runners.worker.operations.DoOperation.start
        def start(self):
      File "apache_beam/runners/worker/operations.py", line 345, in apache_beam.runners.worker.operations.DoOperation.start
        with self.scoped_start_state:
      File "apache_beam/runners/worker/operations.py", line 350, in apache_beam.runners.worker.operations.DoOperation.start
        pickler.loads(self.spec.serialized_fn))
      File "/usr/local/lib/python2.7/dist-packages/apache_beam/internal/pickler.py", line 244, in loads
        return dill.loads(s)
      File "/usr/local/lib/python2.7/dist-packages/dill/_dill.py", line 316, in loads
        return load(file, ignore)
      File "/usr/local/lib/python2.7/dist-packages/dill/_dill.py", line 304, in load
        obj = pik.load()
      File "/usr/lib/python2.7/pickle.py", line 864, in load
        dispatch[key](self)
      File "/usr/lib/python2.7/pickle.py", line 1096, in load_global
        klass = self.find_class(module, name)
      File "/usr/local/lib/python2.7/dist-packages/dill/_dill.py", line 465, in find_class
        return StockUnpickler.find_class(self, module, name)
      File "/usr/lib/python2.7/pickle.py", line 1130, in find_class
        __import__(module)
    ImportError: No module named three
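The traceback shows the failure happening while the worker unpickles the serialized function: `dill.loads` ends in `find_class`, which imports the module by name. Here is a minimal sketch of that mechanism using only the standard-library `pickle` on Python 3. It is an illustration, not the Dataflow worker code; the module name `three_demo` and the function `transform` are hypothetical stand-ins for `package3/three.py`.

```python
import importlib
import os
import pickle
import sys
import tempfile


def pickle_then_lose_module():
    """Pickle a function from a temporary module, then unpickle without it.

    Returns the ImportError message, or None if unpickling succeeded.
    """
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical stand-in for a local module that exists on the launcher.
        with open(os.path.join(tmp, "three_demo.py"), "w") as handle:
            handle.write("def transform(x):\n    return x + 1\n")

        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        module = importlib.import_module("three_demo")

        # Functions are pickled by reference: only module and name are stored.
        payload = pickle.dumps(module.transform)

        # Simulate a worker on which the module was never installed.
        sys.path.remove(tmp)
        del sys.modules["three_demo"]
        importlib.invalidate_caches()

    try:
        pickle.loads(payload)
    except ImportError as error:  # ModuleNotFoundError on Python 3
        return str(error)
    return None


message = pickle_then_lose_module()
print(message)
```

The point of the sketch is that the pickled payload carries only a reference to the module, so the module must be importable wherever the payload is loaded. A pipeline can therefore run locally and still fail on a worker that never received the package.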
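This is "kind of" frustrating, because I double/triple/... checked what the difference between these packages might be, and they are identical. Sane
Does anyone have a solution to this problem?
Thanks.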
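It has been almost a year, but I had a very similar problem and was able to solve it, so I am posting this for others who stumble onto this page.
In my case,
Although I do not fully understand the root cause, running the file by calling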