How to install Python dependencies for Dataflow
I have a very small Python Dataflow package. Its structure looks like this:
.
├── __pycache__
├── pubsubtobigq.py
├── requirements.txt
└── venv
The contents of requirements.txt are:
protobuf==3.11.2
protobuf3-to-dict==0.1.5
I run my pipeline with this command:
python -m pubsubtobigq \
--input_topic "projects/project_name/topics/topic_name" \
--job_name "job_name" \
--output "gs://mybucket/wordcount/outputs" \
--runner DataflowRunner \
--project "project_name" \
--region "us-central1" \
--temp_location "gs://mybucket/tmp/" \
--staging_location "gs://mybucket/staging" \
--requirements_file requirements.txt \
--streaming True
The code that uses this library looks like this:
from protobuf_to_dict import protobuf_to_dict

def parse_proto(message):
    dictionary = protobuf_to_dict(message)
But this line fails, saying that protobuf_to_dict is an unknown symbol. Even when I use Google's built-in method MessageToDict from google.protobuf.json_format, I get the same error.
How can I fix this? I need either one of these libraries to be installed.
Edit
Error message when I use MessageToDict from google.protobuf.json_format:
Error processing instruction -31. Original traceback is Traceback (most recent call last): File
"apache_beam/runners/common.py", line 813, in apache_beam.runners.common.DoFnRunner.process File
"apache_beam/runners/common.py", line 449, in
apache_beam.runners.common.SimpleInvoker.invoke_process File "/Users/username/repos/
dataflow-pipeline/venv/lib/python3.7/site-packages/apache_beam/transforms/core.py", line 1415, in
wrapper = lambda x: [fn(x)] File "/Users/username/repos/dataflow-pipeline/pubsubtobigq.py", line
16, in parse_proto NameError: name 'MessageToDict' is not defined During handling of the above
exception, another exception occurred: Traceback (most recent call last): File "/usr/local/lib/
python3.7/site-packages/apache_beam/runners/worker/sdk_worker.py", line 143, in _execute response
= task() File "/usr/local/lib/python3.7/site-packages/apache_beam/runners/worker/sdk_worker.py",
line 193, in lambda: self.create_worker().do_instruction(request), request) File "/usr/local/lib/
python3.7/site-packages/apache_beam/runners/worker/sdk_worker.py", line 291, in do_instruction
request.instruction_id) File "/usr/local/lib/python3.7/site-packages/apache_beam/runners/worker/
sdk_worker.py", line 317, in process_bundle bundle_processor.process_bundle(instruction_id)) File
"/usr/local/lib/python3.7/site-packages/apache_beam/runners/worker/bundle_processor.py", line 675,
in process_bundle data.transform_id].process_encoded(data.data) File "/usr/local/lib/python3.7/
site-packages/apache_beam/runners/worker/bundle_processor.py", line 146, in process_encoded
self.output(decoded_value) File "apache_beam/runners/worker/operations.py", line 258, in
apache_beam.runners.worker.operations.Operation.output File "apache_beam/runners/worker/
operations.py", line 259, in apache_beam.runners.worker.operations.Operation.output File
"apache_beam/runners/worker/operations.py", line 146, in
apache_beam.runners.worker.operations.SingletonConsumerSet.receive File "apache_beam/runners/
worker/operations.py", line 596, in apache_beam.runners.worker.operations.DoOperation.process File
"apache_beam/runners/worker/operations.py", line 597, in
apache_beam.runners.worker.operations.DoOperation.process File "apache_beam/runners/common.py",
line 809, in apache_beam.runners.common.DoFnRunner.receive File "apache_beam/runners/common.py",
line 815, in apache_beam.runners.common.DoFnRunner.process File "apache_beam/runners/common.py",
line 882, in apache_beam.runners.common.DoFnRunner._reraise_augmented File "/usr/local/lib/
python3.7/site-packages/future/utils/init.py", line 421, in raise_with_traceback raise
exc.with_traceback(traceback) File "apache_beam/runners/common.py", line 813, in
apache_beam.runners.common.DoFnRunner.process File "apache_beam/runners/common.py", line 449, in
apache_beam.runners.common.SimpleInvoker.invoke_process File "/Users/username/repos/
dataflow-pipeline/venv/lib/python3.7/site-packages/apache_beam/transforms/core.py", line 1415, in
wrapper = lambda x: [fn(x)] File
"/Users/username/repos/dataflow-pipeline/pubsubtobigq.py", line 16, in parse_proto NameError:
name 'MessageToDict' is not defined [while running 'generatedPtransform-23']
Error message when I use protobuf_to_dict:
Error processing instruction -32. Original traceback is Traceback (most recent call last): File
"apache_beam/runners/common.py", line 813, in apache_beam.runners.common.DoFnRunner.process File
"apache_beam/runners/common.py", line 449, in
apache_beam.runners.common.SimpleInvoker.invoke_process File "/Users/username/repos/
dataflow-pipeline/venv/lib/python3.7/site-packages/apache_beam/transforms/core.py", line 1415, in
wrapper = lambda x: [fn(x)] File "/Users/username/repos/dataflow-pipeline/pubsubtobigq.py", line
21, in parse_proto NameError: name 'protobuf_to_dict' is not defined During handling of the above
exception, another exception occurred: Traceback (most recent call last): File "/usr/local/lib/
python3.7/site-packages/apache_beam/runners/worker/sdk_worker.py", line 143, in _execute response
= task() File "/usr/local/lib/python3.7/site-packages/apache_beam/runners/worker/sdk_worker.py",
line 193, in lambda: self.create_worker().do_instruction(request), request) File "/usr/local/lib/
python3.7/site-packages/apache_beam/runners/worker/sdk_worker.py", line 291, in do_instruction
request.instruction_id) File "/usr/local/lib/python3.7/site-packages/apache_beam/runners/worker/
sdk_worker.py", line 317, in process_bundle bundle_processor.process_bundle(instruction_id)) File
"/usr/local/lib/python3.7/site-packages/apache_beam/runners/worker/bundle_processor.py", line 675,
in process_bundle data.transform_id].process_encoded(data.data) File "/usr/local/lib/python3.7/
site-packages/apache_beam/runners/worker/bundle_processor.py", line 146, in process_encoded
self.output(decoded_value) File "apache_beam/runners/worker/operations.py", line 258, in
apache_beam.runners.worker.operations.Operation.output File "apache_beam/runners/worker/
operations.py", line 259, in apache_beam.runners.worker.operations.Operation.output File
"apache_beam/runners/worker/operations.py", line 146, in
apache_beam.runners.worker.operations.SingletonConsumerSet.receive File "apache_beam/runners/
worker/operations.py", line 596, in apache_beam.runners.worker.operations.DoOperation.process File
"apache_beam/runners/worker/operations.py", line 597, in
apache_beam.runners.worker.operations.DoOperation.process File "apache_beam/runners/common.py",
line 809, in apache_beam.runners.common.DoFnRunner.receive File "apache_beam/runners/common.py",
line 815, in apache_beam.runners.common.DoFnRunner.process File "apache_beam/runners/common.py",
line 882, in apache_beam.runners.common.DoFnRunner._reraise_augmented File "/usr/local/lib/
python3.7/site-packages/future/utils/init.py", line 421, in raise_with_traceback raise
exc.with_traceback(traceback) File "apache_beam/runners/common.py", line 813, in
apache_beam.runners.common.DoFnRunner.process File "apache_beam/runners/common.py", line 449, in
apache_beam.runners.common.SimpleInvoker.invoke_process File "/Users/username/repos/
dataflow-pipeline/venv/lib/python3.7/site-packages/apache_beam/transforms/core.py", line 1415, in
wrapper = lambda x: [fn(x)] File "/Users/username/repos/dataflow-pipeline/pubsubtobigq.py", line
21, in parse_proto NameError: name 'protobuf_to_dict' is not defined [while running
'generatedPtransform-22']
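Dataflow workers cannot see global dependencies, so names imported at module level in the launching script may be undefined on the worker: https://cloud.google.com/dataflow/docs/resources/faq#how_do_i_handle_nameerrors
As suggested by quimiluzon@, first check whether your job works with DirectRunner. If it does, moving the import inside the function might work: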
def parse_proto(message):
    # Import inside the function so the name is resolved on the Dataflow worker
    from protobuf_to_dict import protobuf_to_dict
    dictionary = protobuf_to_dict(message)
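To see why the function-local import helps, here is a small simulation that needs only the standard library. It is a sketch under assumptions, not how Dataflow actually ships code: `json` stands in for `protobuf_to_dict`, and `run_without_main_globals` is a hypothetical helper that rebuilds a function with nearly empty globals to roughly imitate a worker that received the function body but not the launcher's module-level imports.

```python
import builtins
import json
import types


def parse_with_global_import(payload):
    # Depends on the module-level name `json` existing wherever this runs.
    return json.loads(payload)


def parse_with_local_import(payload):
    # The import executes in whichever process runs the function body.
    import json
    return json.loads(payload)


def run_without_main_globals(fn, arg):
    """Hypothetical helper: call `fn` with globals that hold only builtins,
    loosely imitating a worker without the launcher's main session."""
    bare_globals = {"__builtins__": builtins}
    bare_fn = types.FunctionType(fn.__code__, bare_globals, fn.__name__)
    return bare_fn(arg)


if __name__ == "__main__":
    print(run_without_main_globals(parse_with_local_import, '{"a": 1}'))
    try:
        run_without_main_globals(parse_with_global_import, '{"a": 1}')
    except NameError as err:
        # Same kind of failure as in the tracebacks above.
        print("NameError:", err)
```

The version with the local import succeeds because it does not rely on any name from the launching module, while the version with the module-level import raises a NameError once those globals are missing. The FAQ entry linked above also describes the `--save_main_session` pipeline option as another way to make module-level imports available on workers; whether it fits this pipeline is worth testing separately.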