繁体   English   中英

使用 Apache Beam python 创建谷歌云数据流模板时出现 RuntimeValueProviderError

[英]RuntimeValueProviderError when creating a google cloud dataflow template with Apache Beam python

我无法使用 python 3.7 暂存云数据流模板。 它在apache_beam.error.RuntimeValueProviderError: RuntimeValueProvider(option: input, type: str, default_value: 'gs://dataflow-samples/shakespeare/kinglear.txt') not accessible的一个参数化参数上失败

使用 python 2.7 暂存模板工作正常。

我试过用 3.7 运行数据流作业,它们工作正常。 只有模板暂存被破坏。 数据流模板中仍然不支持 python 3.7 还是 python 3 中的暂存语法发生了变化?

这是管道部分

class WordcountOptions(PipelineOptions):
  @classmethod
  def _add_argparse_args(cls, parser):
    parser.add_value_provider_argument(
      '--input',
      default='gs://dataflow-samples/shakespeare/kinglear.txt',
      help='Path of the file to read from',
      dest="input")

def main(argv=None):
  options = PipelineOptions(flags=argv)
  setup_options = options.view_as(SetupOptions)

  wordcount_options = options.view_as(WordcountOptions)

  with beam.Pipeline(options=setup_options) as p:
    lines = p | 'read' >> ReadFromText(wordcount_options.input)

if __name__ == '__main__':
  main()

这是带有暂存脚本的完整存储库https://github.com/firemuzzy/dataflow-templates-bug-python3

以前有一个类似的问题,但我不确定它是如何相关的,因为它是在 python 2.7 中完成的,但是我的模板在 2.7 中阶段很好,但在 3.7 中失败了

如何在 Python 中创建 Google Cloud Dataflow Wordcount 自定义模板?

**** 堆栈跟踪 ****

Traceback (most recent call last):
  File "run_pipeline.py", line 44, in <module>
    main()
  File "run_pipeline.py", line 41, in main
    lines = p | 'read' >> ReadFromText(wordcount_options.input)
  File "/usr/local/lib/python3.7/site-packages/apache_beam/transforms/ptransform.py", line 906, in __ror__
    return self.transform.__ror__(pvalueish, self.label)
  File "/usr/local/lib/python3.7/site-packages/apache_beam/transforms/ptransform.py", line 515, in __ror__
    result = p.apply(self, pvalueish, label)
  File "/usr/local/lib/python3.7/site-packages/apache_beam/pipeline.py", line 490, in apply
    return self.apply(transform, pvalueish)
  File "/usr/local/lib/python3.7/site-packages/apache_beam/pipeline.py", line 525, in apply
    pvalueish_result = self.runner.apply(transform, pvalueish, self._options)
  File "/usr/local/lib/python3.7/site-packages/apache_beam/runners/runner.py", line 183, in apply
    return m(transform, input, options)
  File "/usr/local/lib/python3.7/site-packages/apache_beam/runners/runner.py", line 189, in apply_PTransform
    return transform.expand(input)
  File "/usr/local/lib/python3.7/site-packages/apache_beam/io/textio.py", line 542, in expand
    return pvalue.pipeline | Read(self._source)
  File "/usr/local/lib/python3.7/site-packages/apache_beam/transforms/ptransform.py", line 515, in __ror__
    result = p.apply(self, pvalueish, label)
  File "/usr/local/lib/python3.7/site-packages/apache_beam/pipeline.py", line 525, in apply
    pvalueish_result = self.runner.apply(transform, pvalueish, self._options)
  File "/usr/local/lib/python3.7/site-packages/apache_beam/runners/runner.py", line 183, in apply
    return m(transform, input, options)
  File "/usr/local/lib/python3.7/site-packages/apache_beam/runners/dataflow/dataflow_runner.py", line 1020, in apply_Read
    return self.apply_PTransform(transform, pbegin, options)
  File "/usr/local/lib/python3.7/site-packages/apache_beam/runners/runner.py", line 189, in apply_PTransform
    return transform.expand(input)
  File "/usr/local/lib/python3.7/site-packages/apache_beam/io/iobase.py", line 863, in expand
    return pbegin | _SDFBoundedSourceWrapper(self.source)
  File "/usr/local/lib/python3.7/site-packages/apache_beam/pvalue.py", line 113, in __or__
    return self.pipeline.apply(ptransform, self)
  File "/usr/local/lib/python3.7/site-packages/apache_beam/pipeline.py", line 525, in apply
    pvalueish_result = self.runner.apply(transform, pvalueish, self._options)
  File "/usr/local/lib/python3.7/site-packages/apache_beam/runners/runner.py", line 183, in apply
    return m(transform, input, options)
  File "/usr/local/lib/python3.7/site-packages/apache_beam/runners/runner.py", line 189, in apply_PTransform
    return transform.expand(input)
  File "/usr/local/lib/python3.7/site-packages/apache_beam/io/iobase.py", line 1543, in expand
    | core.ParDo(self._create_sdf_bounded_source_dofn()))
  File "/usr/local/lib/python3.7/site-packages/apache_beam/io/iobase.py", line 1517, in _create_sdf_bounded_source_dofn
    estimated_size = source.estimate_size()
  File "/usr/local/lib/python3.7/site-packages/apache_beam/options/value_provider.py", line 136, in _f
    raise error.RuntimeValueProviderError('%s not accessible' % obj)
apache_beam.error.RuntimeValueProviderError: RuntimeValueProvider(option: input, type: str, default_value: 'gs://dataflow-samples/shakespeare/kinglear.txt') not accessible

不幸的是,Apache Beam 的 Python SDK 2.18.0 上的模板似乎已损坏。

目前,解决方案是避免使用 Beam 2.18.0,因此在您的需求/依赖项中,定义apache-beam[gcp]<2.18.0apache-beam[gcp]>2.18.0

暂无
暂无

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM