
Google Cloud Dataflow (Python SDK): Workflow failed | Each time the worker process eventually lost contact with the service

I have built a workflow that pulls data from Google Cloud Storage, performs transformations in a ParDo, and dumps the output to BigQuery.

import json
import logging

import apache_beam as beam

class ParseValidateRecordDoFn(beam.DoFn):
    def process(self, element):
        # All transformations come here
        try:
            # Valid JSON records go to the 'PASS' output
            data = json.loads(element)
            yield beam.pvalue.TaggedOutput('PASS', data)
        except Exception:
            # Unparseable lines go to the 'ERROR' output
            logging.error("Failed to parse record")
            yield beam.pvalue.TaggedOutput('ERROR', element)

job_name = JOB_NAME
project = PROJECT_NAME
staging_location = STAGING_LOCATION
temp_location = TEMP_LOCATION

p = beam.Pipeline(argv=[
        '--job_name', job_name,
        '--project', project,
        '--staging_location', staging_location,
        '--temp_location', temp_location,
        '--no_save_main_session',
        '--runner', 'DataflowRunner',
        '--num_workers', '25',
        '--requirements_file', 'requirements.txt'])

text = p | "Reading Source" >> beam.io.ReadFromText('SOURCE LOCATION')

output_validate = text | beam.ParDo(ParseValidateRecordDoFn()).with_outputs('PASS','ERROR', main='main')

(output_validate.PASS | "Writing to BQ" >> beam.io.Write(beam.io.BigQuerySink('Table_name',
                                      create_disposition=beam.io.BigQueryDisposition.CREATE_NEVER,
                                      write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND, validate=True)))


(output_validate.ERROR | "Writing UNPARSED File" >> beam.io.WriteToText('ERROR_LOCATION'))

logging.getLogger().setLevel(logging.INFO)
p.run().wait_until_finish()

Starting this week, the code began throwing the following error:

[Screenshot of the error message]

Error stack trace:

Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/dataflow_worker/batchworker.py", line 778, in run
    deferred_exception_details=deferred_exception_details)
  File "/usr/local/lib/python2.7/dist-packages/dataflow_worker/batchworker.py", line 630, in do_work
    exception_details=exception_details)
  File "/usr/local/lib/python2.7/dist-packages/apache_beam/utils/retry.py", line 168, in wrapper
    return fun(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/dataflow_worker/batchworker.py", line 491, in report_completion_status
    exception_details=exception_details)
  File "/usr/local/lib/python2.7/dist-packages/dataflow_worker/batchworker.py", line 299, in report_status
    work_executor=self._work_executor)
  File "/usr/local/lib/python2.7/dist-packages/dataflow_worker/workerapiclient.py", line 342, in report_status
    append_counter(work_item_status, counter)
  File "/usr/local/lib/python2.7/dist-packages/dataflow_worker/workerapiclient.py", line 38, in append_counter
    if isinstance(counter.name, counters.CounterName):
AttributeError: 'module' object has no attribute 'CounterName'

Things I have tried:

  • Stripping the code down to its most basic form.
  • Removing IO operations against local directories, for debugging.
  • Running a Hello World pipeline on the Dataflow runner (see the sketch after this list).
  • Switching to high-memory workers.
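
For reference, the Hello World test was along these lines (a minimal sketch; the job name and bucket paths below are placeholders, not the actual values used):

import apache_beam as beam

# Minimal sanity check: read a text file, count its lines, write the count.
p = beam.Pipeline(argv=[
        '--job_name', 'hello-world-test',
        '--project', 'PROJECT_NAME',
        '--staging_location', 'gs://BUCKET/staging',
        '--temp_location', 'gs://BUCKET/temp',
        '--runner', 'DataflowRunner'])

(p | 'Read' >> beam.io.ReadFromText('gs://BUCKET/input.txt')
   | 'CountLines' >> beam.combiners.Count.Globally()
   | 'Format' >> beam.Map(lambda n: 'line count: %d' % n)
   | 'Write' >> beam.io.WriteToText('gs://BUCKET/output'))

p.run().wait_until_finish()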

None of the above led to success; every attempt raised the same error: a work item was attempted 4 times without success, and each time the worker eventually lost contact with the service.

Thanks in advance :)

The same code works fine on a local machine with the DirectRunner. When we removed the references to the pandas parts of the code, it executed without any issues.
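
Since removing pandas makes the failure go away, one suspicion is a dependency conflict on the workers: an unpinned pandas in requirements.txt can pull in transitive packages that clash with the libraries preinstalled on the Dataflow worker image, which would fit the counters.CounterName AttributeError above. A sketch of a fully pinned requirements.txt, assuming that is the cause (the version numbers here are illustrative, not verified against this job):

# requirements.txt -- pin everything so pip does not replace packages
# that the Dataflow worker image already ships with.
# Illustrative versions only; match them to the SDK release in use.
pandas==0.18.1
numpy==1.11.3
six==1.10.0

(Packaging the job with the --setup_file pipeline option is another way to control exactly what gets installed on the workers.)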

