
GCP Dataflow - NoneType error during WriteToBigQuery()

I am trying to stream data from a csv file in GCS to BQ using Beam, but I get a NoneType error when calling WriteToBigQuery. The error message:

AttributeError: 'NoneType' object has no attribute 'items' [while running 'Write to BQ/_StreamToBigQuery/StreamInsertRows/ParDo(BigQueryWriteFn)']

My pipeline code:

import apache_beam as beam
from apache_beam.pipeline import PipelineOptions
from apache_beam.io.textio import ReadFromText


options = {
    'project': project,
    'region': region,
    'temp_location': bucket,
    'staging_location': bucket,
    'setup_file': './setup.py'
}


class Split(beam.DoFn):
    def process(self, element):
        n, cc = element.split(",")
        return [{
            'n': int(n.strip('"')),
            'connection_country': str(cc.strip()),
        }]


pipeline_options = beam.pipeline.PipelineOptions(flags=[], **options)

with beam.Pipeline(options=pipeline_options) as pipeline:
    (pipeline
        | 'Read from GCS' >> ReadFromText('file_path*', skip_header_lines=1)
        | 'parse input' >> beam.ParDo(Split())
        | 'print' >> beam.Map(print)
        | 'Write to BQ' >> beam.io.WriteToBigQuery(
            'from_gcs', 'demo', schema='n:INTEGER, connection_country:STRING',
            create_disposition=beam.io.BigQueryDisposition.CREATE_IF_NEEDED,
            write_disposition=beam.io.BigQueryDisposition.WRITE_TRUNCATE)
        )

My csv looks like this:

(screenshot of the CSV contents)

An excerpt of the Beam output at the print() stage looks like this:

(screenshot of the printed elements)

Any help is appreciated!

You can filter out None-type messages using:

def filter_none_messages(msg):
    # beam.Filter keeps an element only if this callable returns a truthy
    # value, so None elements are dropped from the PCollection.
    print(f"Message filtered: {msg}")
    return msg

並添加| "FilterNoneMessages" >> beam.Filter(filter_none_messages) 管道中的| "FilterNoneMessages" >> beam.Filter(filter_none_messages)
