Google Cloud Dataflow (Python): function to read from and write to a .csv file?
I am not able to figure out the precise functions in the GCP Dataflow Python SDK that read from and write to CSV files (or any non-.txt files, for that matter). For BigQuery, I have figured out the following functions:
beam.io.Read(beam.io.BigQuerySource('%Table_ID%'))
beam.io.Write(beam.io.BigQuerySink('%Table_ID%'))
For reading and writing text files, I know of the ReadFromText and WriteToText functions.
However, I am not able to find any examples for the GCP Dataflow Python SDK in which data is written to or read from CSV files. Could you please provide the GCP Dataflow Python SDK functions for reading from and writing to CSV files, in the same manner as I have done for the BigQuery functions above?
There is a CsvFileSource in the beam_utils PyPI package that reads .csv files, deals with file headers, and can set custom delimiters. More information on how to use this source is in this answer. Hope that helps!
CSV files are text files. The simplest (though somewhat inelegant) way of reading them would be to do a ReadFromText and then split the lines read on the commas (e.g. beam.Map(lambda x: x.split(','))).
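One weakness of a plain split(',') is that it breaks on quoted fields that contain commas. A minimal sketch of the same line-by-line approach, using Python's standard csv module for the parsing and WriteToText for the output, might look like the following. The helper names parse_csv_line and format_csv_line, the run function, and its arguments are hypothetical names chosen for illustration; they are not part of the Beam SDK.

```python
import csv
import io


def parse_csv_line(line, delimiter=","):
    """Parse one CSV line into a list of fields, honouring quoted delimiters."""
    return next(csv.reader([line], delimiter=delimiter))


def format_csv_line(fields, delimiter=","):
    """Serialise a list of fields as one CSV line, quoting where needed."""
    buffer = io.StringIO()
    csv.writer(buffer, delimiter=delimiter).writerow(fields)
    # WriteToText appends its own newline, so drop the one csv.writer added.
    return buffer.getvalue().rstrip("\r\n")


def run(input_path, output_prefix, header):
    """Read a CSV file line by line, parse it, and write it back out as CSV.

    All three arguments are placeholders supplied by the caller; `header`
    is the header line to write at the top of each output shard.
    """
    # Imported here so the helpers above stay usable without Beam installed.
    import apache_beam as beam

    with beam.Pipeline() as pipeline:
        (
            pipeline
            # skip_header_lines=1 drops the header row of each input file.
            | "ReadLines" >> beam.io.ReadFromText(input_path, skip_header_lines=1)
            | "ParseCsv" >> beam.Map(parse_csv_line)
            # Any per-row transforms would go here, operating on lists of fields.
            | "FormatCsv" >> beam.Map(format_csv_line)
            | "WriteLines" >> beam.io.WriteToText(
                output_prefix, file_name_suffix=".csv", header=header)
        )
```

Because ReadFromText emits one element per physical line, this sketch still assumes that no quoted field contains an embedded newline.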
For the more elegant option, check out this question, or simply install the beam_utils package with pip and use the beam_utils.sources.CsvFileSource source to read from.
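If adding a dependency is not an option, header handling and custom delimiters can also be approximated with the SDK's fileio transforms and csv.DictReader. The following is only a sketch under that assumption and is not how beam_utils itself is implemented; rows_from_text, run, and their arguments are hypothetical names.

```python
import csv
import io


def rows_from_text(text, delimiter=","):
    """Return one dict per data row, keyed by the names in the header row."""
    reader = csv.DictReader(io.StringIO(text, newline=""), delimiter=delimiter)
    return list(reader)


def run(file_pattern, delimiter=","):
    """Read every file matching `file_pattern` and print its rows as dicts."""
    # Imported here so rows_from_text stays usable without Beam installed.
    import apache_beam as beam
    from apache_beam.io import fileio

    with beam.Pipeline() as pipeline:
        (
            pipeline
            | "MatchFiles" >> fileio.MatchFiles(file_pattern)
            | "OpenFiles" >> fileio.ReadMatches()
            # Each file is read whole, so quoted multi-line fields survive.
            | "ParseRows" >> beam.FlatMap(
                lambda f: rows_from_text(f.read_utf8(), delimiter))
            | "Print" >> beam.Map(print)
        )
```

The trade-off is that each file is loaded into memory in full and is not split across workers, so this suits many moderately sized files better than one very large file.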