Select some columns from PCollection (Apache Beam, Python)
I have the following PCollection:
I want to select only 2 columns from that PCollection. I tried this:
def cut_data(data):
    return data[["WebSpeedRef", "WebSpeedAct"]]

data_min = data_json | 'min' >> beam.Map(cut_data)
but got an error. What is the simplest way to accomplish this?
You could do this:
# Keep only the two wanted fields of each element (assumes dict-like elements).
desired_columns = (
    dataset
    | beam.Map(lambda x: [x["column1"], x["column2"]])
)
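
If you would rather keep the column names than get a bare list back, here is a minimal sketch of the same idea. It assumes each element of the PCollection is a plain dict (the question does not show the actual element type, so this is a guess). The helper name select_columns, the function run_example and the sample rows are made up for illustration; only the keys WebSpeedRef and WebSpeedAct come from the question.

```python
def select_columns(row, columns=("WebSpeedRef", "WebSpeedAct")):
    """Return a new dict holding only the requested keys of `row`.

    Assumes `row` is a dict; a missing key raises KeyError.
    """
    return {name: row[name] for name in columns}


def run_example():
    # Imported here so select_columns can be used without Beam installed.
    import apache_beam as beam

    # Hypothetical sample rows; the real data is not shown in the question.
    rows = [
        {"WebSpeedRef": 10.0, "WebSpeedAct": 9.5, "Other": 1},
        {"WebSpeedRef": 12.0, "WebSpeedAct": 11.8, "Other": 2},
    ]

    with beam.Pipeline() as pipeline:
        (
            pipeline
            | "create" >> beam.Create(rows)
            | "select" >> beam.Map(select_columns)
            | "print" >> beam.Map(print)
        )
```

Calling run_example() runs the pipeline locally and prints one reduced dict per input row. If the elements are not dicts (for example, if each element is a whole pandas DataFrame), the helper would need to be adapted accordingly.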