Select some columns from PCollection (Apache Beam, Python)

I have the following PCollection: [image not included]

And I want to select only 2 columns from that PCollection. I tried to do:

def cut_data(data):
    return data[["WebSpeedRef", "WebSpeedAct"]]

data_min = data_json | 'min' >> beam.Map(cut_data)

but got an error. What is the simplest way to accomplish this?

You could do this:

desired_columns = (
    dataset
    | beam.Map(lambda x: [x["column1"], x["column2"]])
)
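
A likely reason the original attempt fails: beam.Map calls the function once per element, so data is a single row rather than a whole table. If each element is a dict (an assumption, since the question only shows the data as an image), data[["WebSpeedRef", "WebSpeedAct"]] tries to use a list as a dict key and raises a TypeError. Below is a minimal sketch under that assumption that keeps the column names by returning a dict instead of a list. The helper select_columns, the wrapper run_example and the sample rows are hypothetical names and data made up for illustration.

```python
def select_columns(row, columns=("WebSpeedRef", "WebSpeedAct")):
    """Return a new dict holding only the requested keys of `row`."""
    return {name: row[name] for name in columns}


def run_example():
    # Imported here so select_columns can be used and tested without Beam.
    import apache_beam as beam

    # Hypothetical sample rows; the "Timestamp" field is invented for illustration.
    rows = [
        {"Timestamp": 1, "WebSpeedRef": 10.0, "WebSpeedAct": 9.5},
        {"Timestamp": 2, "WebSpeedRef": 12.0, "WebSpeedAct": 11.8},
    ]

    with beam.Pipeline() as pipeline:
        (
            pipeline
            | "create" >> beam.Create(rows)
            | "min" >> beam.Map(select_columns)  # applied to each row separately
            | "print" >> beam.Map(print)
        )
```

Calling run_example() runs the pipeline locally and prints one reduced dict per row. Returning a dict rather than a list means later steps can still refer to the values by name.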
