How to convert PySpark dataframe columns into a dict
I have a dataframe with 2 columns: Col1: String, Col2: String.
I want to create a dict like {'col1': 'col2'}.
For example, the csv data below:
var1,InternalCampaignCode
var2,DownloadFileName
var3,ExternalCampaignCode
has to become:
{'var1':'InternalCampaignCode','var2':'DownloadFileName', ...}
The dataframe has around 200 records.
Please let me know how to achieve this.
The following should do the trick:
# In Python 3, map() is lazy, so materialize the result as a list.
df_as_dict = [row.asDict() for row in df.collect()]
Note that this is going to generate a list of dictionaries, where each dictionary represents a single record of your PySpark dataframe:
[
{'Col1': 'var1', 'Col2': 'InternalCampaignCode'},
{'Col1': 'var2', 'Col2': 'DownloadFileName'},
{'Col1': 'var3', 'Col2': 'ExternalCampaignCode'},
]
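If you need the single {col1: col2} mapping the question asks for, the per-row dictionaries can be collapsed with one more comprehension. A minimal Spark-free sketch, with the list above hard-coded as a stand-in for the collected rows:

```python
# Per-row dictionaries as produced by row.asDict(), inlined here
# so the snippet runs without a Spark session.
rows = [
    {'Col1': 'var1', 'Col2': 'InternalCampaignCode'},
    {'Col1': 'var2', 'Col2': 'DownloadFileName'},
    {'Col1': 'var3', 'Col2': 'ExternalCampaignCode'},
]

# Collapse the list into a single {Col1-value: Col2-value} dict.
result = {d['Col1']: d['Col2'] for d in rows}
print(result)
# {'var1': 'InternalCampaignCode', 'var2': 'DownloadFileName', 'var3': 'ExternalCampaignCode'}
```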
You can do a dict comprehension:
result = {r[0]: r[1] for r in df.collect()}
which gives
{'var1': 'InternalCampaignCode', 'var2': 'DownloadFileName', 'var3': 'ExternalCampaignCode'}
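This works because the Row objects returned by df.collect() support positional indexing like tuples. The comprehension itself can be demonstrated without Spark by parsing the question's csv sample; a sketch where plain lists stand in for Row objects:

```python
import csv
import io

# The two-column csv sample from the question, inlined for a
# Spark-free demonstration.
sample = """var1,InternalCampaignCode
var2,DownloadFileName
var3,ExternalCampaignCode"""

# csv.reader yields ['col1', 'col2'] lists, which allow the same
# r[0] / r[1] positional access as pyspark Row objects.
rows = csv.reader(io.StringIO(sample))

result = {r[0]: r[1] for r in rows}
print(result)
# {'var1': 'InternalCampaignCode', 'var2': 'DownloadFileName', 'var3': 'ExternalCampaignCode'}
```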