How to split a list of dictionaries in one column into two columns in a PySpark dataframe?
I want to split the filteredaddress column of the Spark dataframe below into two new columns, Flag and Address:
customer_id|pincode|filteredaddress| Flag| Address
1000045801 |121005 |[{'flag':'0', 'address':'House number 172, Parvatiya Colony Part-2 , N.I.T'}]
1000045801 |121005 |[{'flag':'1', 'address':'House number 172, Parvatiya Colony Part-2 , N.I.T'}]
1000045801 |121005 |[{'flag':'1', 'address':'House number 172, Parvatiya Colony Part-2 , N.I.T'}]
Can anyone tell me how to do this?
You can get the values from the filteredaddress map column by indexing it with the keys:
df2 = df.selectExpr(
'customer_id', 'pincode',
"filteredaddress['flag'] as flag", "filteredaddress['address'] as address"
)
Other ways to access the map values are:
import pyspark.sql.functions as F
df.select(
'customer_id', 'pincode',
F.col('filteredaddress')['flag'],
F.col('filteredaddress')['address']
)
# or, more simply
df.select(
'customer_id', 'pincode',
'filteredaddress.flag',
'filteredaddress.address'
)