How to convert a dataframe column that contains a list of dictionaries into separate columns?
I have a dataframe column which looks like this:
df_cost['region.localCurrency']:
0 [{'content': 'Dirham', 'languageCode': 'EN'}]
1 [{'content': 'Dirham', 'languageCode': 'EN'}]
2 [{'content': 'Dirham', 'languageCode': 'EN'}]
3 [{'content': 'Euro', 'languageCode': 'DE'}]
4 [{'content': 'Euro', 'languageCode': 'DE'}]
5 [{'content': 'Euro', 'languageCode': 'DE'}]
6 [{'content': 'Euro', 'languageCode': 'DE'}]
7 [{'content': 'Euro', 'languageCode': 'DE'}]
8 [{'content': 'Euro', 'languageCode': 'DE'}]
9 [{'content': 'Euro', 'languageCode': 'DE'}]
10 [{'content': 'Euro', 'languageCode': 'DE'}]
11 [{'content': 'Euro', 'languageCode': 'DE'}]
12 [{'content': 'Euro', 'languageCode': 'DE'}]
13 [{'content': 'Dirham', 'languageCode': 'EN'}]
14 [{'content': 'Dirham', 'languageCode': 'EN'}]
15 [{'content': 'Dirham', 'languageCode': 'EN'}]
16 [{'content': 'Euro', 'languageCode': 'DE'}]
17 [{'content': 'Euro', 'languageCode': 'DE'}]
18 [{'content': 'Euro', 'languageCode': 'DE'}]
19 [{'content': 'Euro', 'languageCode': 'DE'}]
Name: region.localCurrency, dtype: object
and I want to convert it so that the dictionary keys and values are split into columns. I want to add two separate columns to the initial df_cost dataframe, 'localCurrencyContent' and 'localCurrencyCode', based on the dictionary contents of region.localCurrency. I tried to split the region.localCurrency column like this:
df_split=pd.DataFrame(df_cost['region.localCurrency'].apply(pd.Series), columns=['localCurrencyContent', 'localCurrencyCode'])
print(df_split)
but this gives me NaN values for localCurrencyContent and localCurrencyCode instead of, for example, 'Euro' and 'DE'. How can I split the column 'region.localCurrency' and add the two new columns to the initial dataframe, df_cost?
pandas.json_normalize will probably do the job for you: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.json_normalize.html
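For context, here is a small sketch of why the attempt in the question ends up with NaN, assuming every cell holds a one-element list as in the output shown. The two-row Series `s` is a made-up sample, not the real data. `apply(pd.Series)` expands the list, not the dictionary inside it, so the result has a single column labelled 0, and asking `pd.DataFrame` for columns that do not exist yields all-NaN columns.

```python
import pandas as pd

# Hypothetical two-row sample shaped like the column in the question.
s = pd.Series(
    [
        [{"content": "Dirham", "languageCode": "EN"}],
        [{"content": "Euro", "languageCode": "DE"}],
    ],
    name="region.localCurrency",
)

# apply(pd.Series) expands each one-element list into a single column
# labelled 0; the dictionaries themselves are still unexpanded.
expanded = s.apply(pd.Series)
print(list(expanded.columns))

# Requesting column names that are not present selects nothing,
# so both requested columns come back filled with NaN.
attempt = pd.DataFrame(expanded, columns=["localCurrencyContent", "localCurrencyCode"])
print(attempt)

# Unwrapping the list first gives json_normalize plain dicts to flatten.
flat = pd.json_normalize(s.str[0].tolist())
print(flat)
```

The `.tolist()` call is only there to hand json_normalize a plain list of dicts; it is a conservative choice rather than a requirement.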
Use json_normalize after selecting the first value of each list by indexing:
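# map the dictionary keys to the desired column names
d = {'content':'localCurrencyContent','languageCode':'localCurrencyCode'}
# pop removes the original column; .str[0] takes the first dict of each list
df1 = pd.json_normalize(df_cost.pop('region.localCurrency').str[0]).rename(columns=d)
df = df_cost.join(df1)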
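A self-contained sketch of the same approach, in case it helps to try it out. The small `df_cost` below, its `cost` column, and the index labels are invented for illustration. It assumes every row holds a non-empty list. Because `join` aligns rows on the index, the sketch copies the original index onto the new columns, which matters when df_cost does not have the default 0..n-1 index.

```python
import pandas as pd

# Hypothetical stand-in for df_cost; the 'cost' column and the index
# labels are made up for illustration.
df_cost = pd.DataFrame(
    {
        "cost": [10.0, 20.0, 30.0],
        "region.localCurrency": [
            [{"content": "Dirham", "languageCode": "EN"}],
            [{"content": "Euro", "languageCode": "DE"}],
            [{"content": "Euro", "languageCode": "DE"}],
        ],
    },
    index=[7, 3, 5],
)

names = {"content": "localCurrencyContent", "languageCode": "localCurrencyCode"}

# Take the first dict of each list and remove the original column.
first = df_cost.pop("region.localCurrency").str[0]

# Build the new columns and reuse the original index so that join
# aligns rows by label instead of relying on a default integer index.
df1 = pd.json_normalize(first.tolist()).rename(columns=names)
df1.index = first.index

df = df_cost.join(df1)
print(df)
```

If some lists can hold more than one dictionary, `.str[0]` keeps only the first one, so a different strategy (for example exploding the column first) would be needed.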