简体   繁体   中英

How to convert dataframe column which contains list of dictionary into separate columns?

I have a dataframe column which looks like this:

df_cost['region.localCurrency']:

0     [{'content': 'Dirham', 'languageCode': 'EN'}]
1     [{'content': 'Dirham', 'languageCode': 'EN'}]
2     [{'content': 'Dirham', 'languageCode': 'EN'}]
3       [{'content': 'Euro', 'languageCode': 'DE'}]
4       [{'content': 'Euro', 'languageCode': 'DE'}]
5       [{'content': 'Euro', 'languageCode': 'DE'}]
6       [{'content': 'Euro', 'languageCode': 'DE'}]
7       [{'content': 'Euro', 'languageCode': 'DE'}]
8       [{'content': 'Euro', 'languageCode': 'DE'}]
9       [{'content': 'Euro', 'languageCode': 'DE'}]
10      [{'content': 'Euro', 'languageCode': 'DE'}]
11      [{'content': 'Euro', 'languageCode': 'DE'}]
12      [{'content': 'Euro', 'languageCode': 'DE'}]
13    [{'content': 'Dirham', 'languageCode': 'EN'}]
14    [{'content': 'Dirham', 'languageCode': 'EN'}]
15    [{'content': 'Dirham', 'languageCode': 'EN'}]
16      [{'content': 'Euro', 'languageCode': 'DE'}]
17      [{'content': 'Euro', 'languageCode': 'DE'}]
18      [{'content': 'Euro', 'languageCode': 'DE'}]
19      [{'content': 'Euro', 'languageCode': 'DE'}]
Name: region.localCurrency, dtype: object

and I want to convert it, to separate the dictionary keys and values into columns. I want to add two separate columns to the initial df_cost dataframe, like 'localCurrencyContent' and 'localCurrencyCode', based on the dictionary contents of region.localCurrency. I tried to split the region.localCurrency column like:

df_split=pd.DataFrame(df_cost['region.localCurrency'].apply(pd.Series), columns=['localCurrencyContent', 'localCurrencyCode'])
print(df_split)

but this gives me NaN values for the localCurrencyContent and localCurrencyCode, instead of 'Euro' and 'DE' for example. How could I split the column "region.localCurrency" and add the two created columns to the cost_df, initial dataframe?

Pandas.json_normalize will probably do the job for you. https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.json_normalize.html

Use json_normalize with convert first values by indexing:

d = {'content':'localCurrencyContent','languageCode':'localCurrencyCode'}
df1 = pd.json_normalize(df_cost.pop('region.localCurrency').str[0]).rename(columns=d)
df = df_cost.join(df1)

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM