I have a dictionary that looks like this:
{'136454': [{'city': 'Kabul', 'country': 'AF'}],
'137824': [{'city': 'Kabul', 'country': 'AF'}],
'134134': [{'city': 'Kabul', 'country': 'AF'}],
'138322': [{'city': 'Fujairah', 'country': 'AE'},
{'city': 'Kabul', 'country': 'AF'}],
'137246': [{'city': 'Fujairah', 'country': 'AE'},
{'city': 'Kabul', 'country': 'AF'}, {'city': 'New Delhi', 'country': 'IN'}],
'133141': [{'city': 'Kabul', 'country': 'AF'}]}
What I would like is a dataframe that looks like this:
'136454' | 'Kabul'|'AF'
'137824' | 'Kabul'|'AF'
'134134' | 'Kabul'|'AF'
'138322' |'Fujairah'| 'AE'
'138322' | 'Kabul'| 'AF'
'137246' | 'Fujairah'| 'AE'
'137246' | 'Kabul' | 'AF'
'137246' | 'New Delhi'| 'IN'
'133141'| 'Kabul'| 'AF'
At the moment I'm only getting the first value for each key. I'm not very good at pandas, so I'm a bit confused.
Let's use explode.
Note that this method is available in pandas 0.25 and later.
df = pd.Series(d).explode().apply(pd.Series)
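The one-liner above leaves the original keys in the index rather than in a column. Here is a minimal sketch of one way to move them into a regular column, using a shortened copy of the question's dictionary. The column name `id` is an assumption chosen for illustration, not something the question specifies.

```python
import pandas as pd

# Shortened sample of the dictionary from the question
d = {
    '138322': [{'city': 'Fujairah', 'country': 'AE'},
               {'city': 'Kabul', 'country': 'AF'}],
    '133141': [{'city': 'Kabul', 'country': 'AF'}],
}

# One row per inner dict; the outer key is repeated in the index
s = pd.Series(d).explode()

# Expand the dicts into columns, then turn the index into an 'id' column
# ('id' is a hypothetical name picked for this sketch)
df = pd.DataFrame(s.tolist(), index=s.index).rename_axis('id').reset_index()
print(df)
```

Building the frame from `s.tolist()` avoids calling `pd.Series` once per row, which may matter on larger inputs.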
Iterate through the dictionary, appending the main key to the internal dict, and finally create your dataframe:
d = []
for k, v in data.items():
    for ent in v:
        # this is where you append the main key to the internal dictionary
        ent.update({"key": k})
        d.append(ent)
# get your dataframe
pd.DataFrame(d)
city country key
0 Kabul AF 136454
1 Kabul AF 137824
2 Kabul AF 134134
3 Fujairah AE 138322
4 Kabul AF 138322
5 Fujairah AE 137246
6 Kabul AF 137246
7 New Delhi IN 137246
8 Kabul AF 133141
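Note that the loop above modifies the inner dictionaries of the input in place, since `update` adds the key to each of them. If that side effect is unwanted, a possible variant is to build new dictionaries instead. This is a sketch on a shortened copy of the data; the name `rows` is introduced here only for illustration.

```python
import pandas as pd

# Shortened sample of the dictionary from the question
data = {
    '138322': [{'city': 'Fujairah', 'country': 'AE'},
               {'city': 'Kabul', 'country': 'AF'}],
    '133141': [{'city': 'Kabul', 'country': 'AF'}],
}

# Build a fresh dict per row so the dicts inside `data` are left untouched;
# putting 'key' first also makes it the first column
rows = [{'key': k, **ent} for k, v in data.items() for ent in v]

df = pd.DataFrame(rows)
print(df)
```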
Another possible solution: you can flatten your dict.
import pandas as pd

data = {'136454': [{'city': 'Kabul', 'country': 'AF'}],
'137824': [{'city': 'Kabul', 'country': 'AF'}],
'134134': [{'city': 'Kabul', 'country': 'AF'}],
'138322': [{'city': 'Fujairah', 'country': 'AE'},
{'city': 'Kabul', 'country': 'AF'}],
'137246': [{'city': 'Fujairah', 'country': 'AE'},
{'city': 'Kabul', 'country': 'AF'},
{'city': 'New Delhi', 'country': 'IN'}],
'133141': [{'city': 'Kabul', 'country': 'AF'}]}
new_data = []
for key, value in data.items():
    for arr_value in value:
        # store the outer key on each inner dict
        arr_value['id'] = key
        new_data.append(arr_value)
print(new_data)
# new_data is a list of dicts, so the plain constructor is enough
df = pd.DataFrame(new_data)
print(df.head())
You can use a list comprehension and then pass the result to pd.DataFrame:
import pandas as pd
d = {'136454': [{'city': 'Kabul', 'country': 'AF'}], '137824': [{'city': 'Kabul', 'country': 'AF'}], '134134': [{'city': 'Kabul', 'country': 'AF'}], '138322': [{'city': 'Fujairah', 'country': 'AE'}, {'city': 'Kabul', 'country': 'AF'}], '137246': [{'city': 'Fujairah', 'country': 'AE'}, {'city': 'Kabul', 'country': 'AF'}, {'city': 'New Delhi', 'country': 'IN'}], '133141': [{'city': 'Kabul', 'country': 'AF'}]}
data = [[a, i['city'], i['country']] for a, b in d.items() for i in b]
pd.DataFrame(data)
Output:
0 1 2
0 136454 Kabul AF
1 137824 Kabul AF
2 134134 Kabul AF
3 138322 Fujairah AE
4 138322 Kabul AF
5 137246 Fujairah AE
6 137246 Kabul AF
7 137246 New Delhi IN
8 133141 Kabul AF
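The output above has the default integer column labels 0, 1 and 2. As a small sketch, the `columns` argument of `pd.DataFrame` can label them at construction time. The names `id`, `city` and `country` are assumptions for illustration, and the dictionary is shortened.

```python
import pandas as pd

# Shortened sample of the dictionary from the question
d = {
    '138322': [{'city': 'Fujairah', 'country': 'AE'},
               {'city': 'Kabul', 'country': 'AF'}],
    '133141': [{'city': 'Kabul', 'country': 'AF'}],
}

# One [key, city, country] list per inner dict
data = [[a, i['city'], i['country']] for a, b in d.items() for i in b]

# Label the columns instead of keeping the default 0, 1, 2
df = pd.DataFrame(data, columns=['id', 'city', 'country'])
print(df)
```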