How to transform a list of dictionaries containing nested lists into a pandas DataFrame
I have a list of dicts:
list_of_dicts = [{'name': 'a', 'counts': [{'dog': 2}]},
                 {'name': 'b', 'counts': [{'cat': 1}, {'capibara': 5}, {'whale': 10}]},
                 {'name': 'c', 'counts': [{'horse': 1}, {'cat': 1}]}]
I would like to transform this into a pandas DataFrame like so:
Name | Animal | Frequency
---|---|---
a | dog | 2
b | cat | 1
b | capibara | 5
b | whale | 10
c | horse | 1
c | cat | 1
In the current code, I try to normalize it:
from pandas import json_normalize
df = json_normalize(list_of_dicts, 'counts')
But I think I am going in the wrong direction. Also, if I do a simple
df = pd.DataFrame(list_of_dicts)
it results in each list of dicts being a single row value, which is not desired.
The record_path and meta parameters of pandas.json_normalize must be used.
import pandas as pd
# test data
list_of_dicts = [{'name': 'a', 'counts': [{'dog': 2}]}, {'name': 'b', 'counts': [{'cat': 1}, {'capibara': 5}, {'whale': 10}]}, {'name': 'c', 'counts': [{'horse':1}, {'cat': 1}]}]
# load and transform the dataframe
df = (pd.json_normalize(list_of_dicts, 'counts', 'name')
      .set_index('name')
      .stack()
      .reset_index()
      .rename(columns={'level_1': 'Animal', 0: 'Frequency'}))
# display(df)
name Animal Frequency
0 a dog 2.0
1 b cat 1.0
2 b capibara 5.0
3 b whale 10.0
4 c horse 1.0
5 c cat 1.0
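To see why the stack step is needed, it may help to look at the intermediate frame that json_normalize returns before reshaping. The sketch below reuses the test data from the question; the variable name wide is only illustrative and does not appear in the answer.

```python
import pandas as pd

list_of_dicts = [
    {'name': 'a', 'counts': [{'dog': 2}]},
    {'name': 'b', 'counts': [{'cat': 1}, {'capibara': 5}, {'whale': 10}]},
    {'name': 'c', 'counts': [{'horse': 1}, {'cat': 1}]},
]

# record_path='counts' produces one row per dict inside each 'counts' list;
# meta='name' copies the parent record's 'name' onto each of those rows.
wide = pd.json_normalize(list_of_dicts, record_path='counts', meta='name')

# One column per animal key plus 'name'; each row holds a value in exactly
# one animal column and NaN in the others.
print(wide)
```

Because every row is NaN in all but one animal column, those columns are stored as floats, which is why the frequencies appear as 2.0, 1.0 and so on after reshaping.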
Try json_normalize with melt:
(pd.json_normalize(list_of_dicts, record_path='counts', meta='name')
.melt('name', var_name='Animal', value_name='Frequency')
.dropna()
)
Output:
name Animal Frequency
0 a dog 2.0
7 b cat 1.0
11 c cat 1.0
14 b capibara 5.0
21 b whale 10.0
28 c horse 1.0
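The melt output above is grouped by animal column and keeps float frequencies. If the original row order and integer counts matter, one possible variation is sketched here. It assumes pandas 1.1 or later for the ignore_index parameter of melt, and that each dict in counts has exactly one key, as in the question's data; the name tidy is illustrative.

```python
import pandas as pd

list_of_dicts = [
    {'name': 'a', 'counts': [{'dog': 2}]},
    {'name': 'b', 'counts': [{'cat': 1}, {'capibara': 5}, {'whale': 10}]},
    {'name': 'c', 'counts': [{'horse': 1}, {'cat': 1}]},
]

tidy = (
    pd.json_normalize(list_of_dicts, record_path='counts', meta='name')
    # ignore_index=False keeps the pre-melt row labels (0..5) on the result
    .melt('name', var_name='Animal', value_name='Frequency', ignore_index=False)
    .dropna()                      # drop the NaN filler rows created by melt
    .sort_index()                  # restore the original record order
    .astype({'Frequency': int})    # safe once the NaN rows are gone
    .reset_index(drop=True)
)
print(tidy)
```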
Try this?
>>> pd.json_normalize(list_of_dicts, 'counts').melt().dropna()
You can also use df.explode with df.apply:
In [50]: df = pd.DataFrame(list_of_dicts).explode('counts')
In [74]: df.counts = df.counts.apply(lambda x: list(x.items())[0])
In [77]: df[['Animal', 'Frequency']] = pd.DataFrame(df['counts'].tolist(), index=df.index)
In [79]: df.drop(columns='counts', inplace=True)
In [80]: df
Out[80]:
name Animal Frequency
0 a dog 2
1 b cat 1
1 b capibara 5
1 b whale 10
2 c horse 1
2 c cat 1
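For comparison, the same table can also be built without json_normalize or explode by flattening the nested structure in plain Python first. This is only a sketch of an alternative, not one of the approaches given in the answers above; the rows variable is illustrative, and the column names follow the table in the question.

```python
import pandas as pd

list_of_dicts = [
    {'name': 'a', 'counts': [{'dog': 2}]},
    {'name': 'b', 'counts': [{'cat': 1}, {'capibara': 5}, {'whale': 10}]},
    {'name': 'c', 'counts': [{'horse': 1}, {'cat': 1}]},
]

# Emit one flat record per (name, animal, frequency) triple.
rows = [
    {'Name': d['name'], 'Animal': animal, 'Frequency': freq}
    for d in list_of_dicts
    for count in d['counts']
    for animal, freq in count.items()
]

df = pd.DataFrame(rows)
print(df)
```

Since no NaN values are introduced along the way, the Frequency column stays an integer column and the rows keep their original order.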