How to convert a list of dictionaries to a dataframe?
This is a list of dictionaries that I want to convert to a dataframe. I tried using a MultiIndex but couldn't convert the whole structure.
response = [{
    "name": "xyz",
    "empId": "007",
    "details": [{
            "address": [{
                "street": "x street",
                "city": "x city"
            }, {
                "street": "xx street",
                "city": "xx city"
            }],
            "country": "xxz country"
        },
        {
            "address": [{
                "street": "y street",
                "city": "y city"
            }, {
                "street": "yy street",
                "city": "yy city"
            }],
            "country": "yyz country"
        }
    ]
}]
I managed to convert the inner list of dictionaries to a dataframe with the following code:
import pandas as pd

# Setup that the original snippet left implicit
details = response[0]['details']
frames = []

for i in details:
    country = i['country']
    street = []
    city = []
    # One index row per address: (country, 1-based serial number)
    index = pd.MultiIndex.from_arrays(
        [[country] * len(i['address']), list(range(1, len(i['address']) + 1))],
        names=['Country', 'SL No'])
    df = pd.DataFrame(columns=["Street", "City"], index=index)
    if i['address']:
        for row in i['address']:
            street.append(row['street'])
            city.append(row['city'])
        df["Street"] = street
        df["City"] = city
    frames.append(df)

df_final = pd.concat(frames)
Output obtained:
Country      SL No  Street     City
xxz country  1      x street   x city
             2      xx street  xx city
yyz country  1      y street   y city
             2      yy street  yy city
How can I convert the list of dictionaries to a dataframe while keeping all the information?
The final output that I want:
Name  EmpId  Country      Street     City
xyz   007    xxz country  x street   x city
                          xx street  xx city
             yyz country  y street   y city
                          yy street  yy city
Use json_normalize with DataFrame.set_index:
df = pd.json_normalize(response,
                       record_path=['details','address'],
                       meta=['name','empId', ['address','country']]
                       )
df = df.set_index(['name','empId','address.country'])
print(df)
                            street     city
name empId address.country
xyz  007   xxz country      x street   x city
           xxz country      xx street  xx city
           yyz country      y street   y city
           yyz country      yy street  yy city
For older pandas versions use:
df = pd.io.json.json_normalize(response,
                               record_path=['details','address'],
                               meta=['name','empId', ['address','country']]
                               )
EDIT:
Tested with multiple records and it works well:
response = [{
    "name": "xyz",
    "empId": "007",
    "details": [{
            "address": [{
                "street": "x street",
                "city": "x city"
            }, {
                "street": "xx street",
                "city": "xx city"
            }],
            "country": "xxz country"
        },
        {
            "address": [{
                "street": "y street",
                "city": "y city"
            }, {
                "street": "yy street",
                "city": "yy city"
            }],
            "country": "yyz country"
        }
    ]
},
{
    "name": "xyz1",
    "empId": "0071",
    "details": [{
            "address": [{
                "street": "x street1",
                "city": "x city1"
            }, {
                "street": "xx stree1t",
                "city": "xx city1"
            }],
            "country": "xxz country"
        },
        {
            "address": [{
                "street": "y street",
                "city": "y city"
            }, {
                "street": "yy street",
                "city": "yy city"
            }],
            "country": "yyz country"
        }
    ]
}]
df = pd.json_normalize(response,
                       record_path=['details','address'],
                       meta=['name','empId', ['address','country']]
                       )
df = df.set_index(['name','empId','address.country'])
print(df)
                            street      city
name empId address.country
xyz  007   xxz country      x street    x city
           xxz country      xx street   xx city
           yyz country      y street    y city
           yyz country      yy street   yy city
xyz1 0071  xxz country      x street1   x city1
           xxz country      xx stree1t  xx city1
           yyz country      y street    y city
           yyz country      yy street   yy city
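If flat columns named as in the question are preferred over an index, the normalized frame can probably just be renamed and reordered instead of calling set_index. The following is a minimal sketch under that assumption; the names flat and column_names are illustrative and not part of the original answer.

```python
import pandas as pd

# Sample data from the question (one employee, two countries).
response = [{
    "name": "xyz",
    "empId": "007",
    "details": [
        {"address": [{"street": "x street", "city": "x city"},
                     {"street": "xx street", "city": "xx city"}],
         "country": "xxz country"},
        {"address": [{"street": "y street", "city": "y city"},
                     {"street": "yy street", "city": "yy city"}],
         "country": "yyz country"},
    ],
}]

# Same json_normalize call as in the answer above.
flat = pd.json_normalize(response,
                         record_path=['details', 'address'],
                         meta=['name', 'empId', ['address', 'country']])

# Illustrative mapping from the generated column names to the headers
# requested in the question; the dict order also sets the column order.
column_names = {
    'name': 'Name',
    'empId': 'EmpId',
    'address.country': 'Country',
    'street': 'Street',
    'city': 'City',
}
flat = flat.rename(columns=column_names)[list(column_names.values())]
print(flat)
```

Unlike the MultiIndex version, every row repeats Name, EmpId and Country, which is usually easier to filter or export.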
As far as I know, there is no easy way to do this, since your data contains multiple levels of lists. Although a bit convoluted, the following should work. The code repeatedly explodes the list columns with explode and converts the resulting dictionaries to columns with json_normalize.
df = pd.DataFrame.from_records(response)
df = df.explode('details', ignore_index=True)
df = pd.concat([df, pd.json_normalize(df['details'])], axis=1)
df = df.explode('address', ignore_index=True)
df = pd.concat([df, pd.json_normalize(df['address'])], axis=1)
df = df.drop(columns=['details', 'address'])
Result:
   name  empId  country      street     city
0  xyz   007    xxz country  x street   x city
1  xyz   007    xxz country  xx street  xx city
2  xyz   007    yyz country  y street   y city
3  xyz   007    yyz country  yy street  yy city
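The same explode-then-normalize steps can be wrapped in a loop so that the nesting depth does not have to be hard-coded. This is only a sketch: flatten_nested is a hypothetical helper (not a pandas function), and it assumes that no list is empty, no value is missing, and expanded keys do not collide with existing column names. It also uses ignore_index, so it needs pandas 1.1.0 or newer.

```python
import pandas as pd


def flatten_nested(df):
    """Hypothetical helper: explode list columns and expand dict columns
    until neither kind is left."""
    df = df.reset_index(drop=True)
    while True:
        # First list-valued column, if any: one row per list element.
        list_col = next((c for c in df.columns
                         if df[c].map(lambda v: isinstance(v, list)).any()), None)
        if list_col is not None:
            df = df.explode(list_col, ignore_index=True)
            continue
        # First column holding only dicts, if any: one column per key.
        dict_col = next((c for c in df.columns
                         if df[c].map(lambda v: isinstance(v, dict)).all()), None)
        if dict_col is None:
            return df
        expanded = pd.json_normalize(df[dict_col].tolist())
        df = pd.concat([df.drop(columns=[dict_col]), expanded], axis=1)


# Sample data from the question.
response = [{
    "name": "xyz",
    "empId": "007",
    "details": [
        {"address": [{"street": "x street", "city": "x city"},
                     {"street": "xx street", "city": "xx city"}],
         "country": "xxz country"},
        {"address": [{"street": "y street", "city": "y city"},
                     {"street": "yy street", "city": "yy city"}],
         "country": "yyz country"},
    ],
}]

result = flatten_nested(pd.DataFrame.from_records(response))
print(result)
```

Handling one column per pass keeps the row index aligned between the exploded frame and the frame returned by json_normalize, which is what makes the concat along axis=1 safe.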
Note: For pandas versions older than 1.1.0, explode does not have the ignore_index parameter. Instead, call reset_index(drop=True) after each explode.
In addition, in pandas versions older than 1.0 you need to use pd.io.json.json_normalize instead of pd.json_normalize.
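For reference, the variant without ignore_index might look like the sketch below. It assumes the sample data from the question, and the .tolist() calls are an added precaution (not in the original answer) so that json_normalize receives a plain list of dictionaries. On pandas older than 1.0, pd.json_normalize would additionally have to be replaced with pd.io.json.json_normalize.

```python
import pandas as pd

# Sample data from the question.
response = [{
    "name": "xyz",
    "empId": "007",
    "details": [
        {"address": [{"street": "x street", "city": "x city"},
                     {"street": "xx street", "city": "xx city"}],
         "country": "xxz country"},
        {"address": [{"street": "y street", "city": "y city"},
                     {"street": "yy street", "city": "yy city"}],
         "country": "yyz country"},
    ],
}]

df = pd.DataFrame.from_records(response)

# explode() keeps the original row labels, so reset them before concat
# to line the rows up with the fresh 0..n-1 index from json_normalize.
df = df.explode('details').reset_index(drop=True)
df = pd.concat([df, pd.json_normalize(df['details'].tolist())], axis=1)

df = df.explode('address').reset_index(drop=True)
df = pd.concat([df, pd.json_normalize(df['address'].tolist())], axis=1)

df = df.drop(columns=['details', 'address'])
print(df)
```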