[英]DataFrame groupby a column which has dictionary values
I'm having a dataframe which contains a column as dictionary.我有一个 dataframe ,其中包含一个作为字典的列。 And I need to groupby the column by the dictionary values.我需要按字典值对列进行分组。 For example,例如,
import pandas as pd
data = [
{
"name":"xx",
"values":{
"element":[
{
"path":"path1/id1"
},
{
"path":"path2/id1"
}
],
"nonrequired":[
{}
]
}
},
{
"name":"yy",
"values":{
"element":[
{
"path":"path1/id2"
},
{
"path":"path2/id2"
}
],
"nonrequired":[
{}
]
}
}
]
df = pd.DataFrame(data)
result = {
'path1': [
{
"name":'xx',
"renamecolumn":['id1','id2']
}
],
'path2': [
{
"name":'yy',
"renamecolumn":['id1','id2']
}
]
}
Still not 100% sure of the logic of the final dictionary creation as the example input and output don't quite match up.仍然不能 100% 确定最终字典创建的逻辑作为示例输入,并且 output 不太匹配。 However, here is how you can extract the values and you can create your desired dictionary from there.但是,您可以通过以下方式提取值,然后从那里创建所需的字典。
# ectract the values and split them on the forward slash
df['split'] = df['values'].apply(lambda x: [item['path'].split('/') for item in x['element']])
# generate the path and ids columns
df['path'] = df['split'].apply(lambda x: [x[i][0] for i in range(0,len(x))])
df['ids'] = df['split'].apply(lambda x: [x[i][1] for i in range(0,len(x))])
# separate out all the lists and
result = df.drop(['values', 'split'], axis=1) \
.explode('ids').explode('path').drop_duplicates()
Result
is: Result
是:
name path ids
0 xx path1 id1
0 xx path2 id1
1 yy path1 id2
1 yy path2 id2
声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.