How to group a dataframe by the list elements in a column
I have a dataframe like this:
   movie_id  genres
0         2  [1,2]
1         3  [1,3]
2         4  [2,4]
I want to group the movies by genre, so that a movie appears once in every genre group it belongs to (with duplication). Like this:
   genre_group  movie_id  genres
0            1         2  [1,2]
1                      3  [1,3]
0            2         2  [1,2]
2                      4  [2,4]
1            3         3  [1,3]
2            4         4  [2,4]
IIUC, you can use explode to get one row per genre, and then map to restore each movie's original genre list:
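# Explode to one row per (movie, genre), then move the genre into the index
df1 = (
    df.explode('genres')
      .sort_values('genres')
      .rename(columns={'genres': 'genres_group'})
      .set_index('genres_group', append=True)
)
# Index level 0 still holds the original row labels; use them to restore each list
df1['genres'] = df1.index.get_level_values(0).map(df['genres'])
print(df1)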
                movie_id  genres
  genres_group
0 1                    2  [1, 2]
1 1                    3  [1, 3]
0 2                    2  [1, 2]
2 2                    4  [2, 4]
1 3                    3  [1, 3]
2 4                    4  [2, 4]
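
If the goal is to work with the groups rather than just display them, a variant along the following lines may help. This is only a sketch: it rebuilds the sample frame from the question, and the names genre_group, exploded and movies_by_genre are illustrative choices, not part of the original answer. It explodes a copy of the genres column, so the original lists stay in place and no map step is needed, and it passes kind="stable" to sort_values so that rows within the same genre keep their original order.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "movie_id": [2, 3, 4],
    "genres": [[1, 2], [1, 3], [2, 4]],
})

# Explode a copy of the list column; "genres" itself keeps the full lists.
# "genre_group" is a hypothetical column name chosen for this sketch.
exploded = (
    df.assign(genre_group=df["genres"])
      .explode("genre_group")
      .astype({"genre_group": int})
      .sort_values("genre_group", kind="stable")
)

# Collect the movie ids per genre; a movie shows up once for each of its genres
movies_by_genre = (
    exploded.groupby("genre_group")["movie_id"].apply(list).to_dict()
)

print(exploded)
print(movies_by_genre)
```

With the sample data, genre 1 collects movies 2 and 3, genre 2 collects movies 2 and 4, and genres 3 and 4 hold one movie each. Iterating over exploded.groupby("genre_group") gives the per-genre sub-frames if a dict of ids is not enough.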