pandas groupby column to list and keep certain values
I have the following dataframe:
id occupations
111 teacher
111 student
222 analyst
333 cook
111 driver
444 lawyer
I create a new column with a list of all the occupations:
new_df['occupation_list'] = df['id'].map(df.groupby('id')['occupations'].agg(list))
How do I include only the teacher and student values in occupation_list?
You can filter before the groupby:
to_map = (df[df['occupations'].isin(['teacher', 'student'])]
          .groupby('id')['occupations'].agg(list)
         )
df['occupation_list'] = df['id'].map(to_map)
Output:
id occupations occupation_list
0 111 teacher [teacher, student]
1 111 student [teacher, student]
2 222 analyst NaN
3 333 cook NaN
4 111 driver [teacher, student]
5 444 lawyer NaN
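For anyone who wants to run the filter-then-map approach end to end, here is a minimal sketch rebuilt from the sample data in the question. The name wanted is introduced here for illustration only and does not appear in the original answer.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "id": [111, 111, 222, 333, 111, 444],
    "occupations": ["teacher", "student", "analyst", "cook", "driver", "lawyer"],
})

wanted = ["teacher", "student"]  # hypothetical name, for illustration only

# Keep only the rows with a wanted occupation, then collect them per id
to_map = (
    df[df["occupations"].isin(wanted)]
    .groupby("id")["occupations"]
    .agg(list)
)

# Map the per-id lists back onto every row; ids that are missing
# from to_map (no wanted occupation) come back as NaN
df["occupation_list"] = df["id"].map(to_map)
print(df)
```

Because the lookup is by id, the driver row for id 111 still receives the filtered list, while ids 222, 333 and 444 get NaN.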
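You can also build the full list for each id with transform (this keeps every occupation, and it assumes no occupation value contains whitespace):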
df.groupby('id')['occupations'].transform(' '.join).str.split()
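As a rough sketch of what that one-liner produces, and of one way to narrow its result down to the values asked about, the snippet below runs it on the question's sample data. The names wanted, full_list and filtered are assumptions made for this example, and the list-comprehension filter is an addition, not part of the original answer.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "id": [111, 111, 222, 333, 111, 444],
    "occupations": ["teacher", "student", "analyst", "cook", "driver", "lawyer"],
})

wanted = {"teacher", "student"}  # hypothetical name, for illustration only

# Join each id's occupations with spaces, broadcast the joined string
# back to every row of the group, then split it into a list again.
# This round trip would break an occupation that contains a space.
full_list = df.groupby("id")["occupations"].transform(" ".join).str.split()

# Assumed follow-up step: drop unwanted values from each list.
# Ids with no wanted occupation end up with an empty list, not NaN.
filtered = full_list.apply(lambda occs: [o for o in occs if o in wanted])

df["occupation_list"] = filtered
print(df)
```

Compared with filtering before the groupby, this variant yields [] rather than NaN for ids without a match, which may or may not be what you want.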
You would just do a groupby and aggregate the column to a list:
df.groupby('id',as_index=False).agg({'occupations':lambda x: x.tolist()})
Output:
>>> df
id occupations
0 111 teacher
1 111 student
2 222 analyst
3 333 cook
4 111 driver
5 444 lawyer
>>> df.groupby('id',as_index=False).agg({'occupations':lambda x: x.tolist()})
id occupations
0 111 [teacher, student, driver]
1 222 [analyst]
2 333 [cook]
3 444 [lawyer]
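If one row per id is what you are after, the same aggregation can be combined with a filter on the values of interest. The following is only a sketch under that assumption; wanted and per_id are names made up for this example.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "id": [111, 111, 222, 333, 111, 444],
    "occupations": ["teacher", "student", "analyst", "cook", "driver", "lawyer"],
})

wanted = ["teacher", "student"]  # hypothetical name, for illustration only

# Filter first, then aggregate to one row per id, as in the answer above
per_id = (
    df[df["occupations"].isin(wanted)]
    .groupby("id", as_index=False)
    .agg({"occupations": lambda x: x.tolist()})
)
print(per_id)
```

Note that ids with no wanted occupation are removed by the filter, so they do not appear in per_id at all.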