pandas groupby column to list and keep certain values

I have the following dataframe:

id       occupations
111      teacher
111      student
222      analyst
333      cook
111      driver
444      lawyer

I create a new column with a list of all the occupations for each id:

new_df['occupation_list'] = df['id'].map(df.groupby('id')['occupations'].agg(list))

How do I include only the teacher and student values in occupation_list?

You can filter before the groupby:

to_map = (df[df['occupations'].isin(['teacher', 'student'])]
             .groupby('id')['occupations'].agg(list)
         )

df['occupation_list'] = df['id'].map(to_map)

Output:

    id occupations     occupation_list
0  111     teacher  [teacher, student]
1  111     student  [teacher, student]
2  222     analyst                 NaN
3  333        cook                 NaN
4  111      driver  [teacher, student]
5  444      lawyer                 NaN
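
For anyone who wants to reproduce this, here is a minimal self-contained sketch of the filter-then-map approach above. The `keep` variable and the `occupation_list_filled` column are illustrative additions, not part of the answer; the last step assumes you would rather have an empty list than NaN for ids with no matching occupation.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "id": [111, 111, 222, 333, 111, 444],
    "occupations": ["teacher", "student", "analyst", "cook", "driver", "lawyer"],
})

keep = ["teacher", "student"]  # hypothetical name for the values to retain

# Filter rows first, then collect the surviving occupations per id.
to_map = (
    df[df["occupations"].isin(keep)]
    .groupby("id")["occupations"]
    .agg(list)
)

# Map the per-id lists back onto every row; ids with no match become NaN.
df["occupation_list"] = df["id"].map(to_map)

# Optional (an assumption, not in the answer): use [] instead of NaN.
df["occupation_list_filled"] = df["occupation_list"].apply(
    lambda v: v if isinstance(v, list) else []
)

print(df)
```

Note that the driver row for id 111 still gets `[teacher, student]`, because the list is looked up by id rather than by the row's own occupation.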

You can also do:

df.groupby('id')['occupations'].transform(' '.join).str.split()
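
One thing to watch with the join/split one-liner: it returns every occupation for the id (no filtering), and it splits on whitespace, so it only round-trips cleanly when no occupation contains a space. The sketch below uses a made-up value, "bus driver", which is not in the question's data, to show the difference from aggregating to a list directly.

```python
import pandas as pd

# Hypothetical data: "bus driver" is invented to show the whitespace issue.
df = pd.DataFrame({
    "id": [111, 111, 222],
    "occupations": ["teacher", "bus driver", "analyst"],
})

# Join each group's values into one string, broadcast it to the group's rows,
# then split the string back into a list of words.
joined = df.groupby("id")["occupations"].transform(" ".join).str.split()

# Aggregating straight to a list keeps multi-word values intact.
via_lists = df["id"].map(df.groupby("id")["occupations"].agg(list))

print(joined.iloc[0])     # ['teacher', 'bus', 'driver']
print(via_lists.iloc[0])  # ['teacher', 'bus driver']
```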

You can just do a groupby and aggregate the column into a list:

df.groupby('id',as_index=False).agg({'occupations':lambda x: x.tolist()})

Output:

>>> df
    id occupations
0  111     teacher
1  111     student
2  222     analyst
3  333        cook
4  111      driver
5  444      lawyer
>>> df.groupby('id',as_index=False).agg({'occupations':lambda x: x.tolist()})
    id                 occupations
0  111  [teacher, student, driver]
1  222                   [analyst]
2  333                      [cook]
3  444                    [lawyer]
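
The aggregation above keeps every occupation. If you want this one-row-per-id shape but restricted to teacher and student, one possible sketch is to apply the `isin` filter first and then aggregate. The `keep` and `per_id` names are illustrative. Be aware that ids with none of the kept values drop out of the result entirely.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "id": [111, 111, 222, 333, 111, 444],
    "occupations": ["teacher", "student", "analyst", "cook", "driver", "lawyer"],
})

keep = ["teacher", "student"]  # hypothetical name for the values to retain

# Drop unwanted rows, then collapse each remaining id into a single list.
per_id = (
    df[df["occupations"].isin(keep)]
    .groupby("id", as_index=False)
    .agg({"occupations": lambda x: x.tolist()})
)

print(per_id)
```

With the sample data only id 111 survives, with the list `[teacher, student]`.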
