
lambda/groupby conditions creating yes/no column - python, pandas

I have a df that I was able to filter with lambda/groupby. However, instead of filtering, I'd like to add a new column indicating whether each row's group meets the condition. I am getting an error when using apply.

Kindly take note that the filter keeps a group only if "all" of its hr values are >= 5. The new column should therefore indicate 'yes' when every value in the group is >= 5, and 'no' if 1 or more of the values in the group are < 5.

Original code that was used for filtering:

df = df.groupby('id').filter(lambda x: (x.hr >= 5).all())

Sample data (new_col holds the desired result):

import pandas as pd

data = {
    "id": [11111, 11111, 11111, 11111,
           333, 333, 333, 333, 333, 333,
           5678, 5678, 5678, 5678, 5678],
    "hr": [4, 2, 5, 4, 5, 7, 6, 8, 5, 6, 7, 8, 6, 2, 4],
    "new_col": ['no', 'no', 'no', 'no', 'yes', 'yes',
                'yes', 'yes', 'yes', 'yes', 'no', 'no', 'no', 'no', 'no'],
}
df = pd.DataFrame(data)

Original table:

╔═══════╦════╗
║  id   ║ hr ║
╠═══════╬════╣
║ 11111 ║  4 ║
║ 11111 ║  2 ║
║ 11111 ║  5 ║
║ 11111 ║  4 ║
║   333 ║  5 ║
║   333 ║  7 ║
║   333 ║  6 ║
║   333 ║  8 ║
║   333 ║  5 ║
║   333 ║  6 ║
║  5678 ║  7 ║
║  5678 ║  8 ║
║  5678 ║  6 ║
║  5678 ║  2 ║
║  5678 ║  4 ║
╚═══════╩════╝

Desired result:

╔═══════╦════╦═════════╗
║  id   ║ hr ║ new_col ║
╠═══════╬════╬═════════╣
║ 11111 ║  4 ║ no      ║
║ 11111 ║  2 ║ no      ║
║ 11111 ║  5 ║ no      ║
║ 11111 ║  4 ║ no      ║
║   333 ║  5 ║ yes     ║
║   333 ║  7 ║ yes     ║
║   333 ║  6 ║ yes     ║
║   333 ║  8 ║ yes     ║
║   333 ║  5 ║ yes     ║
║   333 ║  6 ║ yes     ║
║  5678 ║  7 ║ no      ║
║  5678 ║  8 ║ no      ║
║  5678 ║  6 ║ no      ║
║  5678 ║  2 ║ no      ║
║  5678 ║  4 ║ no      ║
╚═══════╩════╩═════════╝

Kindly advise. Thank you.

You can use groupby.transform, where a lambda evaluates the condition for each group and the result is broadcast back to every row of that group, and then use np.where to assign the "yes"/"no" values:

import numpy as np
df['new_col'] = np.where(df.groupby('id')['hr'].transform(lambda x: (x >= 5).all()), 'yes', 'no')

Output:

       id  hr new_col
0   11111   4      no
1   11111   2      no
2   11111   5      no
3   11111   4      no
4     333   5     yes
5     333   7     yes
6     333   6     yes
7     333   8     yes
8     333   5     yes
9     333   6     yes
10   5678   7      no
11   5678   8      no
12   5678   6      no
13   5678   2      no
14   5678   4      no
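
For reference, here is a self-contained sketch of the same idea that can be run as-is. It rebuilds the sample frame from the question and, as a variation that is not part of the answer above, swaps the lambda for the built-in "all" aggregation applied to a boolean mask. The name all_ge_5 is only illustrative.

```python
import numpy as np
import pandas as pd

# Sample data from the question (without the expected new_col column).
df = pd.DataFrame({
    "id": [11111] * 4 + [333] * 6 + [5678] * 5,
    "hr": [4, 2, 5, 4, 5, 7, 6, 8, 5, 6, 7, 8, 6, 2, 4],
})

# Test each row first, then ask per group whether every row passed.
# transform("all") returns one boolean per original row, aligned on the index.
all_ge_5 = df["hr"].ge(5).groupby(df["id"]).transform("all")

# Map the boolean mask to the requested labels.
df["new_col"] = np.where(all_ge_5, "yes", "no")

print(df)
```

Because transform returns a result with the same length and index as the input, it can be assigned directly as a column, which is what makes it a better fit here than filter.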
