How to create a new column in Pandas based on a condition

Question

Quick silly question: I am sure this was asked before, but I couldn't find the details. I have a dataframe df_students as below:

Student ID, Subjects ,  MArks_Received, Marks
222         English     3               90
222         Maths       3               80
222         Science     3               70
223         English     2               90
223         Maths       2               80
224         Maths       2               80

I am looking for the output below, based on the Subjects and Received conditions: if the number of rows for a student doesn't match the expected number, I have to add an extra column, State, set to PENDING, otherwise to Received.

Student ID, Subjects ,  Expected_Rows, Marks, State
222         English     3               90    Received  
222         Maths       3               80    Received
222         Science     3               70    Received
223         English     2               90    Received
223         Maths       2               80    Received
224         Maths       2               80    PENDING

As Expected_Rows is 2 for "224" but only 1 row was received, I should mark this as "PENDING".

I am able to aggregate the sum of marks as below, but can't figure out how to add State. Any help is highly appreciated.

Aggregate data frame

# Total marks per student
df_aggregate = df_students.groupby('Student ID')['Marks'].sum().reset_index()

Answer 1

There are many approaches; see whether the one below helps:

Add a new column 'Count' and then derive 'State' from it:

import numpy as np
# Number of rows actually present for each student, repeated on every row
df['Count'] = df.groupby('Student ID')['Student ID'].transform('count')
df['State'] = np.where(df['Count'] != df['MArks_Received'], 'PENDING', 'Received')
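
As a minimal, self-contained sketch of the two-step approach above: the DataFrame below is rebuilt by hand from the sample data in the question (that construction is an assumption for illustration, not the asker's actual loading code), and the column names are taken from the question as written.

```python
import numpy as np
import pandas as pd

# Sample data from the question, rebuilt by hand for illustration
df = pd.DataFrame({
    'Student ID': [222, 222, 222, 223, 223, 224],
    'Subjects': ['English', 'Maths', 'Science', 'English', 'Maths', 'Maths'],
    'MArks_Received': [3, 3, 3, 2, 2, 2],
    'Marks': [90, 80, 70, 90, 80, 80],
})

# Number of rows actually present for each student, repeated on every row
df['Count'] = df.groupby('Student ID')['Student ID'].transform('count')

# PENDING when the actual row count differs from the expected one, else Received
df['State'] = np.where(df['Count'] != df['MArks_Received'], 'PENDING', 'Received')

print(df)
```

With this data, only the single row for student 224 should come out as PENDING, since 2 rows are expected and 1 is present.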

If you don't want to add a new column, use the following:

df['State'] = np.where(df.groupby('Student ID')['Student ID'].transform('count') != df['MArks_Received'], 'PENDING','Received')

This marks as PENDING the rows where the per-student count of 'Student ID' doesn't match the expected number of rows (the 'MArks_Received' column, shown as 'Expected_Rows' in the desired output).
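
If the State is wanted on the aggregated frame from the question (one row per student) rather than on every row, one possible sketch is to aggregate the total marks, the expected row count, and the actual row count in a single pass and then compare the last two. The output column names Total_Marks, Expected and Count are hypothetical, chosen here for illustration, and the sample DataFrame is again rebuilt by hand from the question.

```python
import numpy as np
import pandas as pd

# Sample data from the question, rebuilt by hand for illustration
df_students = pd.DataFrame({
    'Student ID': [222, 222, 222, 223, 223, 224],
    'Subjects': ['English', 'Maths', 'Science', 'English', 'Maths', 'Maths'],
    'MArks_Received': [3, 3, 3, 2, 2, 2],
    'Marks': [90, 80, 70, 90, 80, 80],
})

# One row per student: total marks, expected row count, actual row count
# (Total_Marks, Expected and Count are hypothetical names)
df_aggregate = (
    df_students.groupby('Student ID')
    .agg(
        Total_Marks=('Marks', 'sum'),
        Expected=('MArks_Received', 'first'),
        Count=('Subjects', 'size'),
    )
    .reset_index()
)

# Same comparison as in the row-level version, applied per student
df_aggregate['State'] = np.where(
    df_aggregate['Count'] != df_aggregate['Expected'], 'PENDING', 'Received'
)

print(df_aggregate)
```

This assumes the expected count is the same on every row of a given student, which is why taking the first value per group is enough.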

Solution 1
0 Accepted 2020-02-10 14:03:54
