Python Pandas: new dataframe column with group by and condition

Question

I have a Pandas dataframe that looks as follows.

player  count1  count2
A       1       1
A       2       1
A       3       1
A       4       2
A       5       2
B       1       1
B       2       2
B       3       2
B       4       2

Column player contains names, count1 is a cumulative sum, and count2 contains other counts.

I now want to create a new column that, for each player, contains the value of count1 from the first row where count2 equals 2.

Hence, the result should look like this:

player  count1  count2  new
A       1       1       4
A       2       1       4
A       3       1       4
A       4       2       4
A       5       2       4
B       1       1       2
B       2       2       2
B       3       2       2
B       4       2       2

I tried to do it with transform, but I cannot figure out how to combine it with the condition on the count2 column (while taking the value from the count1 column).

Without the groupby it works like this, but I don't know where and how to add the groupby:

df['new'] = df.loc[df['count2'] == 2, 'count1'].min()

Answer 1

Use map with a Series:

s = df[df['count2'] == 2].drop_duplicates(['player']).set_index('player')['count1']

df['new'] = df['player'].map(s)
print (df)
  player  count1  count2  new
0      A       1       1    4
1      A       2       1    4
2      A       3       1    4
3      A       4       2    4
4      A       5       2    4
5      B       1       1    2
6      B       2       2    2
7      B       3       2    2
8      B       4       2    2
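
For readers who want to run this end to end, here is a minimal self-contained sketch. The DataFrame is rebuilt from the sample data in the question; the variable name first_hit is only illustrative and does not appear in the original answer.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "player": list("AAAAABBBB"),
    "count1": [1, 2, 3, 4, 5, 1, 2, 3, 4],
    "count2": [1, 1, 1, 2, 2, 1, 2, 2, 2],
})

# For each player, keep the first row where count2 == 2 and
# index the resulting count1 values by player name.
first_hit = (
    df[df["count2"] == 2]
    .drop_duplicates("player")
    .set_index("player")["count1"]
)

# Look up every row's player in that Series.
df["new"] = df["player"].map(first_hit)
print(df)
```

Note that this relies on the rows already being in the desired order, since drop_duplicates keeps the first occurrence it sees. A player with no row where count2 equals 2 would get NaN from map.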

Details:

First, keep only the rows where count2 equals 2, using boolean indexing:

print (df[df['count2'] == 2])
  player  count1  count2
3      A       4       2
4      A       5       2
6      B       2       2
7      B       3       2
8      B       4       2

Then drop the duplicate rows per player with drop_duplicates, which keeps the first row for each player:

print (df[df['count2'] == 2].drop_duplicates(['player']))
  player  count1  count2
3      A       4       2
6      B       2       2
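
Since the question asked specifically about groupby and transform, the same result can probably also be reached that way. The following is a sketch of an alternative, not part of the original answer: it masks count1 wherever count2 is not 2 and then broadcasts the first non-missing value per player. The variable name masked is illustrative.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "player": list("AAAAABBBB"),
    "count1": [1, 2, 3, 4, 5, 1, 2, 3, 4],
    "count2": [1, 1, 1, 2, 2, 1, 2, 2, 2],
})

# Keep count1 only where count2 == 2; all other rows become NaN.
masked = df["count1"].where(df["count2"] == 2)

# "first" picks the first non-NaN value in each group, and
# transform broadcasts it back to every row of that group.
df["new"] = masked.groupby(df["player"]).transform("first")
print(df)
```

Because the intermediate Series contains NaN, the new column comes out as float (4.0 and 2.0) rather than integer. It can be cast back with astype(int) as long as every player has at least one row where count2 equals 2.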

Answer 1 (3, accepted, 2018-07-29 13:15:07)
