Python Pandas new dataframe column with group by and condition
I have a Pandas dataframe that looks as follows.
player count1 count2
A 1 1
A 2 1
A 3 1
A 4 2
A 5 2
B 1 1
B 2 2
B 3 2
B 4 2
Column player contains names, count1 is a cumulative sum, and column count2 contains other counts.
For each player, I now want to create a new column that contains the value of count1 at the row where the column count2 first contains the value 2.
Hence, the result should look like this:
player count1 count2 new
A 1 1 4
A 2 1 4
A 3 1 4
A 4 2 4
A 5 2 4
B 1 1 2
B 2 2 2
B 3 2 2
B 4 2 2
I tried to do it with transform, but I cannot figure out how to combine it with the condition based on the count2 column (and then taking the value of the count1 column).
Without the groupby it works like this, but I don't know where and how to add the groupby:
df['new'] = df.loc[df['count2'] == 2, 'count1'].min()
Use map with a Series:
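# count1 at the first row per player where count2 == 2, indexed by player
s = df[df['count2'] == 2].drop_duplicates(['player']).set_index('player')['count1']
# look up each row's player in that Series
df['new'] = df['player'].map(s)
print (df)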
player count1 count2 new
0 A 1 1 4
1 A 2 1 4
2 A 3 1 4
3 A 4 2 4
4 A 5 2 4
5 B 1 1 2
6 B 2 2 2
7 B 3 2 2
8 B 4 2 2
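For anyone who wants to run this end to end, here is a minimal self-contained sketch. The DataFrame construction is an assumption reconstructed from the table in the question, and the name first_hit is only illustrative; the lookup itself is the same map approach as above.

```python
import pandas as pd

# Sample data rebuilt from the table in the question (assumed construction).
df = pd.DataFrame({
    "player": list("AAAAABBBB"),
    "count1": [1, 2, 3, 4, 5, 1, 2, 3, 4],
    "count2": [1, 1, 1, 2, 2, 1, 2, 2, 2],
})

# Keep rows where count2 == 2, then only the first such row per player,
# and turn the result into a player -> count1 lookup Series.
first_hit = (
    df[df["count2"] == 2]
    .drop_duplicates("player")
    .set_index("player")["count1"]
)

# Broadcast the looked-up value back to every row of the same player.
df["new"] = df["player"].map(first_hit)
print(df)
```

A player that never reaches count2 == 2 has no entry in first_hit, so map would leave NaN in new for that player's rows.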
Detail:
First, keep only the rows where count2 equals 2 by boolean indexing:
print (df[df['count2'] == 2])
player count1 count2
3 A 4 2
4 A 5 2
6 B 2 2
7 B 3 2
8 B 4 2
Then remove duplicates in the player column with drop_duplicates, which by default keeps the first row for each player:
print (df[df['count2'] == 2].drop_duplicates(['player']))
player count1 count2
3 A 4 2
6 B 2 2
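Since the question asks specifically about groupby and transform, one possible alternative is sketched below. It is not part of the answer above: it masks count1 wherever count2 is not 2 and then takes the first non-missing value per player. The sample frame is again an assumed reconstruction of the question's table, and masked is an illustrative name.

```python
import pandas as pd

# Sample data rebuilt from the table in the question (assumed construction).
df = pd.DataFrame({
    "player": list("AAAAABBBB"),
    "count1": [1, 2, 3, 4, 5, 1, 2, 3, 4],
    "count2": [1, 1, 1, 2, 2, 1, 2, 2, 2],
})

# Replace count1 with NaN on every row where count2 is not 2.
masked = df["count1"].where(df["count2"].eq(2))

# "first" returns the first non-missing value in each group, and
# transform broadcasts it back to all rows of that group.
df["new"] = masked.groupby(df["player"]).transform("first")
print(df)
```

Because the mask introduces NaN, the new column comes out as float here; the map approach keeps integers as long as every player has a match.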