Python Pandas: Sort and group by, then sum two consecutive rows of 2nd column for a specific value of a 3rd column

I have this dataframe:

    Group   Turn    Name
0   G1       1      Maria
1   G1       2      Sam
2   G1       2      Sara
3   G1       3      Maria
4   G1       4      Mark
5   G1       5      Maria

6   G2       2      Maria
7   G2       1      Ahmad

8   G3       1      Maria
9   G3       2      David

I would like to group my data by the value of the "Group" column and sort each group by "Turn", so that within each group the turns are in order.

Then, within each group, I would like to sum the "Turn" value of every row where the name is "Maria" with the "Turn" value of the row right after it. If Maria has the last turn in the group, the sum is just Maria's own turn.

So the result looks like this:
    Group       Name    Sum 
        G1      Maria    3
        G1      Maria    7
        G1      Maria    5
        G2      Maria    2
        G3      Maria    3

I tried groupby, apply, and shift, but none of them gives me the final result I am looking for.

 df = df.groupby('Group').apply(lambda x: x.sort_values('Turn'))

Can somebody help me?

Use:

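# Label each row with a running count of 'Maria' rows, so every 'Maria'
# row and the rows that follow it share one 'Occurance' id.
# groupby(level=...).sum() replaces DataFrame.sum(level=...),
# which was removed in pandas 2.0.
(df.set_index(['Group', 'Name', (df['Name'] == 'Maria').cumsum().rename('Occurance')])
   .groupby(level=[0, 2]).sum()
   .reset_index()
   .assign(name='Maria')
   .drop('Occurance', axis=1))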

Output:

  Group  Turn   name
0    G1     3  Maria
1    G1     7  Maria
2    G1     5  Maria
3    G2     3  Maria
4    G3     3  Maria
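
To make the grouping key in the snippet above easier to follow, here is a small sketch that prints the block ids produced by the cumulative sum. It assumes the dataframe is built exactly as shown in the question; the variable name `occurrence` is only for illustration.

```python
import pandas as pd

# Data copied from the question (assumed to be in this row order).
df = pd.DataFrame({
    "Group": ["G1"] * 6 + ["G2"] * 2 + ["G3"] * 2,
    "Turn": [1, 2, 2, 3, 4, 5, 2, 1, 1, 2],
    "Name": ["Maria", "Sam", "Sara", "Maria", "Mark", "Maria",
             "Maria", "Ahmad", "Maria", "David"],
})

# Each 'Maria' row bumps the counter, so it starts a new block id that
# is shared by all rows up to (not including) the next 'Maria'.
occurrence = (df["Name"] == "Maria").cumsum()
print(occurrence.tolist())  # [1, 1, 1, 2, 2, 3, 4, 4, 5, 5]
```

Because a block runs until the next 'Maria' row, it can hold more than one following row (block 1 here contains Maria, Sam, and Sara), and the ids depend on the row order of `df` at the time the key is computed.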

You can use ffill with limit:

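df = df.sort_values(['Group', 'Turn'])
# Keep each 'Maria' row and the row after it per group; sum(level=0) was removed in pandas 2.0.
mask = df.Name.where(df.Name == 'Maria').groupby(df['Group']).ffill(limit=1).eq('Maria')
df[mask].set_index('Group').Turn.groupby(level=0).sum()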
Out[272]: 
Group
G1    5
G2    3
G3    3
Name: Turn, dtype: int64
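
For comparison with the expected table in the question (one row per "Maria", summing her turn with only the next row's turn), here is a minimal sketch using a per-group `shift(-1)`. It is not taken from either answer; it assumes the data exactly as posted in the question, and the names `next_turn` and `result` are only for illustration.

```python
import pandas as pd

# Data copied from the question.
df = pd.DataFrame({
    "Group": ["G1"] * 6 + ["G2"] * 2 + ["G3"] * 2,
    "Turn": [1, 2, 2, 3, 4, 5, 2, 1, 1, 2],
    "Name": ["Maria", "Sam", "Sara", "Maria", "Mark", "Maria",
             "Maria", "Ahmad", "Maria", "David"],
})

# Sort so that turns are in order within each group.
df = df.sort_values(["Group", "Turn"])

# Turn of the next row in the same group; 0 when Maria is last in the group.
next_turn = df.groupby("Group")["Turn"].shift(-1).fillna(0).astype(int)

# Add the two turns, then keep only the 'Maria' rows.
result = (
    df.assign(Sum=df["Turn"] + next_turn)
      .loc[df["Name"] == "Maria", ["Group", "Name", "Sum"]]
      .reset_index(drop=True)
)
print(result["Sum"].tolist())  # [3, 7, 5, 2, 3]
```

On this data the sums come out as 3, 7, 5 for G1, 2 for G2, and 3 for G3, which matches the table in the question. Shifting within the group, rather than over the whole frame, is what stops a "Maria" in the last row of one group from picking up the first turn of the next group.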
