How do I use Python to retrieve the last row value of a column for each pandas group-by group?

I have the dataset df. I would like to add a new column that holds the last stage for each name.

Name     Stage     stage_number
a        Open          1
a        Paid          2
a        Transit       3
a        Wait          4
a        Complete      5
b        Open          1
b        Paid          2
b        Transit       3
b        Wait          4
b        Canceled      5

Expected output:

Name     Stage     stage_number   Last_Stage
a        Open          1           Complete
a        Paid          2           Complete
a        Transit       3           Complete
a        Wait          4           Complete
a        Complete      5           Complete
b        Open          1           Canceled
b        Paid          2           Canceled
b        Transit       3           Canceled
b        Wait          4           Canceled
b        Canceled      5           Canceled

I tried the code below but got an error:

def stage(df):
    for x in df['Name']:
        return df['Stage'].iloc[-1]

df['last_stage'] = df.apply(stage, axis = 1)
df

My error:

AttributeError: 'str' object has no attribute 'iloc'

Does this work for you?

df["last_stage"] = df.groupby("Name")["Stage"].transform("last")

print(df)
  Name     Stage  stage_number last_stage
0    a      Open             1   Complete
1    a      Paid             2   Complete
2    a   Transit             3   Complete
3    a      Wait             4   Complete
4    a  Complete             5   Complete
5    b      Open             1   Canceled
6    b      Paid             2   Canceled
7    b   Transit             3   Canceled
8    b      Wait             4   Canceled
9    b  Canceled             5   Canceled
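
For anyone who wants to try this end to end, here is a minimal sketch. It rebuilds the sample data by hand (the frame is assumed to match the table in the question) and applies the same transform. As far as I know, "last" picks the last non-null value in each group in row order, so the second variant, which sorts by stage_number first, is only an illustration for the case where the rows might not already be in stage order.

```python
import pandas as pd

# Sample data rebuilt from the question's table (assumed to match it)
df = pd.DataFrame({
    "Name": ["a"] * 5 + ["b"] * 5,
    "Stage": ["Open", "Paid", "Transit", "Wait", "Complete",
              "Open", "Paid", "Transit", "Wait", "Canceled"],
    "stage_number": [1, 2, 3, 4, 5] * 2,
})

# transform returns one value per original row, aligned to df's index,
# so the result can be assigned straight back as a new column
df["last_stage"] = df.groupby("Name")["Stage"].transform("last")

# Variant for data that may not be in stage order: sort first, then take
# the last Stage per Name; the result still carries the original index
last_by_number = (
    df.sort_values("stage_number")
      .groupby("Name")["Stage"]
      .transform("last")
)

print(df)
```

Because the result of transform keeps the original index, assigning last_by_number to a column of df would line up correctly even though it was computed on a sorted copy.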

Cameron's solution is better, but if you really want to use your own function, you can do it like this:

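def stage(stages):
    # stages holds the 'Stage' values of a single Name group;
    # the returned scalar is broadcast to every row of that group
    return stages.iloc[-1]

# Apply per group, not per row: df.apply(..., axis=1) passes single rows
df['last_stage'] = df.groupby('Name')['Stage'].transform(stage)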
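
On the AttributeError in the question: with axis = 1, df.apply hands the function one row at a time, so indexing that row with 'Stage' gives a plain string, and a string has no .iloc. The sketch below uses a small made-up frame (not the question's full data) to show what a row-wise apply sees, followed by one possible alternative that builds a Name to last Stage lookup and maps it back onto the rows.

```python
import pandas as pd

# Small illustrative frame (hypothetical, shorter than the question's data)
df = pd.DataFrame({
    "Name": ["a", "a", "b"],
    "Stage": ["Open", "Complete", "Canceled"],
})

# With axis=1 each call receives a single row, so row["Stage"] is a str
seen_types = df.apply(lambda row: type(row["Stage"]).__name__, axis=1)
print(seen_types.tolist())

# Alternative: one last Stage per Name, then map it back onto every row
last_by_name = df.groupby("Name")["Stage"].last()
df["last_stage"] = df["Name"].map(last_by_name)
print(df)
```

The map approach gives the same column as transform("last") here; it is mainly useful when you also want the per-name lookup (last_by_name) for something else.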
