I have the dataset df. I would like to get the last stage for each name as a new column.
Name Stage stage_number
a Open 1
a Paid 2
a Transit 3
a Wait 4
a Complete 5
b Open 1
b Paid 2
b Transit 3
b Wait 4
b Canceled 5
Expected Output:
Name Stage stage_number Last_Stage
a Open 1 Complete
a Paid 2 Complete
a Transit 3 Complete
a Wait 4 Complete
a Complete 5 Complete
b Open 1 Canceled
b Paid 2 Canceled
b Transit 3 Canceled
b Wait 4 Canceled
b Canceled 5 Canceled
I tried the code below but got an error:
def stage(df):
    for x in df['Name']:
        return df['Stage'].iloc[-1]

df['last_stage'] = df.apply(stage, axis = 1)
df
The error:
AttributeError: 'str' object has no attribute 'iloc'
Does this work for you?
df["last_stage"] = df.groupby("Name")["Stage"].transform("last")
print(df)
Name Stage stage_number last_stage
0 a Open 1 Complete
1 a Paid 2 Complete
2 a Transit 3 Complete
3 a Wait 4 Complete
4 a Complete 5 Complete
5 b Open 1 Canceled
6 b Paid 2 Canceled
7 b Transit 3 Canceled
8 b Wait 4 Canceled
9 b Canceled 5 Canceled
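For reference, here is a self-contained sketch that rebuilds the sample data from the question and applies the same groupby/transform call. The sort step is an added assumption, not part of the answer above: it only matters if the rows for a name might not already be in stage order, in which case it makes "last" mean the row with the highest stage_number.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "Name": ["a"] * 5 + ["b"] * 5,
    "Stage": ["Open", "Paid", "Transit", "Wait", "Complete",
              "Open", "Paid", "Transit", "Wait", "Canceled"],
    "stage_number": [1, 2, 3, 4, 5] * 2,
})

# Assumption: order rows so "last" is the highest stage_number per name
df = df.sort_values(["Name", "stage_number"])

# transform("last") broadcasts each group's last Stage back to every row
df["last_stage"] = df.groupby("Name")["Stage"].transform("last")
print(df)
```

Because transform returns a result aligned to the original index, the new column can be assigned directly without a merge.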
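Cameron's solution is better, but if you really want to use a function of your own, you can loop over the groups, record the last stage for each name, and map that back onto the Name column:
def stage(df):
    last_stages = {}
    for name, group in df.groupby('Name'):
        # the last row of each group holds that name's final stage
        last_stages[name] = group['Stage'].iloc[-1]
    return df['Name'].map(last_stages)

df['last_stage'] = stage(df)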
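As for the AttributeError in the question: with axis=1, df.apply calls the function once per row and passes that row as a Series, so indexing it with 'Stage' gives a single string rather than a column, and a string has no .iloc. A minimal sketch that shows this, using a two-row stand-in for the question's data (the inspect helper and seen_types list are hypothetical names used only for illustration):

```python
import pandas as pd

# Two-row stand-in for the question's data (illustrative only)
df = pd.DataFrame({"Name": ["a", "b"], "Stage": ["Open", "Paid"]})

seen_types = []

def inspect(row):
    # With axis=1, `row` is one row as a Series; row["Stage"] is a plain str
    seen_types.append((type(row).__name__, type(row["Stage"]).__name__))
    return row["Stage"]

result = df.apply(inspect, axis=1)
print(seen_types)
```

Every recorded pair should show a Series for the row and a str for the Stage value, which is why the per-row function never sees the whole column.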