Creating a new column based on if-elif-else condition in python pandas
If-elif-else combined with group-by to create new column
I have the first 4 columns, and I want to create the 5th column (marked with *):
user date visit_num total_visits_user *last_cust_visit*
1 1995-10-01 1 2 1995-10-02
1 1995-10-02 2 2 1995-10-02
2 1995-10-01 1 3 1995-10-03
2 1995-10-02 2 3 1995-10-03
2 1995-10-03 3 3 1995-10-03
3 1995-10-01 1 5 1995-10-05
3 1995-10-02 2 5 1995-10-05
3 1995-10-03 3 5 1995-10-05
3 1995-10-04 4 5 1995-10-05
3 1995-10-05 5 5 1995-10-05
4 1995-10-03 1 2 1995-10-04
4 1995-10-04 2 2 1995-10-04
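*last_cust_visit is the new column; it shows the date of each customer's last visit.
I tried combining if, elif, else with groupby, but unfortunately I couldn't get it to work.
Any help would be greatly appreciated. Thanks.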
You can group by user to get the maximum date per user, then merge that back into the original DataFrame:
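# Per-user max date, merged back on 'user'. With suffixes=('_', '') the original
# column becomes 'date_' and the merged-in maximum keeps the plain name 'date'.
df['last_cust_visit'] = df.merge(
    df.groupby('user')['date'].max().reset_index(),
    on='user', suffixes=('_', '')
)['date']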
This gives the expected result:
user date visit_num total_visits_user last_cust_visit
0 1 1995-10-01 1 2 1995-10-02
1 1 1995-10-02 2 2 1995-10-02
2 2 1995-10-01 1 3 1995-10-03
3 2 1995-10-02 2 3 1995-10-03
4 2 1995-10-03 3 3 1995-10-03
5 3 1995-10-01 1 5 1995-10-05
6 3 1995-10-02 2 5 1995-10-05
7 3 1995-10-03 3 5 1995-10-05
8 3 1995-10-04 4 5 1995-10-05
9 3 1995-10-05 5 5 1995-10-05
10 4 1995-10-03 1 2 1995-10-04
11 4 1995-10-04 2 2 1995-10-04
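For anyone who wants to run the merge approach end to end, here is a minimal self-contained sketch. It rebuilds the sample data from the question (assuming the date column is parsed as datetime, which the question does not state) and gives the aggregated column its final name before merging, so no suffixes are needed. The names last and merged are illustrative only. It returns a new DataFrame instead of assigning into df, which avoids relying on df having a default 0..n-1 index.

```python
import pandas as pd

# Sample data reconstructed from the question; datetime dtype is an assumption.
df = pd.DataFrame({
    "user": [1, 1, 2, 2, 2, 3, 3, 3, 3, 3, 4, 4],
    "date": pd.to_datetime([
        "1995-10-01", "1995-10-02",
        "1995-10-01", "1995-10-02", "1995-10-03",
        "1995-10-01", "1995-10-02", "1995-10-03", "1995-10-04", "1995-10-05",
        "1995-10-03", "1995-10-04",
    ]),
    "visit_num": [1, 2, 1, 2, 3, 1, 2, 3, 4, 5, 1, 2],
    "total_visits_user": [2, 2, 3, 3, 3, 5, 5, 5, 5, 5, 2, 2],
})

# One row per user holding that user's latest date, already under its final name.
last = df.groupby("user")["date"].max().rename("last_cust_visit").reset_index()

# A left join keeps every row of df, in its original order.
merged = df.merge(last, on="user", how="left")
print(merged)
```

Because the right-hand side has exactly one row per user, the left join cannot duplicate rows, so merged has the same length as df.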
A simpler approach is to use the transform method on the groupby object:
df["last_cust_visit"] = df.groupby("user")["date"].transform('max')
With transform, the result has the same number of rows as df, so it can be assigned directly as a new column:
user date visit_num total_visits_user last_cust_visit
0 1 1995-10-01 1 2 1995-10-02
1 1 1995-10-02 2 2 1995-10-02
2 2 1995-10-01 1 3 1995-10-03
3 2 1995-10-02 2 3 1995-10-03
4 2 1995-10-03 3 3 1995-10-03
5 3 1995-10-01 1 5 1995-10-05
6 3 1995-10-02 2 5 1995-10-05
7 3 1995-10-03 3 5 1995-10-05
8 3 1995-10-04 4 5 1995-10-05
9 3 1995-10-05 5 5 1995-10-05
10 4 1995-10-03 1 2 1995-10-04
11 4 1995-10-04 2 2 1995-10-04
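As a small check of why the transform result can be assigned directly, the sketch below uses a reduced version of the sample data with a deliberately non-default index (the index values 10 to 14 are made up for illustration, and datetime dtype is again an assumption). The transformed Series carries the same index as df, so the assignment lines up row by row.

```python
import pandas as pd

# Reduced sample: users 1 and 2 only, with an arbitrary non-default index.
df = pd.DataFrame(
    {
        "user": [1, 1, 2, 2, 2],
        "date": pd.to_datetime([
            "1995-10-01", "1995-10-02",
            "1995-10-01", "1995-10-02", "1995-10-03",
        ]),
    },
    index=[10, 11, 12, 13, 14],
)

# transform('max') broadcasts each group's maximum back to every row of that group.
last_visit = df.groupby("user")["date"].transform("max")
df["last_cust_visit"] = last_visit
print(df)
```

No if/elif/else is needed here: the per-user logic is expressed by the grouping itself, and the aggregation is applied once per group.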