Python: Pandas how to add a column to the duplicated values of a dataframe that are in ascending order?
I have the following dataframe:
name date
test 2022-03-04
test 2022-03-05
test 2022-03-06
test 2022-03-17
test 2022-03-18
test 2022-03-21
test2 2022-03-04
test2 2022-03-05
test2 2022-03-15
test2 2022-03-19
test2 2022-03-21
test2 2022-04-16
test3 2022-03-14
test3 2022-03-15
test3 2022-03-23
test3 2022-03-27
test4 2022-03-20
test4 2022-04-15
test4 2022-04-17
test5 2022-03-01
test5 2022-03-04
test5 2022-03-06
test5 2022-03-12
test5 2022-04-04
test5 2022-04-10
test5 2022-04-14
test5 2022-05-04
test6 2022-03-05
test6 2022-03-15
test6 2022-06-20
test6 2022-06-24
test6 2022-06-27
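How can I add a column old_data with the value yes for the old records of the duplicated
(name, date) combinations, where a name has more than 3 rows? The date column is sorted in ascending order.
I want to produce this output: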
name date old_data
test 2022-03-04 yes
test 2022-03-05 yes
test 2022-03-06 yes
test 2022-03-17
test 2022-03-18
test 2022-03-21
test2 2022-03-04 yes
test2 2022-03-05 yes
test2 2022-03-15 yes
test2 2022-03-19
test2 2022-03-21
test2 2022-04-16
test3 2022-03-14 yes
test3 2022-03-15
test3 2022-03-23
test3 2022-03-27
test4 2022-03-20
test4 2022-04-15
test4 2022-04-17
test5 2022-03-01 yes
test5 2022-03-04 yes
test5 2022-03-06 yes
test5 2022-03-12 yes
test5 2022-04-04 yes
test5 2022-04-10
test5 2022-04-14
test5 2022-05-04
test6 2022-03-05 yes
test6 2022-03-15 yes
test6 2022-06-20
test6 2022-06-24
test6 2022-06-27
Here is my attempt:
df['old_data'] = np.where(df.groupby('name').cumcount().ge(4), 'yes', '')
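Use GroupBy.cumcount with ascending=False so that the counter runs in descending order within each group, and test for greater than or equal to 3 instead of 4: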
df['old_data'] = np.where(df.groupby('name').cumcount(ascending=False).ge(3), 'yes', '')
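To see why the descending counter works, here is a minimal sketch on a two-group subset of the question's data (test3 with 4 rows, test4 with 3). The variable name countdown is only illustrative. It assumes the rows are already sorted by date within each name, because cumcount looks at row position and not at the date values.

```python
import numpy as np
import pandas as pd

# Subset of the question's data: test3 has 4 rows, test4 has 3.
df = pd.DataFrame({
    "name": ["test3"] * 4 + ["test4"] * 3,
    "date": ["2022-03-14", "2022-03-15", "2022-03-23", "2022-03-27",
             "2022-03-20", "2022-04-15", "2022-04-17"],
})

# Counts down within each group, so the last row of every group gets 0.
countdown = df.groupby("name").cumcount(ascending=False)

# A row is "old" when at least three later rows follow it in the same group.
df["old_data"] = np.where(countdown.ge(3), "yes", "")
print(df)
```

Only the first test3 row is flagged. test4 has exactly three rows, so none of them has three later rows and nothing is marked. If the frame were not sorted, calling df.sort_values(['name', 'date']) first would be one way to establish the order.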
Another idea with GroupBy.rank:
df['date'] = pd.to_datetime(df['date'])
# Dense rank per name with the newest date ranked 1; ranks above 3 are the older dates
m = df.groupby('name')['date'].rank(method='dense', ascending=False).gt(3)
df['old_data'] = np.where(m, 'yes', '')
print(df)
name date old_data
0 test 2022-03-04 yes
1 test 2022-03-05 yes
2 test 2022-03-06 yes
3 test 2022-03-17
4 test 2022-03-18
5 test 2022-03-21
6 test2 2022-03-04 yes
7 test2 2022-03-05 yes
8 test2 2022-03-15 yes
9 test2 2022-03-19
10 test2 2022-03-21
11 test2 2022-04-16
12 test3 2022-03-14 yes
13 test3 2022-03-15
14 test3 2022-03-23
15 test3 2022-03-27
16 test4 2022-03-20
17 test4 2022-04-15
18 test4 2022-04-17
19 test5 2022-03-01 yes
20 test5 2022-03-04 yes
21 test5 2022-03-06 yes
22 test5 2022-03-12 yes
23 test5 2022-04-04 yes
24 test5 2022-04-10
25 test5 2022-04-14
26 test5 2022-05-04
27 test6 2022-03-05 yes
28 test6 2022-03-15 yes
29 test6 2022-06-20
30 test6 2022-06-24
31 test6 2022-06-27
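Both approaches give the same result on the data above, because no name has a repeated date. They can differ when a date occurs more than once within a name. The sketch below uses made-up data (a single hypothetical group a with one repeated date) to show the difference: cumcount keeps the last three rows unmarked, while the dense rank keeps the last three distinct dates unmarked.

```python
import pandas as pd

# Hypothetical data, not from the question: 2022-03-02 appears twice.
df = pd.DataFrame({
    "name": ["a"] * 5,
    "date": pd.to_datetime(["2022-03-01", "2022-03-02", "2022-03-02",
                            "2022-03-05", "2022-03-09"]),
})

# Position-based: flags every row that has at least three rows after it.
by_position = df.groupby("name").cumcount(ascending=False).ge(3)

# Value-based: equal dates share a rank, so only dates older than the
# three most recent distinct dates are flagged.
by_date = df.groupby("name")["date"].rank(method="dense", ascending=False).gt(3)

print(pd.DataFrame({"date": df["date"],
                    "by_position": by_position,
                    "by_date": by_date}))
```

Here by_position flags the first two rows, while by_date flags only the first. Which one is appropriate depends on whether "the latest three" means rows or distinct dates.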