Taking the min value of the last N days
I have this data frame:
ID Date X 123_Var 456_Var 789_Var
A 16-07-19 3 777 250 810
A 17-07-19 9 637 121 529
A 20-07-19 2 295 272 490
A 21-07-19 3 778 600 544
A 22-07-19 6 741 792 907
A 25-07-19 6 435 416 820
A 26-07-19 8 590 455 342
A 27-07-19 6 763 476 753
A 02-08-19 6 717 211 454
A 03-08-19 6 152 442 475
A 05-08-19 6 564 340 302
A 07-08-19 6 105 929 633
A 08-08-19 6 948 366 586
B 07-08-19 4 509 690 406
B 08-08-19 2 413 725 414
B 12-08-19 2 170 702 912
B 13-08-19 3 851 616 477
B 14-08-19 9 475 447 555
B 15-08-19 1 412 403 708
B 17-08-19 2 299 537 321
B 18-08-19 4 310 119 125
I want to show the min value of the last n days (say, n = 4), based on the Date column, excluding the value of the current day.
A similar solution was provided by jezrael. (That one calculates the mean, not the min.)
Expected result:
ID Date X 123_Var 456_Var 789_Var 123_Var_4 456_Var_4 789_Var_4
A 16-07-19 3 777 250 810 NaN NaN NaN
A 17-07-19 9 637 121 529 777.0 250.0 810.0
A 20-07-19 2 295 272 490 637.0 121.0 529.0
A 21-07-19 3 778 600 544 295.0 121.0 490.0
A 22-07-19 6 741 792 907 295.0 272.0 490.0
A 25-07-19 6 435 416 820 741.0 600.0 544.0
A 26-07-19 8 590 455 342 435.0 416.0 820.0
A 27-07-19 6 763 476 753 435.0 416.0 342.0
A 02-08-19 6 717 211 454 NaN NaN NaN
A 03-08-19 6 152 442 475 717.0 211.0 454.0
A 05-08-19 6 564 340 302 152.0 211.0 454.0
A 07-08-19 6 105 929 633 152.0 340.0 302.0
A 08-08-19 6 948 366 586 105.0 340.0 302.0
B 07-08-19 4 509 690 406 NaN NaN NaN
B 08-08-19 2 413 725 414 509.0 690.0 406.0
B 12-08-19 2 170 702 912 413.0 725.0 414.0
B 13-08-19 3 851 616 477 170.0 702.0 414.0
B 14-08-19 9 475 447 555 170.0 616.0 477.0
B 15-08-19 1 412 403 708 170.0 447.0 477.0
B 17-08-19 2 299 537 321 412.0 403.0 477.0
B 18-08-19 4 310 119 125 299.0 403.0 321.0
Use a solution similar to @Chris's, with a custom lambda function in GroupBy.apply, and finally join back to the original with DataFrame.join:
import pandas as pd

df['Date'] = pd.to_datetime(df['Date'], dayfirst=True)

n = 4
cols = df.filter(regex='Var').columns
# reindex each group to daily frequency, then take the rolling min over a
# time-based window; closed="left" keeps the current day out of the window
f = lambda x: x.asfreq('d').rolling(window=f'{n+1}D', closed="left")[cols].min()
df2 = (df.set_index('Date')
         .groupby('ID').apply(f)
         .add_suffix(f'_{n}'))
df = df.join(df2, on=['ID','Date'])
print(df)
ID Date X 123_Var 456_Var 789_Var 123_Var_4 456_Var_4 \
0 A 2019-07-16 3 777 250 810 NaN NaN
1 A 2019-07-17 9 637 121 529 777.0 250.0
2 A 2019-07-20 2 295 272 490 637.0 121.0
3 A 2019-07-21 3 778 600 544 295.0 121.0
4 A 2019-07-22 6 741 792 907 295.0 121.0
5 A 2019-07-25 6 435 416 820 295.0 272.0
6 A 2019-07-26 8 590 455 342 435.0 416.0
7 A 2019-07-27 6 763 476 753 435.0 416.0
8 A 2019-08-02 6 717 211 454 NaN NaN
9 A 2019-08-03 6 152 442 475 717.0 211.0
10 A 2019-08-05 6 564 340 302 152.0 211.0
11 A 2019-08-07 6 105 929 633 152.0 211.0
12 A 2019-08-08 6 948 366 586 105.0 340.0
13 B 2019-08-07 4 509 690 406 NaN NaN
14 B 2019-08-08 2 413 725 414 509.0 690.0
15 B 2019-08-12 2 170 702 912 413.0 690.0
16 B 2019-08-13 3 851 616 477 170.0 702.0
17 B 2019-08-14 9 475 447 555 170.0 616.0
18 B 2019-08-15 1 412 403 708 170.0 447.0
19 B 2019-08-17 2 299 537 321 170.0 403.0
20 B 2019-08-18 4 310 119 125 299.0 403.0
789_Var_4
0 NaN
1 810.0
2 529.0
3 490.0
4 490.0
5 490.0
6 544.0
7 342.0
8 NaN
9 454.0
10 454.0
11 302.0
12 302.0
13 NaN
14 406.0
15 406.0
16 414.0
17 477.0
18 477.0
19 477.0
20 321.0
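The key detail in both approaches is `closed="left"` on the time-based rolling window, which excludes the current row so only strictly earlier days are aggregated. A minimal sketch with a made-up three-day series to illustrate:

```python
import pandas as pd

s = pd.Series([10, 20, 30],
              index=pd.to_datetime(['2019-07-16', '2019-07-17', '2019-07-18']))

# default closed='right' uses the window (t-2D, t], so the current day is included
print(s.rolling('2D').min().tolist())                  # [10.0, 10.0, 20.0]

# closed='left' uses [t-2D, t): only strictly earlier rows, so the first value is NaN
print(s.rolling('2D', closed='left').min().tolist())   # [nan, 10.0, 10.0]
```

This is why the first date of each ID group gets NaN in the expected result: its left-closed window is empty.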
One way, using groupby and rolling:
import pandas as pd

df['Date'] = pd.to_datetime(df['Date'], dayfirst=True)

dfs = []
for k, d in df.set_index('Date').groupby('ID'):
    # reindex to daily frequency so the 4-day window counts calendar days;
    # closed="left" excludes the current day from the rolling min
    tmp = (d.asfreq('1 D').filter(like='_Var')
            .rolling('4D', closed="left").min().add_suffix("_4"))
    dfs.append(d.merge(tmp, left_index=True, right_index=True))
new_df = pd.concat(dfs).reset_index()
print(new_df)
Output:
Date ID X 123_Var 456_Var 789_Var 123_Var_4 456_Var_4 789_Var_4
0 2019-07-16 A 3 777 250 810 NaN NaN NaN
1 2019-07-17 A 9 637 121 529 777.0 250.0 810.0
2 2019-07-20 A 2 295 272 490 637.0 121.0 529.0
3 2019-07-21 A 3 778 600 544 295.0 121.0 490.0
4 2019-07-22 A 6 741 792 907 295.0 272.0 490.0
5 2019-07-25 A 6 435 416 820 741.0 600.0 544.0
6 2019-07-26 A 8 590 455 342 435.0 416.0 820.0
7 2019-07-27 A 6 763 476 753 435.0 416.0 342.0
8 2019-08-02 A 6 717 211 454 NaN NaN NaN
9 2019-08-03 A 6 152 442 475 717.0 211.0 454.0
10 2019-08-05 A 6 564 340 302 152.0 211.0 454.0
11 2019-08-07 A 6 105 929 633 152.0 340.0 302.0
12 2019-08-08 A 6 948 366 586 105.0 340.0 302.0
13 2019-08-07 B 4 509 690 406 NaN NaN NaN
14 2019-08-08 B 2 413 725 414 509.0 690.0 406.0
15 2019-08-12 B 2 170 702 912 413.0 725.0 414.0
16 2019-08-13 B 3 851 616 477 170.0 702.0 912.0
17 2019-08-14 B 9 475 447 555 170.0 616.0 477.0
18 2019-08-15 B 1 412 403 708 170.0 447.0 477.0
19 2019-08-17 B 2 299 537 321 412.0 403.0 477.0
20 2019-08-18 B 4 310 119 125 299.0 403.0 321.0