Replacing the last n rows before the current row with the current row's value, per group, using Python
Category  Timestamp         Feat1  Feat2  Indicator
AA        01-02-2018 06:10  21     22
AA        02-02-2018 06:10  22     6
AA        03-02-2018 06:10  26     27
AA        07-02-2018 06:10  27     22
AA        08-02-2018 06:10  13     19
AA        09-02-2018 06:10  20     9      1
AA        10-02-2018 06:10  9      17
XX        04-02-2018 06:10  21     22
XX        05-02-2018 06:10  22     6
XX        06-02-2018 06:10  26     27
XX        07-02-2018 06:10  27     22     1
XX        08-02-2018 06:10  13     19
XX        09-02-2018 06:10  20     9
XX        10-02-2018 06:10  9      17
Required output (within each Category group, when the current row's Indicator equals 1, copy that value to the 3 rows before it):
Category  Timestamp         Feat1  Feat2  Indicator  Indicator (Required)
AA        01-02-2018 06:10  21     22
AA        02-02-2018 06:10  22     6
AA        03-02-2018 06:10  26     27                1
AA        07-02-2018 06:10  27     22                1
AA        08-02-2018 06:10  13     19                1
AA        09-02-2018 06:10  20     9      1          1
AA        10-02-2018 06:10  9      17
XX        04-02-2018 06:10  21     22                1
XX        05-02-2018 06:10  22     6                 1
XX        06-02-2018 06:10  26     27                1
XX        07-02-2018 06:10  27     22     1          1
XX        08-02-2018 06:10  13     19
XX        09-02-2018 06:10  20     9
XX        10-02-2018 06:10  9      17
Use GroupBy.bfill with the limit parameter:
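import numpy as np

# If necessary: turn empty strings into NaN so that bfill treats them as missing
df['Indicator'] = df['Indicator'].replace('', np.nan)
# Back-fill each 1 onto at most 3 preceding rows, without crossing Category groups
df['Indicator1'] = df.groupby('Category')['Indicator'].bfill(limit=3)
print(df)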
   Category         Timestamp  Feat1  Feat2  Indicator  Indicator1
0        AA  01-02-2018 06:10     21     22        NaN         NaN
1        AA  02-02-2018 06:10     22      6        NaN         NaN
2        AA  03-02-2018 06:10     26     27        NaN         1.0
3        AA  07-02-2018 06:10     27     22        NaN         1.0
4        AA  08-02-2018 06:10     13     19        NaN         1.0
5        AA  09-02-2018 06:10     20      9        1.0         1.0
6        AA  10-02-2018 06:10      9     17        NaN         NaN
7        XX  04-02-2018 06:10     21     22        NaN         1.0
8        XX  05-02-2018 06:10     22      6        NaN         1.0
9        XX  06-02-2018 06:10     26     27        NaN         1.0
10       XX  07-02-2018 06:10     27     22        1.0         1.0
11       XX  08-02-2018 06:10     13     19        NaN         NaN
12       XX  09-02-2018 06:10     20      9        NaN         NaN
13       XX  10-02-2018 06:10      9     17        NaN         NaN
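To try the grouped back-fill end to end, here is a minimal self-contained sketch. It rebuilds the sample data from the question by hand and assumes the missing Indicator values are already NaN (so the replace step is not needed). The variable name n is only illustrative; it is the number of preceding rows to fill.

```python
import numpy as np
import pandas as pd

# Sample data from the question; missing indicators are entered as NaN (assumption).
df = pd.DataFrame({
    "Category": ["AA"] * 7 + ["XX"] * 7,
    "Timestamp": [
        "01-02-2018 06:10", "02-02-2018 06:10", "03-02-2018 06:10",
        "07-02-2018 06:10", "08-02-2018 06:10", "09-02-2018 06:10",
        "10-02-2018 06:10",
        "04-02-2018 06:10", "05-02-2018 06:10", "06-02-2018 06:10",
        "07-02-2018 06:10", "08-02-2018 06:10", "09-02-2018 06:10",
        "10-02-2018 06:10",
    ],
    "Feat1": [21, 22, 26, 27, 13, 20, 9] * 2,
    "Feat2": [22, 6, 27, 22, 19, 9, 17] * 2,
    "Indicator": [np.nan] * 5 + [1, np.nan] + [np.nan] * 3 + [1] + [np.nan] * 3,
})

n = 3  # illustrative name: how many rows before each flagged row to fill

# bfill copies the next non-missing value upwards; limit=n stops after n rows,
# and the groupby keeps values from leaking across Category boundaries.
df["Indicator1"] = df.groupby("Category")["Indicator"].bfill(limit=n)

print(df[["Category", "Timestamp", "Indicator", "Indicator1"]])
```

Note that bfill follows the current row order, so if the rows are not already sorted by Timestamp within each Category, sort them first.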