How to calculate multiple row difference in pandas?
I want to calculate the difference between each row and each of the next 5 rows, take the maximum of those differences (ignoring NaN values), repeat this for every row in a pandas DataFrame, and store the results in a new column. I have tried the .shift(1) function and iterating over all the rows, but it seems very slow.
A'  B'  Output
AA  1   4
BB  2   3
CC  3   2
DD  4   1
EE  5   0
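import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3, 4, 5, 0, 5, 1, 4, 3, 2]})
col_n = []   # names of the helper difference columns
diff_r = 5   # how many following rows to compare against

for i in range(1, diff_r + 1):
    col_n.append('d_' + str(i))
    # value i rows ahead minus the current row's value
    df['d_' + str(i)] = df['a'].diff(i).shift(periods=-i)

# row-wise maxima; max() skips NaN by default
df['d_abs_max'] = df[col_n].abs().max(axis=1)
df['d_max'] = df[col_n].max(axis=1)
print(df)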
    a  d_1  d_2  d_3  d_4  d_5  d_abs_max  d_max
0   1  1.0  2.0  3.0  4.0 -1.0        4.0    4.0
1   2  1.0  2.0  3.0 -2.0  3.0        3.0    3.0
2   3  1.0  2.0 -3.0  2.0 -2.0        3.0    2.0
3   4  1.0 -4.0  1.0 -3.0  0.0        4.0    1.0
4   5 -5.0  0.0 -4.0 -1.0 -2.0        5.0    0.0
5   0  5.0  1.0  4.0  3.0  2.0        5.0    5.0
6   5 -4.0 -1.0 -2.0 -3.0  NaN        4.0   -1.0
7   1  3.0  2.0  1.0  NaN  NaN        3.0    3.0
8   4 -1.0 -2.0  NaN  NaN  NaN        2.0   -1.0
9   3 -1.0  NaN  NaN  NaN  NaN        1.0   -1.0
10  2  NaN  NaN  NaN  NaN  NaN        NaN    NaN
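
If only the final columns are needed, the helper columns d_1 ... d_5 can probably be skipped. The largest forward difference for a row equals the largest of the next 5 values minus the row's own value, and the largest absolute difference also needs the smallest of the next 5 values. The following is a minimal sketch under that assumption, not part of the original answer. forward_rolling is a hypothetical helper name (not a pandas function); it obtains a forward-looking window by reversing the series, applying an ordinary rolling window, and reversing back.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3, 4, 5, 0, 5, 1, 4, 3, 2]})
window = 5  # number of following rows to compare against


def forward_rolling(s, n, how):
    """Hypothetical helper: max or min over the NEXT n rows (current row excluded)."""
    # Rolling on the reversed series looks "forward" in the original order.
    rev = s[::-1].rolling(n, min_periods=1)
    out = rev.max() if how == 'max' else rev.min()
    # Restore the original order, then shift up by one to drop the current row.
    return out[::-1].shift(-1)


next_max = forward_rolling(df['a'], window, 'max')
next_min = forward_rolling(df['a'], window, 'min')

# Largest signed difference: best following value minus the current value.
df['d_max'] = next_max - df['a']
# Largest absolute difference: the bigger of the upward and downward gaps.
df['d_abs_max'] = np.maximum(next_max - df['a'], df['a'] - next_min)
print(df)
```

On the sample data this should give the same d_max and d_abs_max columns as the loop above, with NaN in the last row because no rows follow it. It uses two rolling passes regardless of the window size, which may matter when the number of rows to look ahead is large.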