Vectorized solution for multiple column operation in a dataframe
I will probably have a large dataframe whose first row looks like this:
BUCHDAT y y1 y2 y3 y4 y5 y6 y7
7 2017-02-26 577 30.0 622.0 1785.0 2633.0 422.0 10497.0 364.0
Now I want to replace the columns 'y' through 'y7' using a formula:
df['y'] = df['y'] - df['y1']
Is there a vectorized solution for this? I want to apply this formula to every column, so for the next column the formula should be:
df['y1'] = df['y1']- df['y2']
Do you have any idea how to do it?
Use DataFrame.sub with DataFrame.shift:
df1 = df.iloc[:, 1:].astype(float)
df.iloc[:, 1:] = df1.sub(df1.shift(-1, axis=1))
print (df)
BUCHDAT y y1 y2 y3 y4 y5 y6 y7
7 2017-02-26 547.0 -592.0 -1163.0 -848.0 2211.0 -10075.0 10133.0 NaN
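Note that the last column ends up as NaN, because shift(-1, axis=1) leaves it with no right-hand neighbour to subtract. If 'y7' should keep its original value instead, one possible variation is sketched below. It rebuilds the sample row from the question, and the names value_cols, vals and diff are only illustrative:

```python
import pandas as pd

# Sample row copied from the question (index label 7 kept as-is).
df = pd.DataFrame(
    {
        "BUCHDAT": ["2017-02-26"],
        "y": [577], "y1": [30.0], "y2": [622.0], "y3": [1785.0],
        "y4": [2633.0], "y5": [422.0], "y6": [10497.0], "y7": [364.0],
    },
    index=[7],
)

value_cols = df.columns[1:]            # every column except BUCHDAT
vals = df[value_cols].astype(float)

# shift(-1, axis=1) moves each column one position to the left, so every
# column lines up with its right-hand neighbour; the last one becomes NaN.
diff = vals.sub(vals.shift(-1, axis=1))

# Assumption: the last column should stay unchanged rather than become NaN.
diff[value_cols[-1]] = vals[value_cols[-1]]

df[value_cols] = diff
print(df)
```

Restoring only the last column, rather than calling fillna on the whole result, avoids overwriting NaN values that come from genuinely missing data in other columns.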
Here's one working with the underlying numpy arrays for good performance:
df.iloc[:,1:-1] = df.values[:,1:-1] - df.values[:,2:]
print(df)
BUCHDAT y y1 y2 y3 y4 y5 y6 y7
7 2017-02-26 547.0 -592.0 -1163.0 -848.0 2211.0 -10075.0 10133.0 364.0
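Because the frame also holds the 'BUCHDAT' column, df.values returns an object-dtype array here, which can cost some of the speed benefit. A minimal sketch of a variation that converts only the numeric columns to a float array first is shown below. It assumes every column after the first is numeric, and the names arr and new_vals are only illustrative:

```python
import pandas as pd

# Sample row copied from the question (index label 7 kept as-is).
df = pd.DataFrame(
    {
        "BUCHDAT": ["2017-02-26"],
        "y": [577], "y1": [30.0], "y2": [622.0], "y3": [1785.0],
        "y4": [2633.0], "y5": [422.0], "y6": [10497.0], "y7": [364.0],
    },
    index=[7],
)

# Convert only the numeric columns, so the array is float64 rather than object.
arr = df.iloc[:, 1:].to_numpy(dtype=float)

# Each column minus its right-hand neighbour; the last column has no
# neighbour and is therefore left out of the result.
new_vals = arr[:, :-1] - arr[:, 1:]

# Write back to every value column except the last, which stays unchanged.
df[df.columns[1:-1]] = new_vals
print(df)
```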