Subtracting consecutive columns in a Pandas or PySpark DataFrame

Question

I would like to perform the following operation on a pandas or PySpark DataFrame, but I haven't found a solution yet.

I want to subtract the values of consecutive columns in a DataFrame, so that each output column holds that column minus the one before it.

The operation I am describing can be seen in the image below.

Bear in mind that the output DataFrame won't have any values in the first column, because the first column of the input has no preceding column to be subtracted from it.

Answer 1

DataFrame.diff has an axis parameter, so you can do this in one step:

In [63]:
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.rand(3, 4), index=['row1', 'row2', 'row3'], columns=['A', 'B', 'C', 'D'])
df

Out[63]:
             A         B         C         D
row1  0.146855  0.250781  0.766990  0.756016
row2  0.528201  0.446637  0.576045  0.576907
row3  0.308577  0.592271  0.553752  0.512420

In [64]:
df.diff(axis=1)

Out[64]:
       A         B         C         D
row1 NaN  0.103926  0.516209 -0.010975
row2 NaN -0.081564  0.129408  0.000862
row3 NaN  0.283694 -0.038520 -0.041331
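Because the frame above is filled with random numbers, the differences are hard to verify by eye. The following is a minimal sketch of the same call on hand-picked values; the data and the name `result` are illustrative assumptions, not part of the original answer.

```python
import pandas as pd

# Fixed float values so each difference can be checked by hand.
df = pd.DataFrame(
    [[1.0, 3.0, 6.0, 10.0], [2.0, 2.0, 5.0, 9.0]],
    index=["row1", "row2"],
    columns=["A", "B", "C", "D"],
)

# axis=1 takes the difference along each row:
# every column minus the column immediately to its left.
result = df.diff(axis=1)

# Column A has no left neighbour, so it is all NaN.
# row1 becomes NaN, 2.0, 3.0, 4.0 and row2 becomes NaN, 0.0, 3.0, 4.0.
print(result)
```

If the empty first column is not wanted, something like `result.iloc[:, 1:]` should drop it.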

Answer 2

# Transpose so the columns become rows, diff down the rows, then transpose back.
df = pd.DataFrame(np.random.rand(3, 4), index=['row1', 'row2', 'row3'], columns=['A', 'B', 'C', 'D'])
df.T.diff().T
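As a rough check, the transpose approach should give the same result as passing axis=1 directly, and both should match subtracting a column-shifted copy of the frame. The sketch below assumes an all-float frame; the sample data and variable names are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    [[1.0, 3.0, 6.0], [2.0, 2.0, 5.0]],
    index=["row1", "row2"],
    columns=["A", "B", "C"],
)

via_transpose = df.T.diff().T      # Answer 2: diff down rows of the transposed frame
via_axis = df.diff(axis=1)         # Answer 1: diff across columns directly
via_shift = df - df.shift(axis=1)  # explicit form: each column minus its left neighbour

# assert_frame_equal raises AssertionError if the frames differ;
# NaNs in matching positions are treated as equal.
pd.testing.assert_frame_equal(via_transpose, via_axis)
pd.testing.assert_frame_equal(via_shift, via_axis)
```

One caveat worth keeping in mind: transposing a frame whose columns have mixed dtypes generally produces object columns, so the axis=1 form is probably the safer choice in that case.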

2 answers

Answer 1: 3, accepted, 2016-07-12 08:10:26

Answer 2: 1, 2016-07-12 06:35:14