How to efficiently subtract each row from pandas dataframe?
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randint(0, 100, size=(100, 4)),
                  columns=list('ABCD'))
dfs = []
for index in range(len(df)):
    # subtract row `index` from every row of df
    subtracted = df - df.loc[index]
    dfs.append(subtracted)
Is there a way to do this with apply? For large DataFrames, doing it the way shown above becomes very slow...
IIUC:
Sample DF:
In [124]: df = pd.DataFrame(np.arange(9).reshape(3,3), columns=list('abc'))
In [125]: df
Out[125]:
   a  b  c
0  0  1  2
1  3  4  5
2  6  7  8
Getting dfs:
In [126]: (df.values - df.values[:, None])
Out[126]:
array([[[ 0,  0,  0],
        [ 3,  3,  3],
        [ 6,  6,  6]],

       [[-3, -3, -3],
        [ 0,  0,  0],
        [ 3,  3,  3]],

       [[-6, -6, -6],
        [-3, -3, -3],
        [ 0,  0,  0]]])
The last subtracted (df minus its last row):
In [127]: (df.values - df.values[:, None])[-1]
Out[127]:
array([[-6, -6, -6],
       [-3, -3, -3],
       [ 0,  0,  0]])
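If you need DataFrames rather than a raw 3-D array, one option is to wrap each 2-D slice again. The sketch below is only an illustration built on the sample DF above: it checks that the broadcast result matches the loop from the question. The names loop_dfs, fast_dfs and all_equal are made up for this example.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(9).reshape(3, 3), columns=list('abc'))

# Loop version from the question: one DataFrame per subtracted row.
loop_dfs = [df - df.loc[i] for i in range(len(df))]

# Broadcast version: diff[i] holds df minus row i.
diff = df.values - df.values[:, None]

# Optional: wrap each 2-D slice back into a DataFrame with the original labels.
fast_dfs = [pd.DataFrame(block, index=df.index, columns=df.columns)
            for block in diff]

# Compare the two approaches element by element.
all_equal = all(np.array_equal(a.values, b.values)
                for a, b in zip(loop_dfs, fast_dfs))
print(all_equal)
```

Building the list of DataFrames adds some overhead again, so if plain arrays are enough it is probably cheaper to index diff directly.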
Some explanation:
df.values[:, None] is a synonym for df.values[:, np.newaxis]:
In [132]: df.values[:, np.newaxis]
Out[132]:
array([[[0, 1, 2]],

       [[3, 4, 5]],

       [[6, 7, 8]]])
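To see why this works, it may help to look at the shapes. The extra axis turns the (3, 3) array into (3, 1, 3), and NumPy broadcasting stretches both operands to (3, 3, 3). A small sketch using the same numbers as the sample DF (the names values and expanded are just for illustration):

```python
import numpy as np

values = np.arange(9).reshape(3, 3)   # same numbers as the sample df
expanded = values[:, None]            # new middle axis: shape (3, 1, 3)
diff = values - expanded              # (3, 3) broadcasts against (3, 1, 3)

print(values.shape, expanded.shape, diff.shape)

# diff[i] is the whole array minus row i
print(np.array_equal(diff[1], values - values[1]))

# The result has n * n * m elements, i.e. n times the size of the input.
print(diff.nbytes == values.nbytes * len(values))
```

Note that the result holds n * n * m values for an n-by-m frame, so for a really large DataFrame memory can become the limiting factor rather than speed.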