
How to efficiently subtract each row from a pandas DataFrame?

import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randint(0, 100, size=(100, 4)),
                  columns=list('ABCD'))

dfs = []

# Subtract each row in turn from the whole frame
for index in range(len(df)):
    subtracted = df - df.loc[index]
    dfs.append(subtracted)

Is there a way to do this with apply, perhaps? The loop above gets pretty slow for large DataFrames.

IIUC:

Sample DF:

In [124]: df = pd.DataFrame(np.arange(9).reshape(3,3), columns=list('abc'))

In [125]: df
Out[125]:
   a  b  c
0  0  1  2
1  3  4  5
2  6  7  8

To get dfs as a single 3-D array, where slice i equals df - df.loc[i]:

In [126]: (df.values - df.values[:, None])
Out[126]:
array([[[ 0,  0,  0],
        [ 3,  3,  3],
        [ 6,  6,  6]],

       [[-3, -3, -3],
        [ 0,  0,  0],
        [ 3,  3,  3]],

       [[-6, -6, -6],
        [-3, -3, -3],
        [ 0,  0,  0]]])

To get subtracted (its value after the last loop iteration, i.e. the last slice):

In [127]: (df.values - df.values[:, None])[-1]
Out[127]:
array([[-6, -6, -6],
       [-3, -3, -3],
       [ 0,  0,  0]])
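
As a sanity check, here is a minimal sketch comparing the broadcast result with the loop from the question. The names loop_result, broadcast_result and rebuilt are illustrative and do not appear in the original post; rebuilding DataFrames is only needed if you want the row and column labels back.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(9).reshape(3, 3), columns=list("abc"))

# Loop from the question: one DataFrame per subtracted row.
loop_result = [df - df.loc[i] for i in range(len(df))]

# Broadcast version: slice i of the 3-D array holds df - df.loc[i].
broadcast_result = df.values - df.values[:, None]

# Optionally wrap each 2-D slice back into a labelled DataFrame.
rebuilt = [
    pd.DataFrame(block, index=df.index, columns=df.columns)
    for block in broadcast_result
]

# Compare the two approaches element by element.
all_equal = all(a.equals(b) for a, b in zip(loop_result, rebuilt))
print(all_equal)
```

If the labels are not needed, indexing the 3-D array directly (broadcast_result[i]) avoids building the list of DataFrames at all.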

Some explanation:

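df.values[:, None]

is a synonym for df.values[:, np.newaxis]. It inserts a new axis of length 1, so the (3, 3) array becomes a (3, 1, 3) array that broadcasts against the original: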

In [132]: df.values[:, np.newaxis]
Out[132]:
array([[[0, 1, 2]],

       [[3, 4, 5]],

       [[6, 7, 8]]])  
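
The shapes involved can be traced with a short sketch on the same sample values. The variable names values, expanded and result are illustrative only. Note that the result holds n * n * c elements for an n-row, c-column frame, so for very large frames memory use may become the limiting factor.

```python
import numpy as np

values = np.arange(9).reshape(3, 3)   # shape (3, 3), same data as the sample DF
expanded = values[:, None]            # shape (3, 1, 3): each row gets its own leading slot
result = values - expanded            # (3, 3) against (3, 1, 3) broadcasts to (3, 3, 3)

print(values.shape, expanded.shape, result.shape)

# result[i, j] is row j minus row i, i.e. slice i is "everything minus row i".
row_check = (result[2, 0] == values[0] - values[2]).all()
print(row_check)
```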

