
pandas dataframe to numpy array without loop

I have a dataframe and a 2d matrix.

The dataframe has 3 columns:

  • row index of the matrix
  • column index of the matrix
  • value for the matrix

I need to place these values into the matrix based on row and column indices. Currently, I am doing it by looping over the whole dataframe:

import pandas as pd
import numpy as np

import matplotlib.pyplot as plt

h,w=25,30 # dimensions of the matrix

df=pd.DataFrame(
    {
     'row':[1,5,2,4,6,7,15,20],
     'col':[3,15,22,29,16,12,25,1],
     'val':[444,2313,100,21,159,4102,225,2221]
     }
    )

# fill the matrix one value at a time with a for loop

mat=np.zeros((h,w))
for i in range(df.shape[0]):
    row,col,val=df.loc[i,['row','col','val']]
    mat[row,col]=val
    
plt.imshow(mat)

Is there a way to do it without a loop?

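Use NumPy integer-array ("fancy") indexing: pass the row and column index columns together and assign all the values in one step: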

mat = np.zeros((h, w))
mat[df['row'], df['col']] = df['val']

(Image: plot of the resulting matrix)
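One caveat that may matter: if the same (row, col) pair occurs more than once in the dataframe, plain indexed assignment keeps only one value per cell and does not accumulate them. If the intent is to sum the duplicates, `np.add.at` is one option. A minimal sketch, assuming a hypothetical dataframe `df_dup` (not from the question) in which the pair (1, 3) appears twice:

```python
import numpy as np
import pandas as pd

h, w = 25, 30  # dimensions of the matrix, as in the question

# Hypothetical data for illustration: the pair (1, 3) appears twice.
df_dup = pd.DataFrame({
    'row': [1, 1, 5],
    'col': [3, 3, 15],
    'val': [10, 5, 7],
})

rows = df_dup['row'].to_numpy()
cols = df_dup['col'].to_numpy()
vals = df_dup['val'].to_numpy()

# np.add.at performs unbuffered in-place addition, so repeated
# index pairs are accumulated instead of overwritten.
mat_sum = np.zeros((h, w))
np.add.at(mat_sum, (rows, cols), vals)

print(mat_sum[1, 3])   # 15.0
print(mat_sum[5, 15])  # 7.0
```

When every (row, col) pair is unique, as in the question's data, this gives the same result as the plain assignment.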


Sanity Check:

mat = np.zeros((h, w))
mat[df['row'], df['col']] = df['val']

mat2 = np.zeros((h, w))
for i in range(df.shape[0]):
    row, col, val = df.loc[i, ['row', 'col', 'val']]
    mat2[row, col] = val

print((mat == mat2).all())  # True
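The check above can also be wrapped into a small reusable helper. The sketch below is only an illustration and is not part of the original answer: the function name `fill_matrix`, its `fill` parameter, and the negative-index check are assumptions. It compares the results with `np.array_equal` instead of `(mat == mat2).all()`.

```python
import numpy as np
import pandas as pd


def fill_matrix(df, shape, fill=0.0):
    """Hypothetical helper: place df['val'] into a new array at (df['row'], df['col'])."""
    rows = df['row'].to_numpy()
    cols = df['col'].to_numpy()
    # Reject negative indices explicitly; NumPy would otherwise wrap them around.
    if (rows < 0).any() or (cols < 0).any():
        raise IndexError("negative row/col index")
    mat = np.full(shape, fill, dtype=float)
    # Indices beyond the array bounds raise IndexError on this line.
    mat[rows, cols] = df['val'].to_numpy()
    return mat


df = pd.DataFrame({
    'row': [1, 5, 2, 4, 6, 7, 15, 20],
    'col': [3, 15, 22, 29, 16, 12, 25, 1],
    'val': [444, 2313, 100, 21, 159, 4102, 225, 2221],
})

mat = fill_matrix(df, (25, 30))

# Reference result built with an explicit loop over the rows.
expected = np.zeros((25, 30))
for r, c, v in zip(df['row'], df['col'], df['val']):
    expected[r, c] = v

print(np.array_equal(mat, expected))  # True
```

Converting the columns with `.to_numpy()` first keeps the indexing independent of the dataframe's own index labels.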

