Checking nan values in a numpy array

I've read a column from an Excel file and stored it in a numpy array, col. For every index i in col, I want to check whether the value is nan; if it is, I want to delete index i from col and from another array, x. I did this:

import numpy as np
import xlrd

workbook = xlrd.open_workbook('well data.xlsx')
sheet = workbook.sheet_by_index(0)
col = sheet.col_values(1, 1)
col = np.array(col)
col = col.astype(np.float)
for i in range(col.shape[0]):
    if np.isnan(col[i]):
        col = np.delete(col, i)
        x = np.delete(x, i)

I'm getting two types of errors. First, when the float conversion col= col.astype(np.float) is present, I get

    if (np.isnan(col[i])):
IndexError: index out of bounds

Second, if I remove the float conversion, I get this error:

    if (np.isnan(col[i])):
TypeError: Not implemented for this type

I know for removing the nan from a single numpy array I can do this,

x = x[numpy.logical_not(numpy.isnan(x))]

But my case is different, I want to delete the nan elements from col, and any corresponding element in x. For example, if index 3 in col is nan, index 3 in col and x should be deleted. Also, float conversion is necessary in my case.

This is a more detailed example,

These are the initial arrays (both have the same length):

col= [16.5, 14.3, 17.42,nan, 13.22, nan]

x= [1, 2, 3, 4, 5, 6]

After removing nans the arrays should be,

col= [16.5, 14.3, 17.42, 13.22]

x= [1, 2, 3, 5]

One more thing: the provided code works very well if I read the columns from a .dat file. Does it really matter that I'm reading the columns from Excel?

Can anyone please help me solve this problem?

Thanks.

Your first idea was correct.

col = col.astype(np.float)
for i in range(col.shape[0]):
    if np.isnan(col[i]):
        col = np.delete(col, i)
        x = np.delete(x, i)

This is almost correct. col.shape[0] returns the total length of the array, and the valid indices run from 0 to that length minus 1. range already stops one short of its end argument, so your for line is equivalent to:

for i in range(0, col.shape[0]):

But since you are removing elements from the array inside the loop, the array gets shorter while range still uses the original length. So if you try to access the fifth and last element after an earlier deletion, col no longer has 5 elements, and you get the IndexError. I suggest you loop backward over your column, like this:

for i in range(col.shape[0] - 1, -1, -1):
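
Put together, the backward loop could look like the sketch below. It uses the example arrays from the question. The function names drop_nan_pairs_loop and drop_nan_pairs_mask are made up for illustration. The second function is not part of the loop approach: it applies the boolean-mask trick mentioned in the question to both arrays at once, as a possible alternative. The sketch uses the builtin float rather than np.float, because the np.float alias was removed in NumPy 1.24.

```python
import numpy as np


def drop_nan_pairs_loop(col, x):
    """Delete NaN entries from col and the matching entries from x.

    Iterates from the last index down to 0, so deleting an element
    never shifts the positions that are still to be visited.
    """
    col = np.asarray(col, dtype=float)
    x = np.asarray(x)
    for i in range(col.shape[0] - 1, -1, -1):
        if np.isnan(col[i]):
            col = np.delete(col, i)
            x = np.delete(x, i)
    return col, x


def drop_nan_pairs_mask(col, x):
    """Alternative: build one boolean mask from col and index both arrays with it."""
    col = np.asarray(col, dtype=float)
    x = np.asarray(x)
    keep = ~np.isnan(col)  # True where col holds a real number
    return col[keep], x[keep]


# Example data from the question
col = [16.5, 14.3, 17.42, np.nan, 13.22, np.nan]
x = [1, 2, 3, 4, 5, 6]

col_loop, x_loop = drop_nan_pairs_loop(col, x)
col_mask, x_mask = drop_nan_pairs_mask(col, x)

print(col_loop, x_loop)
print(col_mask, x_mask)
```

Both versions assume col and x have the same length. The mask version avoids calling np.delete repeatedly, which copies the array on every call, so it is likely the better choice for long columns.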
