
How to remove NaN from a NumPy subarray

I have the following NumPy array:

array([['0.0', '0.0'],
       ['3.0', '0.0'],
       ['3.5', '35000.0'],
       ['4.0', '70000.0'],
       ['4.2', 'nan'],
       ['4.5', '117000.0'],
       ['5.0', '165000.0'],
       ['5.2', 'nan'],
       ['5.5', '225000.0'],
       ['6.0', '285000.0'],
       ['6.2', 'nan'],
       ['6.5', '372000.0'],
       ['7.0', '459000.0'],
       ['7.5', '580000.0'],
       ['8.0', '701000.0'],
       ['8.1', 'nan'],
       ['8.5', '832000.0'],
       ['8.8', 'nan'],
       ['9.0', '964000.0'],
       ['9.5', '1127000.0'],
       ['33.0', 'nan'],
       ['35.0', 'nan']], dtype='<U12')

I want to drop every subarray (row) that contains a nan value.

Desired output is:

array([['0.0', '0.0'],
       ['3.0', '0.0'],
       ['3.5', '35000.0'],
       ['4.0', '70000.0'],
       ['4.5', '117000.0'],
       ['5.0', '165000.0'],
       ['5.5', '225000.0'],
       ['6.0', '285000.0'],
       ['6.5', '372000.0'],
       ['7.0', '459000.0'],
       ['7.5', '580000.0'],
       ['8.0', '701000.0'],
       ['8.5', '832000.0'],
       ['9.0', '964000.0'],
       ['9.5', '1127000.0']], dtype='<U12')

I tried np.isnan(array), but got the error: ufunc 'isnan' not supported for the input types. One idea I had while writing this is to split the array into two arrays, find the nan indexes, apply the filter to both arrays, and merge them back. Any help is appreciated.

TL;DR

a = a.astype(float); filtered = a[~np.isnan(a[:, 1])]

Assuming you want your numpy array as floats and not strings:

import numpy as np


# generate similar data
a = np.random.randint(low=0, high=20, size=(5, 2)).astype(str)
a[[0, 2, 3], 1] = 'nan'
print(a)
# [['15' 'nan']
#  ['17' '9']
#  ['15' 'nan']
#  ['5' 'nan']
#  ['14' '14']]

# convert to float first
a = a.astype(float)

# filter by np.nan
filtered = a[~np.isnan(a[:, 1])]

print(filtered)
# [[17.  9.]
#  [14. 14.]]
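
The desired output in the question keeps the string dtype, so here is a minimal sketch of a variant that builds the mask from a float copy but indexes the original string array. The small `arr` below is a stand-in for the question's data, and the names `mask` and `filtered_str` are only illustrative. Using `.any(axis=1)` drops a row if any column is NaN, which is an assumption beyond the question (there, only the second column contains nan).

```python
import numpy as np

# Small stand-in for the string array in the question.
arr = np.array([['4.0', '70000.0'],
                ['4.2', 'nan'],
                ['4.5', '117000.0'],
                ['5.2', 'nan']])

# Convert a temporary copy to float only to build the mask;
# the string 'nan' parses to np.nan.
mask = ~np.isnan(arr.astype(float)).any(axis=1)

# Index the original array so the string dtype is preserved.
filtered_str = arr[mask]
print(filtered_str)
# [['4.0' '70000.0']
#  ['4.5' '117000.0']]
```

If floats are what you need downstream, the conversion shown above is simpler; this variant is only useful when the strings themselves must be kept.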

First, note that the provided array is an array of strings, which is why np.isnan fails on it. So before going further, we need to convert it to an array of floats:

# assuming your original array is arr
new_arr = arr.astype(float)

Then we can filter the rows, keeping only the subarrays whose second element is not NaN:

filtered_list = np.array(list(filter(lambda x: not np.isnan(x[1]), new_arr)))
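
As a rough comparison, the filter/lambda approach and a boolean mask give the same rows on typical input, but they differ when nothing survives the filter. The sketch below uses a small made-up `new_arr` and illustrative names (`via_filter`, `via_mask`, `all_nan`) to show both cases.

```python
import numpy as np

# Small float array standing in for the converted data.
new_arr = np.array([[3.5, 35000.0],
                    [4.2, np.nan],
                    [4.5, 117000.0]])

# Python-level filter, as in the answer above.
via_filter = np.array(list(filter(lambda x: not np.isnan(x[1]), new_arr)))

# Vectorized boolean mask over the second column.
via_mask = new_arr[~np.isnan(new_arr[:, 1])]

print(np.array_equal(via_filter, via_mask))  # True

# Edge case: every row has NaN in the second column.
all_nan = np.array([[1.0, np.nan]])
empty_filter = np.array(list(filter(lambda x: not np.isnan(x[1]), all_nan)))
empty_mask = all_nan[~np.isnan(all_nan[:, 1])]
print(empty_filter.shape)  # (0,)
print(empty_mask.shape)    # (0, 2)
```

The mask version keeps the two-column shape even when the result is empty, and it avoids a Python-level loop over the rows, so it is usually the safer choice for larger arrays.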
