
Stacking 2D numpy arrays to use nanmean

I have two arrays, and I'd like to take their per-cell average while ignoring NaNs.

My two arrays are:

In [267]: a = np.array([ [1, 2, np.nan], [np.nan, 5, 6], [np.nan, np.nan, np.nan]])

In [268]: a
Out[268]: 
array([[  1.,   2.,  nan],
       [ nan,   5.,   6.],
       [ nan,  nan,  nan]])

In [269]: b = np.array( [ [2, np.nan, 6], [8, np.nan, 12], [14, 16, np.nan]])

In [270]: b
Out[270]: 
array([[  2.,  nan,   6.],
       [  8.,  nan,  12.],
       [ 14.,  16.,  nan]])

If I didn't need to handle NaNs specially, I could do:

In [271]: (a+b)/2
Out[271]: 
array([[ 1.5,  nan,  nan],
       [ nan,  nan,  9. ],
       [ nan,  nan,  nan]])

However, I need the mean to ignore NaNs, so that mean(2.5, nan) == 2.5. Only when both values are NaN should the result be NaN, i.e. mean(nan, nan) == nan.

Thus, the result I'd like to get is:

array([[  1.5,   2. ,   6. ],
       [  8. ,   5. ,   9. ],
       [ 14. ,  16. ,   nan]])

The scipy.stats.nanmean function seems to do this, but I think I first need to stack the arrays properly. I have two 3 x 3 arrays, and I think I need to create a 2 x 3 x 3 array. Is that right? I can't manage to stack the arrays into a result with those dimensions: I've tried np.dstack as well as various other techniques, but nothing seems to work.

I suspect I'm doing something silly - any ideas as to how I can fix this?

You need to concatenate the arrays along a new third axis (axis 2). You can then take the nanmean over that axis.

In [1]: c = np.concatenate([a[..., None], b[..., None]], axis=2)
In [2]: scipy.stats.nanmean(c, axis=2)
Out[2]: 
array([[  1.5,   2. ,   6. ],
       [  8. ,   5. ,   9. ],
       [ 14. ,  16. ,   nan]])
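
For anyone on a recent SciPy: as far as I know, scipy.stats.nanmean was deprecated and later removed, and np.nanmean is the usual replacement. Below is a minimal sketch of the same idea under that assumption. It uses np.dstack to create the third axis, which also shows why np.dstack on its own gives a 3 x 3 x 2 array rather than the 2 x 3 x 3 one mentioned in the question. The warning filter is my own addition, since np.nanmean warns when a cell holds only NaNs.

```python
import warnings

import numpy as np

a = np.array([[1, 2, np.nan], [np.nan, 5, 6], [np.nan, np.nan, np.nan]])
b = np.array([[2, np.nan, 6], [8, np.nan, 12], [14, 16, np.nan]])

# np.dstack joins the arrays along a new last axis, so the shape is (3, 3, 2),
# the same as concatenating a[..., None] and b[..., None] along axis 2.
c = np.dstack([a, b])
print(c.shape)

# Cells where both inputs are NaN trigger a "Mean of empty slice" RuntimeWarning
# and yield nan, which is the behaviour asked for; the filter only hides the warning.
with warnings.catch_warnings():
    warnings.simplefilter("ignore", category=RuntimeWarning)
    result = np.nanmean(c, axis=2)

print(result)
```

The key point is that the axis passed to the mean has to be the axis along which the two arrays were joined, here axis 2.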

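I combined the arrays using np.array, which stacks them along a new first axis and gives the 2 x 3 x 3 array described in the question. The nanmean is then taken over axis 0: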

>>> c = np.array([a, b])
>>> c
array([[[  1.,   2.,  nan],
        [ nan,   5.,   6.],
        [ nan,  nan,  nan]],

       [[  2.,  nan,   6.],
        [  8.,  nan,  12.],
        [ 14.,  16.,  nan]]])

>>> scipy.stats.nanmean(c, axis=0)
array([[  1.5,   2. ,   6. ],
       [  8. ,   5. ,   9. ],
       [ 14. ,  16. ,   nan]])
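
To make explicit what a NaN-ignoring mean over axis 0 computes, here is a rough sketch that does it by hand: sum the non-NaN values in each cell and divide by how many there were. The variable names (stacked, valid, total, count, manual) are only illustrative, and this is not meant as a replacement for a library nanmean, just a way to check the result.

```python
import numpy as np

a = np.array([[1, 2, np.nan], [np.nan, 5, 6], [np.nan, np.nan, np.nan]])
b = np.array([[2, np.nan, 6], [8, np.nan, 12], [14, 16, np.nan]])

# Stack along a new first axis: shape (2, 3, 3), same as np.array([a, b]).
stacked = np.stack([a, b], axis=0)

# True where a value is present, False where it is NaN.
valid = ~np.isnan(stacked)

# Per-cell sum of the present values (NaNs count as 0) and how many were present.
total = np.where(valid, stacked, 0.0).sum(axis=0)
count = valid.sum(axis=0)

# Divide only where at least one value was present; other cells stay NaN.
manual = np.full(total.shape, np.nan)
np.divide(total, count, out=manual, where=count > 0)

print(manual)
```

Cells with one valid value are divided by 1, cells with two by 2, and cells with none are left as nan, which matches the output above.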
