Is there a way to ignore masked values in an array that is used to mask a separate array?

Question

My data consists of several arrays of the same length. I mask one array (y) and then use that masked array to select from a second array (x). I mask the arrays to get rid of values that indicate an equipment error (-9999). I then use np.where() to find where y is low (more than 1 standard deviation below the mean) and index x with the result, in order to see the values of x when y is low.

I have tried changing my mask several times, but none of the other NumPy masked-array operations gave me a different result. I also tried to write a logical expression that returns values only where the mask is False, but I could not get that to work inside the np.where() call.

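import numpy as np

x = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
y = np.array([0, 1, -9999, 3, 4, 5, 6, 7, 8, -9999, 10])

# Mask the equipment-error value (-9999) in both arrays
x = np.ma.masked_values(x, -9999)
y = np.ma.masked_values(y, -9999)

# Threshold: one standard deviation below the mean of y
low_y = y.mean() - np.std(y)

# Select x where y is low
x_masked = x[np.where(y < low_y)]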

Evaluating x_masked returns:

>>> x_masked
masked_array(data=[0, 1, 2, 9],
             mask=False,
       fill_value=-9999)

We expect the mean of x_masked to be 0.5 ((0 + 1)/2), but instead the mean is 3, because the x values at the positions where y holds the masked -9999 (2 and 9) were included in x_masked.

Is there a way to exclude the masked values so that only the unmasked values are returned?

Answer 1

I think you want to restrict x to the positions where y != -9999. If you make this change to your code, it works as you expect.
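One possible reading of this suggestion that keeps the masked arrays from the question is sketched here. It copies y's mask onto x and converts the masked comparison into a plain boolean index. The name is_low is illustrative and does not come from the question.

```python
import numpy as np

x = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
y = np.array([0, 1, -9999, 3, 4, 5, 6, 7, 8, -9999, 10])

# Mask the error value in y, then copy y's mask onto x.
y = np.ma.masked_values(y, -9999)
x = np.ma.masked_where(np.ma.getmaskarray(y), x)

low_y = y.mean() - y.std()

# filled(False) turns masked comparison results into plain False,
# so masked positions can never be selected.
is_low = (y < low_y).filled(False)
x_masked = x[is_low]

print(x_masked)         # [0 1]
print(x_masked.mean())  # 0.5
```

Indexing with a boolean array that already has the masked positions set to False avoids relying on how np.where() treats the data underneath a mask.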

You could also just use np.where to do the filtering:

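# x and y are assumed to be the plain arrays from the question
x = x[np.where(y != -9999)]  # filter x first, while y still has every position
y = y[np.where(y != -9999)]

low_y = y.mean() - np.std(y)

x_masked = x[np.where(y < low_y)]

print(x_masked)
[0 1]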
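The same filtering can probably be written with one combined boolean index, so that x and y are never reassigned and the order of the two steps no longer matters. This is a sketch that assumes plain (unmasked) input arrays. The helper x_where_y_is_low and its sentinel parameter are hypothetical names used only for illustration.

```python
import numpy as np

def x_where_y_is_low(x, y, sentinel=-9999):
    """Return x where y is valid and more than 1 std below its mean."""
    x = np.asarray(x)
    y = np.asarray(y)
    valid = y != sentinel  # positions with a real reading
    low_y = y[valid].mean() - y[valid].std()
    # Sentinel positions satisfy y < low_y, so `valid` must exclude them.
    return x[valid & (y < low_y)]

x = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
y = np.array([0, 1, -9999, 3, 4, 5, 6, 7, 8, -9999, 10])

result = x_where_y_is_low(x, y)
print(result)         # [0 1]
print(result.mean())  # 0.5
```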

Answer 2

NumPy added nanstd and nanmean in version 1.8 to handle missing data. In your case, -9999 is there to indicate an error state, so I think this is a good use case for numpy.nan:

In [76]: y = np.where(y==-9999, np.nan, y)

In [77]: low_y = (np.nanmean(y) - np.nanstd(y))

In [78]: low_y
Out[78]: 1.8177166753143883

In [79]: x_masked = x[ np.where( y < low_y ) ]  # [0, 1]
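For reference, the session above can be reproduced as a standalone script. This is a sketch that assumes x and y start out as the plain integer arrays from the question. The name y_nan is illustrative.

```python
import numpy as np

x = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
y = np.array([0, 1, -9999, 3, 4, 5, 6, 7, 8, -9999, 10])

# np.where upcasts the result to float, since NaN has no integer form.
y_nan = np.where(y == -9999, np.nan, y)

low_y = np.nanmean(y_nan) - np.nanstd(y_nan)

# Any comparison against NaN is False, so error positions drop out.
x_masked = x[y_nan < low_y]

print(x_masked)         # [0 1]
print(x_masked.mean())  # 0.5
```

Keeping the NaN version in a separate variable leaves the original y untouched, which may be convenient if the raw readings are needed later.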

2 answers

Answer 1
1 2019-09-11 18:07:29

Answer 2
1 Accepted 2019-09-11 23:35:55
