Linear regression of arrays containing NaN in Python / NumPy

Question

I have two arrays, say varx and vary. Both contain NaN values at various positions. However, I would like to run a linear regression on the two to show how strongly they correlate. This has been very helpful so far: http://glowingpython.blogspot.de/2012/03/linear-regression-with-numpy.html

However, using this:

slope, intercept, r_value, p_value, std_err = stats.linregress(varx, vary)

results in NaN for every output variable. What is the most convenient way to pass only the valid values from both arrays to the linear regression? I have heard about masked arrays, but I am not sure exactly how they work.

Answer 1

You can remove the NaNs using a boolean mask that keeps only the positions where both arrays hold a valid value:

import numpy as np
from scipy import stats
mask = ~np.isnan(varx) & ~np.isnan(vary)  # True only where both arrays are non-NaN
slope, intercept, r_value, p_value, std_err = stats.linregress(varx[mask], vary[mask])
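
To see the mask at work, here is a minimal, self-contained sketch. The helper name linregress_ignoring_nan and the sample data are made up for illustration and are not from the question. The data lie exactly on y = 2x + 1, with NaNs at different positions in each array, so only the pairs that are valid in both arrays should reach the fit.

```python
import numpy as np
from scipy import stats


def linregress_ignoring_nan(x, y):
    """Run stats.linregress on the pairs where neither x nor y is NaN.

    Hypothetical helper for illustration; it only wraps the mask
    approach from the answer.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # A pair is kept only if both of its members are valid numbers.
    mask = ~np.isnan(x) & ~np.isnan(y)
    return stats.linregress(x[mask], y[mask])


# Made-up sample data on the line y = 2x + 1, with NaNs at different indices.
varx = np.array([0.0, 1.0, np.nan, 3.0, 4.0, 5.0])
vary = np.array([1.0, 3.0, 5.0, np.nan, 9.0, 11.0])

slope, intercept, r_value, p_value, std_err = linregress_ignoring_nan(varx, vary)
print(slope, intercept, r_value)
```

Indices 2 and 3 are dropped because one of the two values is NaN there, leaving four pairs for the regression. Note that the mask must be built from both arrays together: filtering each array by its own NaNs separately would misalign the pairs.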

Answer 1 (score 19, accepted, 2012-11-30 10:34:27)
