Linear regression of arrays containing NaNs in Python/NumPy

I have two arrays, say varx and vary. Both contain NaN values at various positions. However, I would like to run a linear regression on the two to show how strongly they correlate. This has been very helpful so far: http://glowingpython.blogspot.de/2012/03/linear-regression-with-numpy.html

However, using this:

slope, intercept, r_value, p_value, std_err = stats.linregress(varx, vary)

results in NaN for every output variable. What is the most convenient way to pass only the values that are valid in both arrays to the linear regression? I have heard about masked arrays, but I am not sure exactly how they work.

You can remove the NaNs using a boolean mask that is True only where both arrays hold a valid value:

import numpy as np
from scipy import stats
mask = ~np.isnan(varx) & ~np.isnan(vary)  # True where neither array is NaN
slope, intercept, r_value, p_value, std_err = stats.linregress(varx[mask], vary[mask])
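
To see the mask at work, here is a minimal self-contained sketch. The sample values are made up for illustration (they are not from the question); they are chosen so that the four pairs valid in both arrays lie on the line y = 2x + 1.

```python
import numpy as np
from scipy import stats

# Hypothetical sample data (an assumption for this sketch): NaNs sit at
# different positions in each array, as described in the question.
varx = np.array([0.0, 1.0, np.nan, 3.0, 4.0, 5.0])
vary = np.array([1.0, 3.0, 5.0, np.nan, 9.0, 11.0])

# Keep only the positions where both arrays hold a valid number.
mask = ~np.isnan(varx) & ~np.isnan(vary)

# Boolean indexing drops the invalid pairs before the regression runs.
slope, intercept, r_value, p_value, std_err = stats.linregress(varx[mask], vary[mask])

print("pairs used:", mask.sum())
print("slope:", slope, "intercept:", intercept, "r:", r_value)
```

Because the mask is built from both arrays, a NaN in either one removes the whole pair, so varx[mask] and vary[mask] stay aligned and have the same length.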
