Detrending data with NaN values in scipy.signal

I have a time series dataset with some NaN values in it. I want to detrend this data.

I tried this:

scipy.signal.detrend(y)

Then I got this error:

ValueError: array must not contain infs or NaNs

Then I tried:

scipy.signal.detrend(y.dropna())

But then I lost the data order.

How can I solve this problem?

For future reference, there is a digital signal processing Stack Exchange site, https://dsp.stackexchange.com/. I would suggest using it for signal processing questions in the future.


The easiest way I can think of is to detrend your data manually. You can do this with a least-squares fit. Least squares takes both your x and y values into account, so you can simply drop the x values corresponding to where y = NaN.

You can get a boolean mask of the non-NaN values with not_nan_ind = ~np.isnan(y), and then run a linear regression on the non-NaN values of y and the corresponding x values with, say, scipy.stats.linregress():

m, b, r_val, p_val, std_err = stats.linregress(x[not_nan_ind],y[not_nan_ind])

Then you can simply subtract this line from your data y to obtain the detrended data:

detrend_y = y - (m*x + b)

And that's all you need. For example, with some dummy data:

import numpy as np
from matplotlib import pyplot as plt
from scipy import stats

# create data
x = np.linspace(0, 2*np.pi, 500)
y = np.random.normal(0.3*x, np.random.rand(len(x)))
drops = np.random.rand(len(x))
y[drops>.95] = np.nan # add some random NaNs into y (np.NaN was removed in NumPy 2.0)
plt.plot(x, y)

Figure: data with some NaN values

# find the linear regression line and subtract it from the data to detrend
not_nan_ind = ~np.isnan(y)
m, b, r_val, p_val, std_err = stats.linregress(x[not_nan_ind],y[not_nan_ind])
detrend_y = y - (m*x + b)
plt.plot(x, detrend_y)

Figure: detrended data
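
The steps above can also be wrapped into a small reusable function. This is only a sketch: the name detrend_with_nans is made up for illustration (it is not part of SciPy), and it assumes x and y are 1-D arrays of equal length with at least two non-NaN points in y.

```python
import numpy as np
from scipy import stats


def detrend_with_nans(x, y):
    """Remove a least-squares line from y, ignoring NaNs in the fit.

    Hypothetical helper, not part of SciPy. NaNs in y stay NaN in the result.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    not_nan_ind = ~np.isnan(y)
    # fit the line only on the valid samples
    fit = stats.linregress(x[not_nan_ind], y[not_nan_ind])
    # subtract the line everywhere; NaN minus a number is still NaN
    return y - (fit.slope * x + fit.intercept)


# small check on a noiseless line with two gaps
x = np.arange(10, dtype=float)
y = 2.0 * x + 1.0
y[[3, 7]] = np.nan
detrended = detrend_with_nans(x, y)
```

Because the line is subtracted from the full y array, the output has the same length and order as the input, and the gaps stay where they were.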

Detrend only the non-NaN part while keeping the NaNs in place:

signal[np.logical_not(pd.isna(signal))] = scipy.signal.detrend(signal[np.logical_not(pd.isna(signal))])
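
A self-contained version of that one-liner might look like the sketch below. It assumes signal is a pandas Series; the variable names and data are made up for illustration. One caveat: scipy.signal.detrend fits against the sample positions of the array it is given, so once the NaNs are dropped the remaining points are treated as evenly spaced. With many or clustered gaps, the result can differ slightly from a regression on the real x values.

```python
import numpy as np
import pandas as pd
import scipy.signal

# made-up example data: a linear ramp with two missing samples
signal = pd.Series(0.5 * np.arange(20, dtype=float) + 3.0)
signal.iloc[[4, 11]] = np.nan

valid = signal.notna()  # boolean mask of the non-NaN samples
result = signal.copy()  # work on a copy so the original is untouched
# detrend only the valid samples and write them back at their positions
result[valid] = scipy.signal.detrend(signal[valid].to_numpy())
```

The index and the NaN positions of result match those of signal, so the data order is preserved.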
