Applying a half-Gaussian filter to binned time series data in Python

Question

I am binning some time series data, and I need to apply a half-normal (half-Gaussian) filter to the binned data. How can I do this in Python? I've provided a toy example below. I need Xbinned to be smoothed with a half-Gaussian filter with a standard deviation of 0.25 (or whatever value works). I'm pretty sure the half-Gaussian should face the forward time direction.

import numpy as np

X = np.random.randint(2, size=100) #example random process

bin_size =  5

Xbinned = []

for i in range(0, len(X)+1, bin_size):
    Xbinned.append(sum(X[i:i+(bin_size-1)])/bin_size)

Answer 1

How to implement half-Gaussian filtering

SciPy has a function called scipy.ndimage.gaussian_filter(). It nearly implements what we want here. Unfortunately, there's no option to use a half-Gaussian instead of a Gaussian. However, SciPy is open source, so we can just take the source code and modify it to be a half-Gaussian.
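To see why the stock filter is not enough, here is a minimal sketch that feeds a single spike through scipy.ndimage.gaussian_filter1d (the 1-D counterpart of gaussian_filter). The array length, spike position, and sigma are arbitrary choices for illustration only.

```python
import numpy as np
from scipy.ndimage import gaussian_filter1d

# A single spike in the middle of an otherwise silent signal.
impulse = np.zeros(11)
impulse[5] = 1.0

# The standard Gaussian filter spreads the spike symmetrically,
# so samples *before* the spike are affected as well as samples after it.
response = gaussian_filter1d(impulse, sigma=1.0, mode="constant")
print(np.round(response, 3))
```

The response is mirror-symmetric around the spike, which is the behavior a half-Gaussian filter is meant to avoid.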

I used this source code and removed all of the parts that are not needed for this particular case. At the end, I had this:

import numpy as np
import scipy.ndimage

def halfgaussian_kernel1d(sigma, radius):
    """
    Computes a 1-D Half-Gaussian convolution kernel.
    """
    sigma2 = sigma * sigma
    x = np.arange(0, radius+1)
    phi_x = np.exp(-0.5 / sigma2 * x ** 2)
    phi_x = phi_x / phi_x.sum()
    return phi_x

def halfgaussian_filter1d(input, sigma, axis=-1, output=None,
                          mode="constant", cval=0.0, truncate=4.0):
    """
    Convolves a 1-D Half-Gaussian convolution kernel.
    """
    sd = float(sigma)
    # make the radius of the filter equal to truncate standard deviations
    lw = int(truncate * sd + 0.5)
    weights = halfgaussian_kernel1d(sigma, lw)
    # shift the kernel so it starts at the current timestep instead of being centered on it
    origin = -lw // 2
    return scipy.ndimage.convolve1d(input, weights, axis, output, mode, cval, origin)

A short summary of how this works:
 
      
1. First, it generates a convolution kernel. It uses the formula e^(-1/2 * (x/sigma)^2) to generate the Gaussian distribution. It keeps going until you're 4 standard deviations away from the center.
2. Next, it convolves that kernel against your signal. It adjusts the kernel to start at the current timestep instead of being centered on the current timestep.

Trying this on your signal, I get a result like this:

array([0.59979879, 0.6       , 0.40006707, 0.59993293, 0.79993293,
       0.40013414, 0.20006707, 0.59986586, 0.40006707, 0.4       ,
       0.99979879, 0.00033535, 0.59979879, 0.40006707, 0.00013414,
       0.59979879, 0.20013414, 0.00006707, 0.19993293, 0.59986586])

Choice of standard deviation

If you pick a standard deviation of 0.25, that is going to have almost no effect on your signal. Here are the convolution weights it uses: [0.99966465 0.00033535]. In other words, this has less than a 0.1% effect on the signal.

I'd recommend using a larger sigma value.
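As a rough way to compare candidate values, the sketch below prints how many taps the kernel has and how much weight stays on the current bin for a few sigmas. It repeats the kernel formula from the code above so that it runs on its own; the `truncate` argument on the kernel function and the specific sigma values tried are assumptions made for this illustration.

```python
import numpy as np

def halfgaussian_kernel1d(sigma, truncate=4.0):
    """One-sided Gaussian weights, normalized to sum to 1."""
    # Same radius rule as the filter above: truncate standard deviations.
    radius = int(truncate * sigma + 0.5)
    x = np.arange(0, radius + 1)
    phi_x = np.exp(-0.5 / (sigma * sigma) * x ** 2)
    return phi_x / phi_x.sum()

# Illustrative sigma values only; pick ones that suit your bin width.
for sigma in (0.25, 1.0, 2.0):
    weights = halfgaussian_kernel1d(sigma)
    print(f"sigma={sigma}: {len(weights)} taps, "
          f"weight on current bin = {weights[0]:.4f}")
```

The smaller the weight on the current bin, the more the neighboring bins contribute and the stronger the smoothing.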

Off-by-one error

Also, I want to point out the off-by-one error here:

for i in range(0, len(X)+1, bin_size):
    Xbinned.append(sum(X[i:i+(bin_size-1)])/bin_size)

Slices exclude the end index, so a slice from i to i+(bin_size-1) actually captures 4 elements, not 5.

To fix this, you can change it to this:

for i in range(0, len(X), bin_size):
    Xbinned.append(X[i:i+bin_size].mean())

(Also, I fixed an off-by-one error in the loop specification, where range(0, len(X)+1, bin_size) produced one extra, empty bin at the end, and used a NumPy shortcut for finding the mean.)
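The difference between the two loops can be checked with a small sketch. The seeded generator is an assumption added for reproducibility, and the reshape-based variant is an alternative that is not part of the answer above; it only works when len(X) is an exact multiple of bin_size.

```python
import numpy as np

rng = np.random.default_rng(0)        # seed chosen only for reproducibility
X = rng.integers(0, 2, size=100)      # toy 0/1 process, as in the question
bin_size = 5

# Original loop: each slice holds 4 samples, and i = 100 adds an empty bin.
buggy = [sum(X[i:i + (bin_size - 1)]) / bin_size
         for i in range(0, len(X) + 1, bin_size)]

# Corrected loop: each slice holds exactly bin_size samples.
fixed = np.array([X[i:i + bin_size].mean()
                  for i in range(0, len(X), bin_size)])

# Vectorized alternative (assumes len(X) is a multiple of bin_size).
vectorized = X.reshape(-1, bin_size).mean(axis=1)

print(len(buggy), len(fixed))
print(np.allclose(fixed, vectorized))
```

The original loop yields one bin more than the corrected one, and that extra bin is always zero because its slice is empty.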

Solution 1
0 2022-02-06 03:03:39
