
Smoothing Small Data Set With Second Order Quadratic Curve

I'm doing some specific signal analysis, and I need a method that would smooth out a given bell-shaped distribution curve. A running average approach isn't producing the results I want. I want to keep the min/max and the general shape of my fitted curve intact, but resolve the inconsistencies in sampling.

In short: given a set of data that models a simple quadratic curve, what statistical smoothing method would you recommend?

If possible, please reference an implementation, library, or framework.

Thanks SO!

Edit: Some helpful data

(A possible signal graph)


The dark colored quadratic is my "fitted" curve of the light colored connected data points.

The sample at approximately -44 is a problem in my graph (i.e. a potential sample inconsistency). I need this curve to "fit" the distribution better and overcome the values that do not follow the trend. Hope this helps!

A "quadratic" curve is one thing; "bell-shaped" usually means a Gaussian normal distribution. Getting a best-estimate Gaussian couldn't be easier: you compute the sample mean and variance, and your smooth approximation is

y = exp(-squared(x-mean)/(2*variance))
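
A minimal sketch of that estimate in C++, assuming the data is a plain list of samples drawn from the distribution. The names `GaussianFit`, `fitGaussian`, and `gaussianShape` are made up for illustration. It uses the standard Gaussian exponent with 2 * variance in the denominator and leaves out the normalizing constant, so the curve peaks at 1. If your data is a sampled curve of (x, y) pairs instead, the mean and variance would have to be weighted by the curve values, which this sketch does not do.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical helper type: mean and variance of a sample set.
struct GaussianFit {
    double mean;
    double variance;
};

// Estimate the mean and the unbiased sample variance.
// Needs at least two samples; identical samples give variance 0,
// which makes gaussianShape() divide by zero.
GaussianFit fitGaussian(const std::vector<double>& samples) {
    assert(samples.size() >= 2);
    double sum = 0.0;
    for (double s : samples) sum += s;
    const double mean = sum / static_cast<double>(samples.size());

    double sq = 0.0;
    for (double s : samples) sq += (s - mean) * (s - mean);
    return {mean, sq / static_cast<double>(samples.size() - 1)};
}

// Unnormalized Gaussian shape: value 1 at x == mean.
double gaussianShape(const GaussianFit& fit, double x) {
    const double d = x - fit.mean;
    return std::exp(-(d * d) / (2.0 * fit.variance));
}
```

Multiply the result by whatever peak height your signal has if you need it on the same scale as the data.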

If, on the other hand, you want to approximate a smooth curve with a quadratic, I'd recommend computing a quadratic polynomial with minimum square error. I can never remember the formulas for this, but if you've had differential calculus, write the formula for the total square error (pointwise) and differentiate with respect to the coefficients of your quadratic. Set the first derivatives to zero and solve for the best approximation. Or you could look it up.
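
Written out, the derivation described above looks like this. The symbols are chosen here for illustration: the quadratic is y = a x^2 + b x + c, the data points are (x_i, y_i) for i = 1..n, and S is the total squared error.

```latex
\begin{aligned}
S(a,b,c) &= \sum_{i=1}^{n} \left(a x_i^2 + b x_i + c - y_i\right)^2 \\
\frac{\partial S}{\partial a} = 0 &\;\Rightarrow\; a\sum x_i^4 + b\sum x_i^3 + c\sum x_i^2 = \sum x_i^2 y_i \\
\frac{\partial S}{\partial b} = 0 &\;\Rightarrow\; a\sum x_i^3 + b\sum x_i^2 + c\sum x_i = \sum x_i y_i \\
\frac{\partial S}{\partial c} = 0 &\;\Rightarrow\; a\sum x_i^2 + b\sum x_i + c\,n = \sum y_i
\end{aligned}
```

These are three linear equations in a, b, and c, so any 3x3 linear solver (Cramer's rule, Gaussian elimination) gives the best-fit coefficients.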

Finally, if you just want a smooth-looking curve to approximate a set of points, cubic splines are your best bet. The curves won't necessarily mean anything, but you'll get a nice smooth approximation.
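
For the spline option, here is a rough sketch of a natural cubic spline (second derivative zero at both ends), assuming strictly increasing x values. `CubicSpline`, `buildNaturalSpline`, and `evalSpline` are illustrative names, not from any library. Keep in mind that an interpolating spline like this passes through every sample, including a suspect one, so it smooths the curve between points rather than correcting outliers.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// On interval i: S(t) = y[i] + b[i]*dx + c[i]*dx^2 + d[i]*dx^3, dx = t - x[i].
struct CubicSpline {
    std::vector<double> x, y, b, c, d;
};

// Build a natural cubic spline through (x[i], y[i]); x must be strictly increasing.
CubicSpline buildNaturalSpline(const std::vector<double>& x,
                               const std::vector<double>& y) {
    assert(x.size() == y.size() && x.size() >= 2);
    const std::size_t n = x.size() - 1;  // number of intervals
    std::vector<double> h(n), alpha(n + 1, 0.0);
    std::vector<double> l(n + 1, 1.0), mu(n + 1, 0.0), z(n + 1, 0.0);

    for (std::size_t i = 0; i < n; ++i) h[i] = x[i + 1] - x[i];
    for (std::size_t i = 1; i < n; ++i)
        alpha[i] = 3.0 * (y[i + 1] - y[i]) / h[i]
                 - 3.0 * (y[i] - y[i - 1]) / h[i - 1];

    // Forward sweep of the tridiagonal system for the c coefficients.
    for (std::size_t i = 1; i < n; ++i) {
        l[i] = 2.0 * (x[i + 1] - x[i - 1]) - h[i - 1] * mu[i - 1];
        mu[i] = h[i] / l[i];
        z[i] = (alpha[i] - h[i - 1] * z[i - 1]) / l[i];
    }

    CubicSpline s{x, y, std::vector<double>(n),
                  std::vector<double>(n + 1, 0.0), std::vector<double>(n)};
    // Back substitution; c[n] stays 0 (natural end condition).
    for (std::size_t j = n; j-- > 0;) {
        s.c[j] = z[j] - mu[j] * s.c[j + 1];
        s.b[j] = (y[j + 1] - y[j]) / h[j]
               - h[j] * (s.c[j + 1] + 2.0 * s.c[j]) / 3.0;
        s.d[j] = (s.c[j + 1] - s.c[j]) / (3.0 * h[j]);
    }
    return s;
}

// Evaluate the spline at t; values outside the knots use the end intervals.
double evalSpline(const CubicSpline& s, double t) {
    const std::size_t n = s.x.size() - 1;
    std::size_t i = 0;
    while (i + 1 < n && t >= s.x[i + 1]) ++i;
    const double dx = t - s.x[i];
    return s.y[i] + dx * (s.b[i] + dx * (s.c[i] + dx * s.d[i]));
}
```

Build the spline once from your samples and call `evalSpline` at as many intermediate x values as you need for plotting.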

#include <iostream>
#include <math.h>

struct WeightedData 
{
double x;
double y;
double weight;
};

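// Weighted least-squares fit of y = a*x*x + b*x + c to the data.
// Sets a, b and c to 0 when the system is singular (den == 0).
void findQuadraticFactors(WeightedData *data, double &a, double &b, double &c, unsigned int const datasize)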
{
double w1 = 0.0;
double wx = 0.0, wx2 = 0.0, wx3 = 0.0, wx4 = 0.0;
double wy = 0.0, wyx = 0.0, wyx2 = 0.0;
double tmpx, tmpy;
double den;

for (unsigned int i = 0; i < datasize; ++i) 
    {
    double x = data[i].x;
    double y = data[i].y;
    double w = data[i].weight;  

    w1 += w;
    tmpx = w * x;
    wx += tmpx;
    tmpx *= x;
    wx2 += tmpx;
    tmpx *= x;
    wx3 += tmpx;
    tmpx *= x;
    wx4 += tmpx;
    tmpy = w * y;
    wy += tmpy;
    tmpy *= x;
    wyx += tmpy;
    tmpy *= x;
    wyx2 += tmpy;
    }

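// Solve the 3x3 normal equations by Cramer's rule.
// den is the determinant of the system up to sign; the sign cancels in the quotients below.
den = wx2 * wx2 * wx2 - 2.0 * wx3 * wx2 * wx + wx4 * wx * wx + wx3 * wx3 * w1 - wx4 * wx2 * w1;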
if (den == 0.0) 
    {
    a = 0.0;
    b = 0.0;
    c = 0.0;
    }
else    
    {
    a = (wx * wx * wyx2 - wx2 * w1 * wyx2 - wx2 * wx * wyx + wx3 * w1 * wyx + wx2 * wx2 * wy - wx3 * wx * wy) / den;
    b = (-wx2 * wx * wyx2 + wx3 * w1 * wyx2 + wx2 * wx2 * wyx - wx4 * w1 * wyx - wx3 * wx2 * wy + wx4 * wx * wy) / den;
    c = (wx2 * wx2 * wyx2 - wx3 * wx * wyx2 - wx3 * wx2 * wyx + wx4 * wx * wyx + wx3 * wx3 * wy - wx4 * wx2 * wy) / den;
    }

}

// Evaluate the fitted quadratic at x.
double findY(double const a, double const b, double const c, double const x)
{
return a * x * x + b * x + c;
}




int main()
{
WeightedData data[9];
data[0].weight=1; data[0].x=1; data[0].y=-52.0; 
data[1].weight=1; data[1].x=2; data[1].y=-48.0; 
data[2].weight=1; data[2].x=3; data[2].y=-43.0; 
data[3].weight=1; data[3].x=4; data[3].y=-44.0; 
data[4].weight=1; data[4].x=5; data[4].y=-35.0; 
data[5].weight=1; data[5].x=6; data[5].y=-31.0; 
data[6].weight=1; data[6].x=7; data[6].y=-32.0; 
data[7].weight=1; data[7].x=8; data[7].y=-43.0; 
data[8].weight=1; data[8].x=9; data[8].y=-52.0; 

double a=0.0, b=0.0, c=0.0;
findQuadraticFactors(data, a, b, c, 9);
std::cout << " x \t y" << std::endl;
for (int i=0; i<9; ++i)
    {
    std::cout << " " << data[i].x << ", " << findY(a,b,c,data[i].x) << std::endl;
    }
}

Perhaps the parameters for your running average are set wrong (sample window too small or too large)?
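
To experiment with the window size, a centered running average could be sketched like this. `movingAverage` and its `window` parameter are illustrative names, and truncating the window at the ends of the data is just one possible edge policy.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Centered moving average over `window` samples.
// Near the ends the window is truncated to the samples that exist.
std::vector<double> movingAverage(const std::vector<double>& in, std::size_t window) {
    assert(window % 2 == 1);  // odd window so it can be centered on each sample
    const std::size_t half = window / 2;
    std::vector<double> out(in.size());
    for (std::size_t i = 0; i < in.size(); ++i) {
        const std::size_t lo = (i >= half) ? i - half : 0;
        const std::size_t hi = std::min(in.size() - 1, i + half);
        double sum = 0.0;
        for (std::size_t k = lo; k <= hi; ++k) sum += in[k];
        out[i] = sum / static_cast<double>(hi - lo + 1);
    }
    return out;
}
```

A window of 1 returns the input unchanged. Wider windows suppress more noise but also flatten the peak, which may be why a running average does not preserve the min/max on a data set this small.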

Is it just noise superimposed on your bell curve? How close is the noise frequency to that of the signal you're trying to retrieve? A picture of what you're trying to extract might help us identify a solution.

You could try some sort of fitting algorithm using a least squares fit if you can make a reasonable guess of the function parameters. Those sorts of techniques often have some immunity to noise.

How about a simple digital low-pass filter?

y[0] = x[0];
for (i = 1; i < len; ++i)
    y[i] = a * x[i] + (1.0 - a) * y[i - 1];

In this case, x[] is your input data and y[] is the filtered output. The a coefficient is a value between 0 and 1 that you should tweak. An a value of 1 reproduces the input, and the cut-off frequency decreases as a approaches 0.
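
Wrapped into a self-contained function, the same filter might look like the sketch below. The name `lowPass` and the use of `std::vector` are choices made here for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Single-pole low-pass filter: y[i] = a * x[i] + (1 - a) * y[i - 1].
// a == 1 reproduces the input; smaller a smooths more.
std::vector<double> lowPass(const std::vector<double>& x, double a) {
    assert(a >= 0.0 && a <= 1.0);
    std::vector<double> y(x.size());
    if (x.empty()) return y;
    y[0] = x[0];  // seed the filter with the first sample
    for (std::size_t i = 1; i < x.size(); ++i)
        y[i] = a * x[i] + (1.0 - a) * y[i - 1];
    return y;
}
```

Because each output depends on the previous output, the filtered curve lags behind the input, so the peak will shift slightly to the right as a gets smaller.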
