How to find the inflection point of a noisy data series using MATLAB?

Given the following curve:

[image of the curve]

I'd like to determine the index of the x data point where the curve begins to rise in earnest (in this example, around x = 15).

While I understand that derivatives can be used to determine inflection points, the data is noisy, and I'm unsure whether that approach would let me clearly identify the "true inflection point" (x = 15 in this case).

I'm wondering if a simpler approach would be feasible, such as

  • finding 4 data points where x1 < x2 < x3 < x4
  • returning the index of x1

Do you have any suggestions on how to accomplish this?

Sample data from the above curve:

 index       SQMean   
_____    ____________

'0'      '139.428574'
'1'      '133.298706'
'2'      '135.961044'
'3'      '143.688309'
'4'      '133.298706'
'5'      '133.181824'
'6'      '134.896103'
'7'      '146.415588'
'8'      '142.324677'
'9'      '128.168839'
'10'     '146.116882'
'11'     '146.766235'
'12'     '134.675323'
'13'     '138.610382'
'14'     '140.558441'
'15'     '128.662338'
'16'     '138.480515'
'17'     '153.610382'
'18'     '156.207794'
'19'     '183.428574'
'20'     '220.324677'
'21'     '224.324677'
'22'     '230.415588'
'23'     '226.766235'
'24'     '223.935059'
'25'     '229.922073'
'26'     '234.389618'
'27'     '235.493500'
'28'     '225.727280'
'29'     '241.623383'
'30'     '225.805191'
'31'     '240.896103'
'32'     '224.090912'
'33'     '230.467529'
'34'     '248.285721'
'35'     '233.779221'
'36'     '225.532471'
'37'     '247.337662'
'38'     '233.000000'
'39'     '241.740265'
'40'     '235.688309'
'41'     '238.662338'
'42'     '236.636368'
'43'     '236.025970'
'44'     '234.818176'
'45'     '240.974030'
'46'     '251.350647'
'47'     '241.857147'
'48'     '242.623383'
'49'     '245.714279'
'50'     '250.701294'
'51'     '229.415588'
'52'     '236.909088'
'53'     '243.779221'
'54'     '244.532471'
'55'     '241.493500'
'56'     '245.480515'
'57'     '244.324677'
'58'     '244.025970'
'59'     '231.987015'
'60'     '238.740265'
'61'     '239.532471'
'62'     '232.363632'
'63'     '242.454544'
'64'     '243.831161'
'65'     '229.688309'
'66'     '239.493500'
'67'     '247.324677'
'68'     '245.324677'
'69'     '244.662338'
'70'     '238.610382'
'71'     '243.324677'
'72'     '234.584412'
'73'     '235.181824'
'74'     '228.974030'
'75'     '228.246750'
'76'     '230.519485'
'77'     '231.441559'
'78'     '236.324677'
'79'     '229.935059'
'80'     '238.701294'
'81'     '236.441559'
'82'     '244.350647'
'83'     '233.714279'
'84'     '243.753250'
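The simpler rule proposed in the question (find four successive points that each exceed the previous one and return the index of the first) can be sketched as follows. This is only an illustration, assuming the four points are meant to be consecutive samples; the variable names `runLength`, `rising`, `hits`, and `firstIndex` are made up for this example, and only the first 25 samples of the table are used to keep it short.

```matlab
% First 25 SQMean samples from the table above (index 0 to 24).
y = [139.428574 133.298706 135.961044 143.688309 133.298706 ...
     133.181824 134.896103 146.415588 142.324677 128.168839 ...
     146.116882 146.766235 134.675323 138.610382 140.558441 ...
     128.662338 138.480515 153.610382 156.207794 183.428574 ...
     220.324677 224.324677 230.415588 226.766235 223.935059];

runLength = 4;                    % assumed: number of strictly increasing points
rising = diff(y) > 0;             % rising(k) is true when y(k+1) > y(k)

% A run of runLength increasing points needs runLength-1 consecutive rises,
% so count the rises in every sliding window of that width.
hits = conv(double(rising), ones(1, runLength - 1), 'valid') == runLength - 1;

k = find(hits, 1, 'first');       % 1-based position of x1 in y (empty if no run)
firstIndex = k - 1;               % convert to the 0-based index used in the table
disp(firstIndex)
```

On these samples the first such run starts at index 15. The rule is sensitive to the chosen run length, though: with three points instead of four it would already fire at index 1, so it may need tuning for other noisy series.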

If this is a one-off estimate, one thing you can do is use the curve fitting tool from the Curve Fitting Toolbox. Here is an example where I fitted a piecewise linear function to your data:

[screenshot of the fit in the curve fitting tool]

The function has the form

a * (x < b) + c * (x > d) + ((x - b) / (d - b) * (c - a) + a) * (x >= b) * (x <= d)

which says: there is a constant part for x < b with value a, another constant part for x > d with value c, and a linear ramp connecting them.
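To see how the three terms combine, the model can be evaluated directly. The snippet below is a minimal sketch: the handle name `ramp` and the test points are made up for illustration, and the parameter values are the starting estimates used for the fit (a = 140, b = 15, c = 230, d = 20), not fitted values.

```matlab
% Piecewise model: constant a for x < b, constant c for x > d,
% and a linear ramp from a to c on the interval [b, d].
ramp = @(x, a, b, c, d) a .* (x < b) + c .* (x > d) + ...
    ((x - b) ./ (d - b) .* (c - a) + a) .* (x >= b) .* (x <= d);

x = [10 15 17.5 20 25];               % points before, on, and after the ramp
yModel = ramp(x, 140, 15, 230, 20);   % a = 140, b = 15, c = 230, d = 20
disp(yModel)                          % 140 140 185 230 230
```

The logical masks make exactly one of the three terms non-zero at every x, so the result is continuous at both breakpoints b and d.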

It is hard to fit such a function, and the fit will only work well if you provide decent starting estimates (see the small window in the screenshot). It is therefore not a way to automate the process, only a way to obtain improved estimates.

In this case, from a starting estimate of b = 15, the fit provides an improved estimate of b = 16.58 with a 95% confidence interval of [15.96, 17.2], which indicates that indices 0 through 16 belong to the initial constant part.

The curve fitting tool can also generate code from your GUI specifications. In this case the result is:

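% index and SQMean must be numeric vectors; if they are stored as text
% (as the quoted values in the table suggest), convert them first,
% e.g. with str2double.
[xData, yData] = prepareCurveData( index, SQMean );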

% Set up fittype and options.
ft = fittype( 'a * (x < b) + c * (x > d) + ((x - b) / (d - b) * (c - a) + a) * (x >= b) * (x <= d)', 'independent', 'x', 'dependent', 'y' );
opts = fitoptions( ft );
opts.Display = 'Off';
opts.Lower = [-Inf -Inf -Inf -Inf];
opts.StartPoint = [140 15 230 20];
opts.Upper = [Inf Inf Inf Inf];

% Fit model to data.
[fitresult, gof] = fit( xData, yData, ft, opts );

% Plot fit with data.
figure( 'Name', 'untitled fit 1' );
h = plot( fitresult, xData, yData );
legend( h, 'SQMean vs. index', 'untitled fit 1', 'Location', 'NorthEast' );
% Label axes
xlabel( 'index' );
ylabel( 'SQMean' );
grid on
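The estimate of b and its confidence interval can also be read programmatically from the fit object instead of from the GUI. The following is a sketch under a few assumptions: the Curve Fitting Toolbox is installed, the sample data from the question is entered as numeric column vectors, and the names `names`, `bounds`, `bEstimate`, `bInterval`, and `lastFlatIndex` are made up for illustration. The exact numbers depend on the fit converging from the given starting estimates.

```matlab
% Requires the Curve Fitting Toolbox.
% Sample data from the question, as numeric column vectors.
index = (0:84).';
SQMean = [139.428574 133.298706 135.961044 143.688309 133.298706 ...
          133.181824 134.896103 146.415588 142.324677 128.168839 ...
          146.116882 146.766235 134.675323 138.610382 140.558441 ...
          128.662338 138.480515 153.610382 156.207794 183.428574 ...
          220.324677 224.324677 230.415588 226.766235 223.935059 ...
          229.922073 234.389618 235.493500 225.727280 241.623383 ...
          225.805191 240.896103 224.090912 230.467529 248.285721 ...
          233.779221 225.532471 247.337662 233.000000 241.740265 ...
          235.688309 238.662338 236.636368 236.025970 234.818176 ...
          240.974030 251.350647 241.857147 242.623383 245.714279 ...
          250.701294 229.415588 236.909088 243.779221 244.532471 ...
          241.493500 245.480515 244.324677 244.025970 231.987015 ...
          238.740265 239.532471 232.363632 242.454544 243.831161 ...
          229.688309 239.493500 247.324677 245.324677 244.662338 ...
          238.610382 243.324677 234.584412 235.181824 228.974030 ...
          228.246750 230.519485 231.441559 236.324677 229.935059 ...
          238.701294 236.441559 244.350647 233.714279 243.753250].';

% Same piecewise model and starting estimates as in the generated code.
ft = fittype( 'a * (x < b) + c * (x > d) + ((x - b) / (d - b) * (c - a) + a) * (x >= b) * (x <= d)', 'independent', 'x', 'dependent', 'y' );
opts = fitoptions( ft );
opts.Display = 'Off';
opts.StartPoint = [140 15 230 20];        % coefficient order: a, b, c, d
fitresult = fit( index, SQMean, ft, opts );

names = coeffnames( fitresult );          % cell array of coefficient names
bounds = confint( fitresult, 0.95 );      % row 1: lower bounds, row 2: upper bounds
bEstimate = fitresult.b;                  % fitted start of the ramp
bInterval = bounds(:, strcmp( names, 'b' ));
lastFlatIndex = floor( bEstimate );       % last integer index before the ramp starts

fprintf( 'b = %.2f, 95%% CI [%.2f, %.2f], last flat index = %d\n', ...
    bEstimate, bInterval(1), bInterval(2), lastFlatIndex );
```

Taking `floor` of the estimate mirrors the reading given earlier, where b = 16.58 was interpreted as indices 0 through 16 belonging to the constant part.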
