
Finding points along a plot in Matlab

I have the following plot and a file of the data which creates that plot. I would like to have Matlab find the following points for me:

  1. [y,x] for the peak marked by the 100% line
  2. [x] for where the plot crosses the y=0 line
  3. [x] for where y is 50% and 20% of the peak found in part 1.

Are there any add-on tools or packages that can help me accomplish this? I need to do this for a collection of plots, so something reasonably automated would be ideal.

I can certainly do the programming and calculation parts in Matlab; it's just a matter of loading the data file, matching it to a curve or function, and finding the various [x,y] coordinates.

[Figure: data plot]

OK, here goes. As far as I know, there is no built-in routine in Matlab that does what you want; you'll have to write one yourself. A couple of things to note:

  • As should be obvious, linearly interpolated data is the easiest case and should pose no problem.

  • Using a single polynomial interpolation is also not too hard, although there are a few more details to take care of. Finding the peak should be the first order of business, which involves finding the roots of the derivative (using roots, for example). Once the peak is found, find the roots of the polynomial at all desired levels (0%, 20%, 50%) by offsetting the polynomial by the corresponding amount.

  • Using cubic splines (spline) is the most complicated of all. The routine outlined above for general polynomials should be repeated for every subinterval on which a cubic is defined, taking into account that the maximum can also lie on the boundary of a subinterval, and that any roots and extrema found may lie outside the interval on which that cubic is valid (also don't forget the x-offsets used by spline).

Here is my implementation of all 3 methods:

%% Initialize
% ---------------------------

clc
clear all

% Create some bogus data
n = 25;

f = @(x) cos(x) .* sin(4*x/pi) + 0.5*rand(size(x));
x = sort( 2*pi * rand(n,1));
y = f(x);


%% Linear interpolation
% ---------------------------

% y peak
[y_peak, ind] = max(y);
x_peak = x(ind);

% y == 0%, 20%, 50%
lims = [0 20 50];
X = cell(size(lims));
Y = cell(size(lims));
for p = 1:numel(lims)

    % the current level line to solve for
    lim = y_peak*lims(p)/100;

    % points before and after passing through the current limit
    after    = (circshift(y<lim,1) & y>lim) | (circshift(y>lim,1) & y<lim);
    after(1) = false;
    before   = circshift(after,-1);

    xx = [x(before) x(after)];
    yy = [y(before) y(after)];

    % interpolate and insert new data
    new_X = x(before) - (y(before)-lim).*diff(xx,[],2)./diff(yy,[],2);
    X{p} = new_X;
    Y{p} = lim * ones(size(new_X));

end

% make a plot to verify
figure(1), clf, hold on
plot(x,y, 'r') % (this also plots the interpolation in this case)

plot(x_peak,y_peak, 'k.') % the peak

plot(X{1},Y{1}, 'r.') % the 0%  intersects
plot(X{2},Y{2}, 'g.') % the 20% intersects
plot(X{3},Y{3}, 'b.') % the 50% intersects

% finish plot
xlabel('X'), ylabel('Y'), title('Linear interpolation')
legend(...
    'Real data / interpolation',...
    'peak',...
    '0% intersects',...
    '20% intersects',...
    '50% intersects',...
    'location', 'southeast')



%% Cubic splines
% ---------------------------

% Find cubic splines interpolation
pp = spline(x,y);

% Finding the peak requires finding the maxima of all cubics in all
% intervals. This means evaluating the value of the interpolation on 
% the bounds of each interval, finding the roots of the derivative and
% evaluating the interpolation on those roots: 

coefs = pp.coefs;
derivCoefs = bsxfun(@times, [3 2 1], coefs(:,1:3));
LB = pp.breaks(1:end-1).'; % lower bounds of all intervals
UB = pp.breaks(2:end).';   % upper bounds of all intervals

% rename for clarity
a = derivCoefs(:,1);
b = derivCoefs(:,2);  
c = derivCoefs(:,3); 

% collect the candidate x-values (interval bounds and derivative roots)
x_extrema = [...
    LB, UB,...     
    LB + (-b + sqrt(b.*b - 4.*a.*c))./2./a,... % NOTE: data is offset by LB
    LB + (-b - sqrt(b.*b - 4.*a.*c))./2./a,... % NOTE: data is offset by LB
    ];

x_extrema = x_extrema(imag(x_extrema) == 0);
x_extrema = x_extrema( x_extrema >= min(x(:)) & x_extrema <= max(x(:)) );

% NOW find the peak
[y_peak, ind] = max(ppval(pp, x_extrema(:)));
x_peak = x_extrema(ind);

% y == 0%, 20% and 50%
lims = [0 20 50];
X = cell(size(lims));
Y = cell(size(lims));    
for p = 1:numel(lims)

    % the current level line to solve for
    lim = y_peak * lims(p)/100;

    % find all 3 roots of all cubics
    R = NaN(size(coefs,1), 3); 
    for ii = 1:size(coefs,1) 

        % offset coefficients to find the right intersects
        C = coefs(ii,:);
        C(end) = C(end)-lim;

        % NOTE: data is offset by LB
        Rr = roots(C) + LB(ii); 

        % prune roots
        Rr( imag(Rr)~=0 ) = NaN;
        Rr( Rr <= LB(ii) | Rr >= UB(ii) ) = NaN;
        % insert results
        R(ii,:) = Rr;
    end

    % now evaluate and save all valid points    
    X{p} = R(~isnan(R));
    Y{p} = ppval(pp, X{p});

end

% as a sanity check, plot everything 
xx = linspace(min(x(:)), max(x(:)), 20*numel(x));
yy = ppval(pp, xx);

figure(2), clf, hold on

plot(x,y, 'r') % the actual data
plot(xx,yy) % the cubic-splines interpolation 

plot(x_peak,y_peak, 'k.') % the peak

plot(X{1},Y{1}, 'r.') % the 0%  intersects
plot(X{2},Y{2}, 'g.') % the 20% intersects
plot(X{3},Y{3}, 'b.') % the 50% intersects

% finish plot
xlabel('X'), ylabel('Y'), title('Cubic splines interpolation')
legend(...
    'Real data',...
    'interpolation',...
    'peak',...
    '0% intersects',...
    '20% intersects',...
    '50% intersects',...
    'location', 'southeast')


%% (N-1)th degree polynomial
% ---------------------------

% Find best interpolating polynomial
coefs = bsxfun(@power, x, n-1:-1:0) \ y;
% (alternatively, you can use polyfit() to do this, but this is faster)

% To find the peak, we'll have to find the roots of the derivative: 
derivCoefs = (n-1:-1:1).' .* coefs(1:end-1);
Rderiv = roots(derivCoefs);
Rderiv = Rderiv(imag(Rderiv) == 0);
Rderiv = Rderiv(Rderiv >= min(x(:)) &  Rderiv <= max(x(:)));

[y_peak, ind] = max(polyval(coefs, Rderiv));
x_peak = Rderiv(ind);

% y == 0%, 20%, 50%
lims = [0 20 50];
X = cell(size(lims));
Y = cell(size(lims));
for p = 1:numel(lims)

    % the current level line to solve for
    lim = y_peak * lims(p)/100;

    % offset coefficients as to find the right intersects
    C = coefs;
    C(end) = C(end)-lim;

    % find and prune roots
    R = roots(C);
    R = R(imag(R) == 0);
    R = R(R>min(x(:)) & R<max(x(:)));

    % evaluate polynomial at these roots to get actual data
    X{p} = R;
    Y{p} = polyval(coefs, R);

end

% as a sanity check, plot everything 
xx = linspace(min(x(:)), max(x(:)), 20*numel(x));
yy = polyval(coefs, xx);

figure(3), clf, hold on

plot(x,y, 'r') % the actual data
plot(xx,yy) % the polynomial interpolation

plot(x_peak,y_peak, 'k.') % the peak

plot(X{1},Y{1}, 'r.') % the 0%  intersects
plot(X{2},Y{2}, 'g.') % the 20% intersects
plot(X{3},Y{3}, 'b.') % the 50% intersects

% finish plot
xlabel('X'), ylabel('Y'), title('(N-1)th degree polynomial')
legend(...
    'Real data',...
    'interpolation',...
    'peak',...
    '0% intersects',...
    '20% intersects',...
    '50% intersects',...
    'location', 'southeast')

This results in these three plots:

[Figures: linear interpolation, polynomial interpolation, cubic spline interpolation]

(Note that something goes wrong in the (N-1)th degree polynomial; the 20% crossings are all wrong towards the end. So, before copy-pasting, check everything a bit more thoroughly :)

As I said before, and as you can plainly see, interpolating with a single polynomial will often introduce a lot of problems if the underlying data is not suited for it. Also, as these plots show, the interpolation method very strongly affects where the intersections end up, so it is of the utmost importance that you have at least some idea of what model underlies your data.
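One inexpensive check before trusting a single high-degree polynomial is to look at the conditioning of the linear system it comes from and at how well the result reproduces the data it was built on. The sketch below does that for a noise-free, evenly spaced variant of the test data; the data and the variable names kappa and resid are assumptions made for illustration, and this is only a plausible contributor to the wrong crossings, not a diagnosis of them. Matlab is expected to warn about a badly scaled matrix when solving the system.

```matlab
% Hypothetical, noise-free variant of the test data (evenly spaced x)
n = 25;
x = linspace(0, 2*pi, n).';
y = cos(x) .* sin(4*x/pi);

% Same Vandermonde system as in the (N-1)th degree polynomial section
V = bsxfun(@power, x, n-1:-1:0);

% Condition number of the system; very large values mean the computed
% coefficients carry few (if any) reliable digits
kappa = cond(V);

% Solve for the coefficients (a "badly scaled" warning is expected here)
coefs = V \ y;

% How well does the polynomial reproduce the points it should pass through?
resid = max(abs(polyval(coefs, x) - y));

fprintf('cond(V) = %.3g, max residual at the data points = %.3g\n', kappa, resid);
```

If the condition number is far beyond 1/eps, or the residual is not small compared to the scale of y, roots computed from these coefficients deserve the same suspicion.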

For the general case, cubic splines are usually the best way to go. However, this is a generic method and will give you (and the readers of your publication) a false perception of accuracy in your data. Use cubic splines to get a first idea of where the intersects are and how they behave, but always come back and revisit your analysis once the real model becomes clearer. Certainly don't publish with cubic splines when they are only used to create smoother, more "visually appealing" curves through your data :)

This is not a complete answer, but Matlab has built-in functions that should accomplish most of what you want to do.

  • max can help you find the 100% line.
  • polyfit will give you a polynomial fit to a set of points, in a least-squares sense. If you want it to pass exactly through n points, I believe you need to use at least degree n-1.
  • roots will give you the zero-crossings of the polynomial you just found. You can also use it to find the 20% and 50% crossings by subtracting a constant. Where there are multiple crossings, you will want the ones closest to the maximum you found originally. (Are you sure the crossings will always exist?)
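The three functions above can be combined roughly as follows. This is a minimal sketch on made-up data (a single parabola), and the degree-2 fit is an assumption that happens to suit it. Real data will need a degree chosen with care, and the peak is taken from the fitted polynomial via polyder rather than from max on the raw samples.

```matlab
% Hypothetical sample data: one smooth bump (not the asker's data)
x = linspace(0, 4, 41).';
y = -(x - 1).*(x - 3.5);

% Least-squares polynomial fit; degree 2 is an assumption for this toy data
p = polyfit(x, y, 2);

% Peak: real roots of the derivative that fall inside the data range
xc = roots(polyder(p));
xc = xc(imag(xc) == 0 & xc >= min(x) & xc <= max(x));
[y_peak, k] = max(polyval(p, xc));
x_peak = xc(k);

% Crossings at 0%, 20% and 50% of the peak:
% subtract the level from the constant term, then find the roots
levels  = [0 0.2 0.5];
x_cross = cell(size(levels));
for i = 1:numel(levels)
    q = p;
    q(end) = q(end) - levels(i)*y_peak;
    r = roots(q);
    r = r(imag(r) == 0 & r >= min(x) & r <= max(x));
    x_cross{i} = sort(r);   % may be empty if the level is never reached
end
```

Each x_cross{i} holds every crossing in the data range; picking the one nearest to x_peak on each side is a separate step, and an empty result means the crossing does not exist for that level.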

To find the global maximum, use the MAX function:

[ymax, imax] = max(y);
xmax = x(imax);
line(xlim,[ymax ymax],'Color','r')
line(xmax,ymax,'Color','r','LineStyle','none','Marker','o') % 'o' is a Marker, not a LineStyle

For the rest you can use a great FileExchange submission, "Fast and Robust Curve Intersections".

The line at y=0 can be defined with xlim and yline0 = [0 0]. Then you can do:

[x0, y0] = intersections(x,y,xlim,yline0); % function from FileExchange
x0close(1) = xmax - min(xmax-x0(x0<xmax));
x0close(2) = xmax + min(x0(x0>xmax)-xmax);
y0close = y0(ismember(x0,x0close));
line(xlim,yline0,'Color','r')
line(x0close,y0close,'Color','r','LineStyle','none','Marker','o') % 'o' is a Marker, not a LineStyle

The same can be done for 20% and 50%, except

yline20 = repmat((ymax - y0(1))*0.2,1,2);

All of this assumes that you want the intersections of straight line segments, as drawn in your plot, not of an interpolated curve.
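If installing the FileExchange function is not an option, the same nearest-crossing idea can be sketched with plain linear interpolation between neighbouring samples. The data below is made up for illustration, and x_left and x_right are hypothetical names. Note that a sample lying exactly on a level is not counted as a crossing by this simple sign test.

```matlab
% Hypothetical sample data (not the asker's): one main peak with side lobes
x = [0 1 2 3 4 5 6];
y = [-1 1 -1 3 -1 1 -1];

[ymax, imax] = max(y);
xmax = x(imax);

% Levels to solve for: 0%, 20% and 50% of the peak
levels  = [0 0.2 0.5] * ymax;
x_left  = NaN(size(levels));   % nearest crossing left of the peak
x_right = NaN(size(levels));   % nearest crossing right of the peak

for i = 1:numel(levels)
    % Segments whose end points lie strictly on opposite sides of the level
    d = y - levels(i);
    k = find(d(1:end-1) .* d(2:end) < 0);

    % Linear interpolation inside each of those segments
    xc = x(k) - d(k) .* (x(k+1) - x(k)) ./ (d(k+1) - d(k));

    % Keep only the crossing closest to the peak on each side, if any
    xl = max(xc(xc < xmax));
    xr = min(xc(xc > xmax));
    if ~isempty(xl), x_left(i)  = xl; end
    if ~isempty(xr), x_right(i) = xr; end
end
```

A NaN left in x_left or x_right means that no crossing exists on that side for that level, which is worth checking for when running this over a collection of data files.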
