Matlab - How to remove outliers from a set of 2D points?

My question has two parts:

  1. I have two 1D arrays containing X and Y values. How can I create another 1D array where each element is a 2D point?
  2. How can I remove outliers from the resulting array?

For example, something like this:

x = [1 3 2 4 2 3 400];
y = [2 3 1 4 2 1 500];
xy = [[1 2] [3 3] [2 1] [4 4] [2 2] [3 1] [400 500]];
result = rmoutliers(xy, 'mean');

The result should look like:

result = [[1 2] [3 3] [2 1] [4 4] [2 2] [3 1]]

My goal is to remove the outlier points from a set of points like this one (the points forming a line at the top):

[image]

First, create an n-by-2 matrix in which each row is one point:

x = [1 3 2 4 2 3 400]';
y = [2 3 1 4 2 1 500]';
xy = [x, y]

Now xy takes the following form:

xy = 
     1     2
     3     3
     2     1
     4     4
     2     2
     3     1
   400   500

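Now pass this matrix through rmoutliers. By default it removes every row that contains a value more than three scaled median absolute deviations away from its column median: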

result = rmoutliers(xy);

The value of result should now be:

result =
     1     2
     3     3
     2     1
     4     4
     2     2
     3     1
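
To see why the default method drops the last row while the 'mean' option used in the question would not, the two criteria can be reproduced by hand. This is only a sketch, assuming the documented rules of rmoutliers (more than three scaled median absolute deviations from the median by default, more than three standard deviations from the mean for 'mean'); all variable names are illustrative.

```matlab
x = [1 3 2 4 2 3 400]';
y = [2 3 1 4 2 1 500]';
xy = [x, y];
n = size(xy, 1);

% Default ('median') criterion: more than 3 scaled MADs from the column median
c = -1 / (sqrt(2) * erfcinv(3/2));            % scaling constant, approx. 1.4826
med = median(xy, 1);
devMed = abs(xy - repmat(med, n, 1));
scaledMad = c * median(devMed, 1);
isOutMedian = devMed > 3 * repmat(scaledMad, n, 1);

% 'mean' criterion: more than 3 standard deviations from the column mean
mu = mean(xy, 1);
sigma = std(xy, 0, 1);
isOutMean = abs(xy - repmat(mu, n, 1)) > 3 * repmat(sigma, n, 1);

% Drop every row that has an outlier in any column
resultMedian = xy(~any(isOutMedian, 2), :);   % 6 rows remain
resultMean = xy(~any(isOutMean, 2), :);       % all 7 rows remain
```

With only seven points, a single extreme value inflates the standard deviation so much that no point can lie more than (n-1)/sqrt(n), about 2.27, standard deviations from the mean. The three-sigma rule therefore removes nothing here, which is why the median-based default is the safer choice for small samples.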

As a note, a plain numeric 1D array cannot hold elements that are themselves 2D points, because an array whose elements have two coordinates is a 2-dimensional array by definition. Keep things simple and build a 2-dimensional matrix from the start!

A custom function rmoutliers.m could look something like the following (note that a file with this name shadows the built-in rmoutliers of recent MATLAB releases):

function [result] = rmoutliers(x, y, tol)
% rmoutliers: main function,
% removes the points of the [x, y] series whose distance from the
% origin differs from the mean distance by more than tol (a scalar).
% Returns a 2-by-m matrix: first row x, second row y.
x = x(:).';   % force row vectors so both orientations are accepted
y = y(:).';
dist = calcDist(x, y);
avgDist = calcMean(dist);

% Logical mask of the points to keep; no sentinel value is needed,
% so legitimate coordinates equal to -1 are not deleted by mistake
keep = abs(dist - avgDist) <= tol;
result = [x(keep); y(keep)];

end



function [dist] = calcDist(x, y)
% calcDist: Euclidean distance of each point (x(i), y(i))
% from the origin
dist = sqrt(x.^2 + y.^2);

end



function [avg] = calcMean(dist)
% calcMean: average of input array
avg = sum(dist) / length(dist);

end

All of this goes in its own file, rmoutliers.m, in your Documents/MATLAB directory. It should be invoked from the main Matlab prompt by typing:

x = [1 3 2 4 2 3 400];
y = [2 3 1 4 2 1 500];
result = rmoutliers(x, y, 100);

where 100 is just an example tolerance: a point is treated as an outlier when its distance from the origin differs from the mean distance of all points by more than this value.
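
The tolerance has to be picked by hand, and the mean distance is itself pulled towards the outlier: in this example it is roughly 94, so the six ordinary points deviate from it by about 89 to 92 and are kept only because 100 is slightly larger. As one possible alternative, which is not part of the answer above, the distance can be measured from the median point and the threshold derived from the data. The factor of 3 and all variable names are assumptions made for illustration.

```matlab
x = [1 3 2 4 2 3 400];
y = [2 3 1 4 2 1 500];

% Distance of every point from the median point (a robust "centre")
cx = median(x);
cy = median(y);
d = sqrt((x - cx).^2 + (y - cy).^2);

% Data-driven threshold: median distance plus 3 scaled MADs of the distances
% (1.4826 is the usual scaling constant for normally distributed data)
madD = 1.4826 * median(abs(d - median(d)));
keep = d <= median(d) + 3 * madD;

% n-by-2 matrix of the points that survive, one point per row
result = [x(keep)', y(keep)'];
```

For the sample data this keeps the six clustered points and drops (400, 500). If more than half of the distances are identical, the MAD becomes zero and the rule turns overly strict, so this sketch suits data with some natural spread.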

EDIT: I forgot to output the members of result as pairs. You can use a cell array for that. After running the function, type at the prompt:

n = size(result, 2);   % number of points left after outlier removal
C = cell(1, n);
for i = 1:n
    C{i} = [result(1,i), result(2,i)];   % one [x y] pair per cell
end

% to read back from the cell array:
D = cell2mat(C);
D = reshape(D, 2, []);
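
If the loop is not needed, the same pairs can be produced in a single call. This is a sketch that assumes result is the 2-by-n matrix returned by the function above (written out by hand here so the snippet runs on its own); passing 1 as the second argument of num2cell keeps each column together in its own cell.

```matlab
% 2-by-n matrix of surviving points: first row x, second row y
result = [1 3 2 4 2 3; 2 3 1 4 2 1];

% One cell per point, each holding a 2-by-1 column [x; y]
C = num2cell(result, 1);

% Read the third point back
p = C{3};

% Rebuild the 2-by-n matrix from the cell array
D = cell2mat(C);
```

Because every cell holds a whole column, cell2mat restores the original matrix directly and no reshape is required.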
