How do I get all the rows meeting some criteria out of a big matrix?

I have a big matrix (on the order of 1GB) where each row represents the (x,y) coordinate of a point I sampled a surface at, and its z-height.

How do I get all the points less than some Euclidean distance away from a given (x,y) coordinate?

Currently I am doing this awful thing:


% mtx_pointList is a large matrix with each row containing a sample: [x y z]. 
% We want to get all the samples whose (x,y) point is less than dDistance 
%   away from the vector: v_center = [x y].

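mtx_regionSamples = zeros(0,3);
for k = 1:size(mtx_pointList,1)
    % norm() already returns the distance, so compare against dDistance itself
    if norm( mtx_pointList(k,[1 2]) - v_center ) < dDistance
        % append the matching sample as a new row
        mtx_regionSamples = [ mtx_regionSamples; mtx_pointList(k,:) ];
    end
end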

...but in my application this loop would have to be run around 250k times.

How do I make it do the same thing faster?

Use pdist2 (its default option is Euclidean distance):

ind = pdist2(mtx_pointList(:,[1 2]), v_center) < dDistance; %// logical index
result = mtx_pointList(ind,:);

If the matrix is too large, divide it into chunks of as many rows as your memory allows, and loop over the chunks.
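The chunked loop could look roughly like the sketch below. The toy data and the chunk size chunkRows are assumptions made for illustration, and the per-chunk distance test uses bsxfun on squared distances instead of pdist2 so that it runs without the Statistics Toolbox.

```matlab
% Toy data standing in for the real matrix (hypothetical values).
mtx_pointList = [0 0 1; 0.3 0.4 2; 1 1 3; 0.1 0 4; 2 2 5];
v_center  = [0 0];
dDistance = 0.6;
chunkRows = 2;                      % assumed chunk size; tune to available memory

nRows = size(mtx_pointList, 1);
keep  = false(nRows, 1);            % logical mask over all rows

for first = 1:chunkRows:nRows
    last = min(first + chunkRows - 1, nRows);
    % Offsets of this chunk's (x,y) columns from the center.
    d = bsxfun(@minus, mtx_pointList(first:last, [1 2]), v_center);
    % Compare squared distances, so no sqrt is needed.
    keep(first:last) = sum(d.^2, 2) < dDistance^2;
end

result = mtx_pointList(keep, :);
```

Only the logical mask spans the whole matrix; the temporary offsets never exceed chunkRows rows at a time.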

bsxfun

If you don't have pdist2 (Statistics Toolbox), here's one way to compute distances with bsxfun:

da = bsxfun(@minus,mtx_pointList(:,[1 2]),permute(v_center,[3 2 1]));
distances = sqrt(sum(da.^2,2));

Then find the points that meet your criteria:

distThresh = 0.5; % for example
indsClose = distances < distThresh; % logical index; semicolon suppresses printing the whole vector
result = mtx_pointList(indsClose,:);
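On MATLAB R2016b or later, arithmetic operators expand singleton dimensions implicitly, so for a single center the bsxfun call can be replaced by a plain subtraction. A minimal sketch, with made-up sample values:

```matlab
mtx_pointList = [0 0 1; 0.3 0.4 2; 1 1 3];   % hypothetical sample data
v_center   = [0 0];
distThresh = 0.6;

da = mtx_pointList(:, [1 2]) - v_center;      % N-by-2 minus 1-by-2, expanded implicitly
distances = sqrt(sum(da.^2, 2));              % N-by-1 Euclidean distances
indsClose = distances < distThresh;           % logical index
result = mtx_pointList(indsClose, :);
```

On older releases the subtraction raises a dimension-mismatch error, which is why the bsxfun form is the portable one.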

alternative

You can also use an alternate form of Euclidean (2-norm) distance,

||A-B|| = sqrt ( ||A||^2 + ||B||^2 - 2*A.B )

In MATLAB code:

a = mtx_pointList(:,[1 2]);
b = v_center;
aa = dot(a,a,2); % or sum(a.*a,2)
bb = dot(b,b,2);
ab = a*b.';
distances = sqrt(aa + bb - 2*ab); % bsxfun needed if b is more than one point

As Luis Mendo points out, the sqrt is not necessary if you threshold against distThresh^2.
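A sketch of that squared-distance variant, using the expanded form above with toy values chosen for illustration. The max(...,0) clamp is an addition not present in the answer: the expanded form can come out slightly negative through floating-point round-off when a point nearly coincides with the center.

```matlab
mtx_pointList = [0 0 1; 0.3 0.4 2; 1 1 3];   % hypothetical sample data
v_center   = [0.3 0.4];
distThresh = 0.6;

a  = mtx_pointList(:, [1 2]);
b  = v_center;
aa = sum(a.*a, 2);                 % N-by-1 squared norms of the points
bb = sum(b.*b, 2);                 % scalar squared norm of the center
ab = a * b.';                      % N-by-1 dot products with the center

sqDist = max(aa + bb - 2*ab, 0);   % clamp tiny negative values from round-off
indsClose = sqDist < distThresh^2; % compare squared quantities, no sqrt
result = mtx_pointList(indsClose, :);
```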
