How to find number of unique elements in each row of a matrix?

I'm trying to implement some sort of interpolation algorithm. I is an N*4 matrix holding the indices of the points surrounding each of N other points. The elements in a row of I may not be unique, meaning two or more of them may refer to the same point. I want to know how many unique indices there are in each row, and I want to do it as fast as possible since N is big!

You need to use the unique function on each row and count the elements of the result:

arrayfun(@(x) numel(unique(I(x,:))), (1:size(I,1)).')

The index array is transposed so that the result is a column vector.
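As a quick check of the one-liner above, here is a small sketch on a made-up 3-by-4 index matrix. The values are purely illustrative and do not come from the question.

```matlab
% Hypothetical 3-by-4 index matrix; values are made up for illustration.
I = [1 2 3 4;   % four distinct indices
     5 5 6 7;   % one repeated index
     8 8 8 8];  % all indices identical

% Count the distinct values in each row, one row at a time.
Nu = arrayfun(@(x) numel(unique(I(x,:))), (1:size(I,1)).');

% Nu is the column vector [4; 3; 1].
```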

Well, Mohsen's answer is a general solution for this problem, but arrayfun was too slow for me. So I thought a little more about it and found a much faster solution. I compare all pairs of columns and increment a counter whenever they are equal:

tic;
N = size(I, 1); % number of rows (points)
S = zeros(N, 1, 'uint32');
Nu = S+4; % in my case most points are surrounded by four different points
for i=1:3
    for j=(i+1):4
        S = S + uint32(I(:, i)==I(:, j));
    end
end
% Nu(S==0) = 4;
Nu(S==1) = 3;
Nu((S==2)|(S==3)) = 2; % why? :)
Nu(S==6) = 1;
toc;

For N=189225, arrayfun takes 14.73s on my PC, but the summation takes only 0.04s.
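The mapping from the pair count S to the number of unique values can be checked by hand with one row per repetition pattern. The sketch below uses made-up rows. With four columns there are six column pairs, and the only pair counts that can occur are 0, 1, 2, 3 and 6.

```matlab
% One made-up row per repetition pattern of four indices.
I = [1 2 3 4;   % a b c d -> 0 equal pairs, 4 unique
     1 1 2 3;   % a a b c -> 1 equal pair,  3 unique
     1 1 2 2;   % a a b b -> 2 equal pairs, 2 unique
     1 1 1 2;   % a a a b -> 3 equal pairs, 2 unique
     1 1 1 1];  % a a a a -> 6 equal pairs, 1 unique

N = size(I, 1);
S = zeros(N, 1, 'uint32');
for i = 1:3
    for j = (i+1):4
        % Add 1 for every row in which columns i and j hold the same index.
        S = S + uint32(I(:, i) == I(:, j));
    end
end

% Same lookup as in the answer: pair count -> number of unique values.
Nu = zeros(N, 1, 'uint32') + 4;
Nu(S == 1) = 3;
Nu((S == 2) | (S == 3)) = 2;
Nu(S == 6) = 1;

% S is [0; 1; 2; 3; 6] and Nu is [4; 3; 2; 2; 1].
```

Two disjoint pairs (S == 2) and one triple (S == 3) both leave exactly two distinct values, which is why those two cases share a line in the lookup.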

Edit: Handling different numbers of columns

Here's a modification of the code above. It also gives the positions of the unique values in each row! This one doesn't have the :) problem and can be used for larger numbers of columns. It still takes 0.04s on my PC for 189225 rows.

tic;
uniq = true(N, 4);
for i=1:3
    for j=(i+1):4
        uniq(I(:, i)==I(:, j), j) = false;
    end
end
Nu = sum(uniq, 2);
toc;
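For a matrix with an arbitrary number of columns, the same idea can be written with the column count read from the input. The following is only a sketch, not the author's exact code: the variable M and the example matrix are introduced here for illustration.

```matlab
% Made-up 4-by-6 example; any N-by-M matrix of indices works the same way.
I = [1 2 3 4 5 6;
     1 1 2 2 3 3;
     7 7 7 7 7 7;
     4 5 4 5 4 9];

[N, M] = size(I);       % M is a name introduced for this sketch
uniq = true(N, M);      % uniq(r, c) stays true if I(r, c) is a first occurrence
for i = 1:(M-1)
    for j = (i+1):M
        % In these rows column j repeats column i, so it is not a first occurrence.
        uniq(I(:, i) == I(:, j), j) = false;
    end
end
Nu = sum(uniq, 2);      % Nu is [6; 3; 1; 3]
```

The loop performs M*(M-1)/2 column comparisons, so the cost grows quadratically with the number of columns while each comparison stays vectorized over all N rows.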

Edit(2): Comparison with EBH's answer

After a while I needed this for another problem, where I wanted the number of unique elements in each row of matrices with different numbers of columns. So I compared my code with EBH's to see whether theirs is faster. I ran both codes on matrices with 10K to 100K rows and 6 to 60 columns. The results are the average time spent (in seconds) over 3 different runs:

[Image: timing results of the comparison]

I'm testing this in 2016a, and the performance of for-loops has improved significantly in the latest versions of MATLAB. So you may need to compare the two yourself if you want to run this in older versions.

Here is a super fast way to do that without loops:

accumarray(repmat(1:size(I,1),1,size(I,2)).',I(:),[],@(x) numel(unique(x)))

This will give you a vector of size N, where the element at position k is the number of unique elements in I(k,:).
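To see how the grouping works, the one-liner can be split into two steps on a small made-up matrix. The variable name rows is introduced here only for illustration.

```matlab
% Hypothetical 3-by-4 index matrix; values are made up for illustration.
I = [1 2 3 4;
     5 5 6 7;
     8 8 8 8];

% Row number of every element, in the same column-major order as I(:).
rows = repmat(1:size(I,1), 1, size(I,2)).';   % [1; 2; 3; 1; 2; 3; ...]

% Group the values by row number and count the distinct values per group.
Nu = accumarray(rows, I(:), [], @(x) numel(unique(x)));

% Nu is the column vector [4; 3; 1].
```

Note that unique is still called once per row inside accumarray, so it may be worth timing this against the loop-based versions on your own data.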
