How to substitute `find` commands with logical indexing (MATLAB), for looking up vector value positions of unique values?
In MATLAB, I have a `for` loop that runs through many iterations and fills a `sparse` matrix. The program is very slow, and I would like to optimize it so that it finishes in a reasonable time. In two lines I use the command `find`, and the MATLAB editor warns me that using logical indexing instead of `find` will improve performance. My code is quite similar to the code in the MathWorks newsreader recommendation, where there is a vector of values and a vector of unique values generated from it. It uses `find` to obtain the index in the unique values (for updating the values in a matrix). To be brief, the code given is:
positions = find(X0_outputs == unique_outputs(j,1));
% should read
positions = X0_outputs == unique_outputs(j,1);
But the last line does not give the index; it gives a vector of zeros and ones. Here is an illustrative example. Make a set of indices:
tt=round(rand(1,6)*10)
tt = 3 7 1 7 1 7
Make a vector of unique values:
ttUNI=unique(tt)
ttUNI = 1 3 7
Use `find` to get the position index of the value in the set of unique values:
find(ttUNI(:) == tt(1))
ans = 2
Compare with using logical indexing:
(ttUNI(:) == tt(1))
ans =
0
1
0
Having the value `2` is a lot more useful than that binary vector when I need to update the indices for a matrix. For my matrix, I can say `mat(find(ttUNI(:) == tt(1)), 4)` and that works, whereas using `(ttUNI(:) == tt(1))` needs post-processing.
Is there a neat and efficient way of doing what is needed? Or is the use of `find` unavoidable in circumstances such as these?
UPDATE: I will include code here, as recommended by user @Jonas, to give better insight into the problem I am having and to report some of the profiler's results.
ALL_NODES = horzcat(network(:,1)',network(:,2)');
NUM_UNIQUE = unique(ALL_NODES);%unique and sorted
UNIQUE_LENGTH = length(NUM_UNIQUE);
TIME_MAX = max(network(:,3));
WEEK_NUM = floor((((TIME_MAX/60)/60)/24)/7);%divide seconds for minutes, for hours, for days and how many weeks
%initialize tensor of temporal networks
temp = length(NUM_UNIQUE);
%making the tensor a sparse 2D tensor!!! So each week is another replica of
%the matrix below
Atensor = sparse(length(NUM_UNIQUE)*WEEK_NUM,length(NUM_UNIQUE));
WEEK_SECONDS = 60*60*24*7;%number of seconds in a week
for ii=1:size(network,1)%go through all rows/observations
    WEEK_NOW = floor(network(ii,3)/WEEK_SECONDS) + 1;
    if(WEEK_NOW > WEEK_NUM)
        disp('end of weeks')
        break
    end
    data_node_i = network(ii,1);
    Atensor_row_num = find(NUM_UNIQUE(:) == data_node_i)...
        + (WEEK_NOW-1)*UNIQUE_LENGTH;
    data_node_j = network(ii,2);
    Atensor_col_num = find(NUM_UNIQUE(:) == data_node_j);
    %Atensor is sparse
    Atensor(Atensor_row_num,Atensor_col_num) = 1;
end
Here `UNIQUE_LENGTH = 223482` and `size(network,1)=273209`. I ran the profiler for a few minutes, which was not enough time for the program to finish, but enough to reach a steady state in which the ratio of times would not change much. `Atensor_row_num = find(NUM_UNI..` is 45.6% and `Atensor_col_num = find(NUM_UNI...` is 43.4%. The line with `Atensor(Atensor_row_num,Atenso...`, which assigns values to the `sparse` matrix, is only 8.9%. The `NUM_UNIQUE` vector is quite long, so `find` is an important aspect of the code; even more important than the sparse matrix manipulation. Any improvement here would be significant. I also don't know whether there is a more efficient way to structure this algorithm, rather than taking the straightforward approach of replacing `find`.
`find` is indeed unavoidable under certain circumstances. For example, if you want to loop over indices, i.e.
idx = find(someCondition);
for i = idx(:)'
doSomething
end
or if you want to do multi-level indexing
A = [1:4,NaN,6:10];
goodA = find(isfinite(A));
everyOtherGoodEntry = A(goodA(1:2:end));
or if you want the first n good values
A = A(find(isfinite(A),n,'first'));
In your case, you may be able to avoid the call to `find` by using the additional outputs of `unique`
[uniqueElements,indexIntoA,indexIntoUniqueElements] = unique(A);
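As a minimal sketch of what the third output provides, here it is applied to the `tt` example from the question. The variable name `posInUNI` is made up for illustration; the `~` placeholder assumes a MATLAB release that supports ignoring outputs.

```matlab
tt = [3 7 1 7 1 7];                 % example values from the question
[ttUNI, ~, posInUNI] = unique(tt);  % third output: position of each tt(k) within ttUNI
posInUNI = posInUNI(:)';            % force a row; the orientation differs between releases
disp(ttUNI)                         % 1 3 7
disp(posInUNI)                      % 2 3 1 3 1 3
disp(isequal(ttUNI(posInUNI), tt))  % 1, i.e. indexing ttUNI with posInUNI reproduces tt
```

So `posInUNI(1)` is the same `2` that `find(ttUNI(:) == tt(1))` returns, but all positions come out of a single call.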
Before you try to optimize your code by fixing what you think takes time, I suggest you run the profiler on your code to check what really takes time. Then you can post the code of your actual loop, and we may be able to help.
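For the loop posted in the question's update, the additional outputs of `unique` could remove both `find` calls and the loop itself. The following is only a sketch under stated assumptions, not a tested drop-in replacement: the toy `network` data is hypothetical, the rows are assumed to be sorted by time (so that the loop's `break` is equivalent to masking out late rows), and the names `pos`, `rowPos`, `colPos`, `keep`, and `pairs` are made up for illustration.

```matlab
% Hypothetical toy data: columns are [node_i, node_j, time_in_seconds]
network = [10 20 5; 20 30 700000; 10 30 1300000; 30 10 2000000];
WEEK_SECONDS = 60*60*24*7;                     % number of seconds in a week

% One call to unique gives the position of every node in NUM_UNIQUE
[NUM_UNIQUE, ~, pos] = unique([network(:,1); network(:,2)]);
UNIQUE_LENGTH = numel(NUM_UNIQUE);
nRows  = size(network,1);
rowPos = pos(1:nRows);                         % replaces find(NUM_UNIQUE(:) == data_node_i)
colPos = pos(nRows+1:end);                     % replaces find(NUM_UNIQUE(:) == data_node_j)

WEEK_NUM = floor(max(network(:,3)) / WEEK_SECONDS);
WEEK_NOW = floor(network(:,3) / WEEK_SECONDS) + 1;
keep = WEEK_NOW <= WEEK_NUM;                   % assumption: rows sorted by time

rowIdx = rowPos(keep) + (WEEK_NOW(keep) - 1) * UNIQUE_LENGTH;
colIdx = colPos(keep);

% sparse(i,j,v,m,n) adds up repeated (i,j) pairs, so drop duplicates first
pairs = unique([rowIdx, colIdx], 'rows');
Atensor = sparse(pairs(:,1), pairs(:,2), 1, UNIQUE_LENGTH*WEEK_NUM, UNIQUE_LENGTH);
```

Building the matrix with one `sparse` call also avoids assigning into a sparse matrix element by element, which the profiler reported as the remaining cost.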
If you'd like to find the index of the true values in a logical vector, you can do the following:
>> r = rand(1,5)
r =
0.5323 0.3401 0.4182 0.8411 0.2300
>> logical_val = r < 0.5 % Check whether values are less than 0.5
logical_val =
0 1 1 0 1
>> temp = 1:size(r,2) % Create a vector from 1 to the size of r
temp =
1 2 3 4 5
>> temp(logical_val) % Get the indexes of the true values
ans =
2 3 5
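When the positions are only needed to index into another array, the logical vector can usually be used as the subscript directly, which is what the editor's warning about `find` refers to. A small sketch, assuming a hypothetical 4-by-4 matrix `mat` and the `tt` example from the question:

```matlab
mat   = magic(4);                  % hypothetical matrix, for illustration only
tt    = [3 7 1 7 1 7];
ttUNI = unique(tt);                % 1 3 7
mask  = (ttUNI(:) == tt(1));       % logical column vector [0; 1; 0]

viaFind = mat(find(mask), 4);      % numeric index: row 2, column 4
viaMask = mat(mask, 4);            % the mask used as the row subscript selects the same row
disp(isequal(viaFind, viaMask))    % 1

r = [0.5323 0.3401 0.4182 0.8411 0.2300];
logical_val = r < 0.5;
temp = 1:size(r,2);
disp(isequal(temp(logical_val), find(logical_val)))  % 1, both give 2 3 5
```

The numeric positions are still needed when they enter arithmetic, such as adding a row offset, and in that case `find` or an index vector is the natural choice.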