How can I vectorize the loops of this function in Octave?
I want to vectorize the for-loops of this function so that I can then parallelize it in Octave. Can these for-loops be vectorized? Thank you very much in advance!
I attach the code of the function, with comments marking the start and end of each for-loop and if-else.
function [par]=pem_v(tsm,pr)
% tsm and pr are arrays of N by n.
% par is an array of N by 8
tss=[27:0.5:32];
tc=[20:0.01:29];
N=size(tsm,1);
% main-loop
for ii=1:N
% I extract the rows in each loop because each one represents a sample
sst=tsm(ii,:); sst=sst'; %then I convert each sample to column vectors
pre=pr(ii,:); pre=pre';
% main-condition
if isnan(nanmean(sst))==1;
par(ii,1:8)=NaN;
else
% first sub-loop
for k=1:length(tss);
idxx=find(sst>=tss(k)-0.25 & sst<=tss(k)+0.25);
out(k)=prctile(pre(idxx),90);
end
% end first sub-loop
tp90=tss(find(max(out)==out));
% second sub-loop
for j=1:length(tc)
cond1=find(sst>=tc(j) & sst<=tp90);
cond2=find(sst>=tp90);
pem=zeros(length(sst),1);
A=[sst(cond1),ones(length(cond1),1)];
B=regress(pre(cond1),A);
pt90=B(1)*(tp90-tc(j));
AA=[(sst(cond2)-tp90)];
BB=regress(pre(cond2)-pt90,AA);
pem(cond1)=max(0,B(1)*(sst(cond1)-tc(j)));
pem(cond2)=max(0,(BB(1)*(sst(cond2)-tp90))+pt90);
clear A B AA BB;
E(j)=sqrt(nansum((pem-pre).^2)/length(pre));
clear pem;
end
% end second sub-loop
tcc=tc(find(E==min(E)));
% sub-condition
if(isempty(tcc)==1);
par(ii,1:8)=NaN;
else
cond1=find(sst>=tcc & sst<=tp90);
cond2=find(sst>=tp90);
pem=zeros(length(sst),1);
A=[sst(cond1),ones(length(cond1),1)];
B=regress(pre(cond1),A);
pt90=B(1)*(tp90-tcc);
AA=[sst(cond2)-tp90];
BB=regress(pre(cond2)-pt90,AA);
pem(cond1)=max(0,B(1)*(sst(cond1)-tcc));
pem(cond2)=max(0,(BB(1)*(sst(cond2)-tp90))+pt90);
RMSE=sqrt(nansum((pem-pre).^2)/length(pre));
% outputs
par(ii,1)=tcc;
par(ii,2)=tp90;
par(ii,3)=B(1);
par(ii,4)=BB(1);
par(ii,5)=RMSE;
par(ii,6)=nanmean(sst);
par(ii,7)=nanmean(pre);
par(ii,8)=nanmean(pem);
end
% end sub-condition
clear pem pre sst RMSE BB B tp90 tcc
end
% end main-condition
end
% end main-loop
You haven't given any example inputs, so I've created some like so:
N = 5; n = 800;
tsm = rand(N,n)*5+27; pr = rand(N,n);
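As a quick sanity check on these inputs, the sketch below confirms that the generated temperatures fall inside the range covered by tss (27 to 32) and adds one all-NaN sample so that the main condition of the function is exercised too. The fixed seed and the NaN row are my own assumptions for illustration; they are not part of the question.

```octave
% Fixed seed (arbitrary, assumed) so that repeated runs use the same data
rand('state', 1);

N = 5; n = 800;
tsm = rand(N, n) * 5 + 27;   % values in (27, 32), the span of tss = 27:0.5:32
pr  = rand(N, n);            % values in (0, 1)

% Assumed extra: make one sample entirely NaN to hit the NaN branch
tsm(2, :) = NaN;

% Rows that the function should fill with NaN
all_nan_rows = all(isnan(tsm), 2);

% Every remaining temperature should lie within the tss range
valid = tsm(~all_nan_rows, :);
in_range = all(valid(:) >= 27 & valid(:) <= 32);

fprintf('all-NaN samples: %d, remaining values in range: %d\n', ...
        nnz(all_nan_rows), in_range);
```

Using the same seeded inputs for both versions of the function makes a timing comparison with tic and toc repeatable.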
Then, before you even consider vectorising your code, you should keep 4 things in mind:
1. Don't recompute the same thing (such as the size of a vector, or a transpose) in every loop iteration; do it once before the loop.
2. Pre-allocate arrays where possible (for example as zeros or NaNs).
3. Don't use find to convert logical indices into linear indices. There is no need, and it will slow down your code.
4. Don't repeatedly use clear, especially many times within loops. It is slow! Instead, use pre-allocation to ensure the variables are as you expect in each loop.

Using the above random inputs, and taking account of these 4 things, the code below is ~65% quicker than your code. Note: this is without even doing any vectorising!
function [par]=pem_v(tsm,pr)
% tsm and pr are arrays of N by n.
% par is an array of N by 8
tss=[27:0.5:32];
tc=[20:0.01:29];
N=size(tsm,1);
% Transpose once here instead of every loop
tsm = tsm';
pr = pr';
% Pre-allocate memory for output 'par'
par = NaN(N, 8);
% Don't compute these every loop, do it before the loop.
% numel simpler than length for vectors, and size is clearer still
ntss = numel(tss);
nsst = size(tsm,1);
ntc = numel(tc);
npr = size(pr, 1);
for ii=1:N
% Extract the columns in each loop because each one represents a sample
sst=tsm(:,ii);
pre=pr(:,ii);
% main-condition. Previously isnan(nanmean(sst))==1, but that's only true if all(isnan(sst))
% We don't need to assign par(ii,1:8)=NaN since we initialised par to a matrix of NaNs
if ~all(isnan(sst));
% first sub-loop, initialise 'out' first
out = zeros(1, ntss);
for k=1:ntss;
% Don't use FIND on an indexing vector. Use the logical index raw, it's quicker
idxx = (sst>=tss(k)-0.25 & sst<=tss(k)+0.25);
% We need a check that some values of idxx are true, otherwise prctile will error.
if nnz(idxx) > 0
out(k) = prctile(pre(idxx), 90);
end
end
% Again, no need for FIND, just reduces speed. This is a theme...
tp90=tss(max(out)==out);
% Pre-allocate the error vector rather than growing it inside the loop
E = zeros(1, ntc);
for jj=1:ntc
cond1 = (sst>=tc(jj) & sst<=tp90);
cond2 = (sst>=tp90);
% Use nnz (number of non-zeros) instead of length, since cond1 is now a logical vector spanning all elements
A = [sst(cond1),ones(nnz(cond1),1)];
B = regress(pre(cond1), A);
pt90 = B(1)*(tp90-tc(jj));
AA = [(sst(cond2)-tp90)];
BB = regress(pre(cond2)-pt90,AA);
pem=zeros(nsst,1);
pem(cond1) = max(0, B(1)*(sst(cond1)-tc(jj)));
pem(cond2) = max(0, (BB(1)*(sst(cond2)-tp90))+pt90);
E(jj) = sqrt(nansum((pem-pre).^2)/npr);
end
tcc = tc(E==min(E));
if ~isempty(tcc);
cond1 = (sst>=tcc & sst<=tp90);
cond2 = (sst>=tp90);
A = [sst(cond1),ones(nnz(cond1),1)];
B = regress(pre(cond1),A);
pt90 = B(1)*(tp90-tcc);
AA = [sst(cond2)-tp90];
BB = regress(pre(cond2)-pt90,AA);
pem = zeros(nsst,1);
pem(cond1) = max(0, B(1)*(sst(cond1)-tcc));
pem(cond2) = max(0, (BB(1)*(sst(cond2)-tp90))+pt90);
RMSE = sqrt(nansum((pem-pre).^2)/npr);
% Outputs, which we might as well assign all at once!
par(ii,:)=[tcc, tp90, B(1), BB(1), RMSE, ...
nanmean(sst), nanmean(pre), nanmean(pem)];
end
end
end
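To illustrate the points about find and pre-allocation in isolation, here is a minimal sketch showing that a logical mask selects exactly the same elements as the linear indices returned by find, and that nnz on the mask takes the place of length on the index list. The data, the seed and the window bounds lo and hi are assumptions chosen only for this example.

```octave
% Illustrative data only; sizes, seed and window bounds are assumptions
rand('state', 1);
sst = rand(800, 1) * 5 + 27;
pre = rand(800, 1);
lo = 28.75; hi = 29.25;

% Linear indices via find, as in the original function
idx_lin = find(sst >= lo & sst <= hi);
% Logical mask used directly, as in the rewritten function
idx_log = (sst >= lo & sst <= hi);

% Both forms pick out the same elements of pre
same_values = isequal(pre(idx_lin), pre(idx_log));
% nnz of the mask equals the number of indices that find returned
same_count = (numel(idx_lin) == nnz(idx_log));

% Pre-allocation: size the result once, then fill it in the loop
tss = 27:0.5:32;
ntss = numel(tss);
counts = zeros(1, ntss);
for k = 1:ntss
  counts(k) = nnz(sst >= tss(k) - 0.25 & sst <= tss(k) + 0.25);
end

fprintf('same values: %d, same count: %d\n', same_values, same_count);
```

Because counts is sized before the loop, there is nothing to clear afterwards: each pass simply overwrites its own element.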