简体   繁体   中英

Vectorizing code - How to reduce MATLAB computational time

I have this piece of code

N=10^4;
for i = 1:N
    [E,X,T] = fffun(); % Stochastic simulation. Returns every time three different vectors (whose length is 10^3).
    X_(i,:)=X;
    T_(i,:)=T;
    GRID=[GRID T];
end
GRID=unique(GRID);
% Second part
for i=1:N
for j=1:(kmax)
    f=find(GRID==T_(i,j) | GRID==T_(i,j+1));
    s=f(1);
    e=f(2)-1;

 counter(X_(i,j), s:e)=counter(X_(i,j), s:e)+1;
end
end

The code performs N different simulations of a stochastic process (which consists of 10^3 events, occurring at discrete moments (T vector) that depends on the specific simulation. Now (second part) I want to know, as a function of time istant, how many simulations are in a particular state (X assumes value between 1 and 10). The idea I had: create a grid vector with all the moments at which something happens in any simulation. Then, looping over the simulations, loop over the timesteps in which something happens and incrementing all the counter indeces that corresponds to this particular slice of time.

However this second part is very heavy (I mean days of processing on a standard quad-core CPU). And it shouldn't. Are there any ideas (maybe about comparing vectors in a more efficient way) to cut the CPU time?

This is a standalone 'second_part'

N=5000;
counter=zeros(11,length(GRID));

for i=1:N
    disp(['Counting sim #' num2str(i)]);
    for j=1:(kmax)
        f=find(GRID==T_(i,j) | GRID==T_(i,j+1),2);
        s=f(1);
        e=f(2)-1;

        counter(X_(i,j), s:e)=counter(X_(i,j), s:e)+1;

    end
end

counter=counter/N;
stop=find(GRID==Tmin);
stop=stop-1;
plot(counter(:,(stop-500):stop)')

with associated dummy data ( filedropper.com/data_38 ). In the real context the matrix has 2x rows and 10x columns.

Here is what I understand:

T_ is a matrix of time steps from N simulations.
X_ is a matrix of simulation state at T_ in those simulations.

so if you do:

[ut,~,ic]= unique(T_(:));

you get ic which is a vector of indices for all unique elements in T_ . Then you can write:

counter = accumarray([ic X_(:)],1);

and get counter with no. of rows as your unique timesteps, and no. of columns as the unique states in X_ (which are all, and must be, integers). Now you can say that for each timestep ut(k) the number of time that the simulation was in state m is counter(k,m) .

In your data, the only combination of m and k that has a value greater than 1 is (1,1) .


Edit:

From the comments below, I understand that you record all state changes, and the time steps when they occur. Then every time a simulation change a state you want to collect all the states from all simulations and count how many states are from each type.

The main problem here is that your time is continuous, so basically each element in T_ is unique , and you have over a million time steps to loop over. Fully vectorizing such a process will need about 80GB of memory which will probably stuck your computer.

So I looked for a combination of vectorizing and looping through the time steps. We start by finding all unique intervals, and preallocating counter :

ut = unique(T_(:));
stt = 11; % no. of states
counter = zeros(stt,numel(ut));r = 1:size(T_,1);
r = 1:size(T_,1); % we will need that also later

Then we loop over all element in ut , and each time look for the relevant timestep in T_ in all simulations in a vectorized way. And finally we use histcounts to count all the states:

for k = 1:numel(ut)
    temp = T_<=ut(k); % mark all time steps before ut(k)
    s = cumsum(temp,2); % count the columns
    col_ind = s(:,end); % fins the column index for each simulation
    % convert the coulmns to linear indices:
    linind = sub2ind(size(T_),r,col_ind.');
    % count the states:
    counter(:,k) = histcounts(X_(linind),1:stt+1);
end

This takes about 4 seconds at my computer for 1000 simulations, so it adds to a little more than one hour for the whole process. Not very quick...

You can try also one or two of the tweaks below to squeeze run time a little bit more:

  1. As you can read here , accumarray seems to work faster in small arrays then histcouns . So may want to switch to it.

  2. Also, computing linear indices directly is a quicker method than sub2ind , so you may want to try that.

implementing these suggestions in the loop above, we get:

R = size(T_,1);
r = (1:R).';
for k = 1:K
    temp = T_<=ut(k); % mark all time steps before ut(k)
    s = cumsum(temp,2); % count the columns
    col_ind = s(:,end); % fins the column index for each simulation
    % convert the coulmns to linear indices:
    linind = R*(col_ind-1)+r;
    % count the states:
    counter(:,k) = accumarray(X_(linind),1,[stt 1]);
end

In my computer switching to accumarray and or removing sub2ind gain a slight improvement but it was not consistent (using timeit for testing on 100 or 1K elements in ut ), so you better test it yourself. However, this still remains very long.


One thing that you may want to consider is trying to discretize your timesteps, so you will have much less unique elements to loop over. In your data about 8% of the time intervals a smaller than 1. If you can assume that this is short enough to be treated as one time step, then you could round your T_ and get only ~12.5K unique elements, which take about a minute to loop over. You can do the same for 0.1 intervals (which are less than 1% of the time intervals), and get 122K elements to loop over, what will take about 8 hours...

Of course, all the timing above are rough estimates using the same algorithm. If you do choose to round the times there may be even better ways to solve this.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM