Matlab: speeding up a loop applied to 820,000 elements

Question

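I have a set of rainfall data with one value every 15 minutes over many years, which gives 820,000 rows. The purpose of my code (eventually) is to create columns that categorise the data, which can then be used to extract the relevant chunks of data for further analysis.

I am new to Matlab, so any help is much appreciated!

The first steps are fast enough. However, some of the steps are very slow.

I have tried preallocating the arrays and using the smallest intX type (8 or 16, depending on the case), but the other steps are so slow that they never finish.

The slow ones are for loops, but I don't know whether they can be vectorised, split into chunks, or handled in some other way to speed them up.

I have a variable "rain" which contains one value for every time step/row. I have created a variable called "state" which is 0 if there is no rain and 1 if there is rain. There is also a variable called "begin" which is 1 if the row is the first row of a storm and 0 otherwise.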
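For readers who want to reproduce the setup, here is a minimal sketch of how "state" and "begin" could be built from "rain". The question does not show how they were actually created, so this is an assumption: rain is taken to be a column vector with 0 for dry steps, and a storm is taken to begin whenever a wet row follows a dry row (or the very first row is wet). The sample values are made up.

```matlab
% Hypothetical sample data: one rainfall value per 15-minute step.
rain = [0; 0; 1.2; 0.4; 0; 0; 0.8; 0; 2.0; 1.1];

% state: 1 where it is raining, 0 otherwise.
state = double(rain > 0);

% begin: 1 on the first wet row of each storm (a wet row preceded by a
% dry row, or a wet first row), 0 otherwise. Prepending a 0 makes diff
% return one value per row.
begin = double(diff([0; state]) == 1);
```

With this sample, sum(begin) gives the number of storms in the record.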

The first slow loop creates the "spell" variable, which assigns a number to each storm.

% Generate blank column for spell of size (rain) - preallocate
    spell = zeros(numel(state),1,'int16');

% Start row for analysis
    x=1;

% Populate "spell" variable with a storm number in each row of rain, for the storm number it belongs to (storm number calculated by adding up the number of "begin" values up to that point)

    for i=1:numel(state)
         if(state(x)==1)
             spell(x) =  sum(begin(1:x));
         end
       x=x+1;
    end

The next stage is the duration of each storm. The first step is fast enough.

 % List of storm numbers

     spellnum = unique(spell);

 % Length of each spell
     spelllength = histc(spell,spellnum);

The final step below (the for loops) is too slow and just crashes.

 % Generate blank column for length

      length = zeros(numel(state),1,'int16');

 % Starting row

      x = 1;

 % For loop to output the total length of the storm for each row of rain within that storm

     for i=1:numel(state)

          for j=1:numel(state)
                 position = find(spell==x);

                      for k=1:numel(state)
                          length(position) = spelllength(x+1);
                      end
          end

       x=x+1;

      end

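Is it possible to make this more efficient?

Apologies if examples already exist; I'm not sure what this process would be called! Many thanks in advance.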

Answer 1

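Well, some allocation/reallocation tips:

1. try to create the result directly from an expression (eventually trimming another, more general result);
2. if 1. is not possible, try to preallocate as much as possible (when the result has an upper bound);
3. if 2. is not possible, try to grow cell arrays rather than massive matrices (because a matrix requires a contiguous memory area).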
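The three options above can be sketched roughly as follows. This is only an illustration with made-up variable names (squares_direct, squares_prealloc, pieces), not code from the question.

```matlab
n = 1000;

% 1. Create the result directly from an expression.
squares_direct = (1:n).^2;

% 2. Preallocate to the known size, then fill in place.
squares_prealloc = zeros(1, n);
for k = 1:n
    squares_prealloc(k) = k^2;
end

% 3. When the final size is not known upfront, grow a cell array and
%    concatenate once at the end.
pieces = {};
for k = 1:n
    if mod(k, 2) == 0                 % keep only even k
        pieces{end+1} = k^2;          %#ok<AGROW>
    end
end
even_squares = [pieces{:}];
```

Option 1 is usually the shortest to write as well as the fastest, which is why the blocks below start from cumsum rather than from a loop.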

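Type-choosing tips:

- try to always use double for intermediate results, because it is the basic numeric data type in MATLAB; avoid converting back and forth;
- use other types for intermediate results only when there is a memory constraint that can be alleviated by using a smaller type.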
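One concrete reason to be careful with small integer types: MATLAB integer classes saturate at their limits instead of raising an error. The sketch below uses made-up values to show what would happen to a storm number above 32767 if it were stored in an int16 column like the one preallocated in the question.

```matlab
% Integer arithmetic saturates at the class limit.
a = int16(32767) + 1;                 % stays at intmax('int16')

% Assigning into a preallocated int16 array converts silently,
% so values above 32767 are clipped.
spell_int = zeros(3, 1, 'int16');
spell_int(:) = [100; 32767; 40000];

% The same values stored as double are kept exactly.
spell_dbl = [100; 32767; 40000];
```

Whether this matters for the rainfall data depends on how many storms there are, which the question does not say.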

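Vectorization tips:

- the fastest vectorization uses matrix-wise or element-wise basic algebraic operations combined with logical indexing;
- loops have not been that bad since MATLAB R2008;
- the worst-performing element-wise processing functions are arrayfun, cellfun and structfun with anonymous functions, because anonymous function evaluation is the slowest;
- try not to compute the same thing twice, even if that gets you a better vectorization.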
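As a small illustration of the first tip, here is a loop and its logical-indexing equivalent. The rain values and the 0.3 threshold are made up for the example.

```matlab
rain = [0; 0.2; 1.5; 0; 3.1; 0.4];    % made-up values

% Loop version: total of the values above a threshold.
total_loop = 0;
for k = 1:numel(rain)
    if rain(k) > 0.3
        total_loop = total_loop + rain(k);
    end
end

% Vectorized version: a logical mask selects the same elements.
heavy = rain > 0.3;
total_vec = sum(rain(heavy));
```

The mask heavy is computed once and can be reused for other columns, which also covers the last tip about not computing the same thing twice.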

First block:

% Just calculate the entire cumulative sum over begin, then
% trim the result. Check if the cumsum doesn't overflow.
spell           = cumsum(begin);
spell(state==0) = 0;
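To check that this really matches the loop from the question, here is a small comparison on made-up data (column vectors assumed). The loop is slow because sum(begin(1:x)) rescans the vector from the start on every row, so the work grows with the square of the number of rows; cumsum makes a single pass.

```matlab
state = [0 1 1 0 0 1 0 1 1 1].';
begin = [0 1 0 0 0 1 0 1 0 0].';

% Loop version from the question (kept only for comparison).
spell_loop = zeros(numel(state), 1);
for x = 1:numel(state)
    if state(x) == 1
        spell_loop(x) = sum(begin(1:x));
    end
end

% Vectorized version: running count of storm starts, zeroed on dry rows.
spell = cumsum(begin);
spell(state == 0) = 0;

same = isequal(spell, spell_loop);
```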

Second block:

% The same, not sure how could you speed this up; changed
% the name of variables to my taste, though.
spell_num    = unique(spell);
spell_length = histc(spell,spell_num);
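As a side note, current MathWorks documentation lists histc as not recommended. One possible alternative for counting rows per storm number is accumarray, sketched below on made-up data. It assumes spell holds non-negative integers with 0 for dry rows; the name counts is just for illustration.

```matlab
spell = [0 1 1 0 0 2 0 3 3 3].';

% Shift by one so that the dry-row label 0 maps to index 1.
% counts(k+1) is the number of rows with spell == k.
counts = accumarray(double(spell) + 1, 1);

% Storm numbers matching the entries of counts.
spell_num = (0:max(spell)).';
```

Unlike unique, this lists every number from 0 to max(spell), so a storm number that never occurs would show up with a count of 0.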

Third block:

% Fix the following issues:
%   - the innermost "for" does not make sense because it rewrites
%     the same thing several times;
%   - the same looping variable "i" is re-used in three nested loops;
%   - the name of the standard function "length" is obscured by declaring
%     a variable named "length".
% spell_length(x) pairs with spell_num(x), so no +1 offset is needed.
storm_length = zeros(size(spell));
for x = 1:numel(spell_num)
        storm_selector = (spell==spell_num(x));
        storm_length(storm_selector) = spell_length(x);
end
% Dry rows (spell == 0) belong to no storm.
storm_length(spell==0) = 0;
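If the remaining loop is still too slow, the per-row lookup can probably be done without any loop by using the third output of unique, which gives the position of each row's value within spell_num. This is a sketch on made-up data, not something taken from the thread; it assumes spell is a column vector with 0 for dry rows.

```matlab
spell = [0 1 1 0 0 2 0 3 3 3].';

% idx(k) is the position of spell(k) within spell_num.
[spell_num, ~, idx] = unique(spell);

% Number of rows for each distinct spell value.
spell_length = accumarray(idx, 1);

% Look up each row's storm length in a single indexing step.
storm_length = spell_length(idx);
storm_length(spell == 0) = 0;         % dry rows belong to no storm
```

Each row is touched a fixed number of times here, whereas the loop rescans the whole spell column once per storm.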

Answer 2

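The code I ended up using is a mix of @CST_Link's and @Sifu's. Thank you very much for your help! I don't think Stackoverflow lets me accept two answers, so for clarity, here is the code everyone helped me create, all in one place!

The only slow part is the for loop in the third block, but it still runs in a few minutes, which is good enough for me and much better than my attempt.

First block: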

%% Spell
%spell is cumulative sum of begin

spell = cumsum(begin);

%% Replace all rows of spell with no rain with 0
%% (trailing semicolon stops the whole column being printed)
spell(state==0)=0;

Second block (unchanged apart from better variable names):

%%  Spell number = all values of spell

spell_num = unique(spell);

%% Spell length = how many of each value of spell
spell_length = histc(spell,spell_num);

Third block:

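%% Generate blank column for spell length of size (state)
 spell_length2 = zeros(length(state),1);

%% Loop over the storm numbers only. spell_length(1) counts the dry rows
%% (spell == 0, assuming there is at least one), so storm x is at x+1.
%% Looping to length(state) would index past the end of spell_length.
for x=1:max(spell)
    position = find(spell==x);
    spell_length2(position) = spell_length(x+1);
end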

Answer 3

For the first part, if I follow what you are doing:
I created some data matching your description for testing. Please tell me if I missed something.

state=[ 1 0 0 0 0 1 1 1 1 1 0 1 0 0 1 0 1 1 1 1 0];
begin=[ 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 1 0 0 0 0];
spell = zeros(length(state),1,'int16');

% Start row for analysis
x=1;

% Populate "spell" variable with a storm number in each row of rain, for the storm number it belongs to (storm number calculated by adding up the number of "begin" values up to that point)

for i=1:length(state)
    if(state(x)==1)
        spell(x) = sum(begin(1:x));
    end
    x=x+1;
end

% can be accomplished by simply using cumsum (no need for extra variables if you are short on memory)

spell2=cumsum(begin);
spell3=spell2.*(state==1);
The outputs of spell and spell3 are as follows:

[spell.'; spell3]

 0      0      0      0      0      1      1      1      1      1      0      2      0      0      2      0      3      3      3      3      0
 0      0      0      0      0      1      1      1      1      1      0      2      0      0      2      0      3      3      3      3      0

Answer 4

Why not do it like this?

% For loop to output the total length of the storm for each row of rain within that storm
% (loops over storm numbers only; going up to numel(state) would index past the end of spelllength)

for x=1:max(spell)
    position = find(spell==x);
    length(position) = spelllength(x+1);
end

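I replaced the i iterator with x, which removes 2 lines and some computation.
Then I removed the two nested loops, because they were useless (every iteration outputs the same thing).
That is already a good start.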

Answer 1: 1
Answer 2: 1, accepted, 2014-07-23 19:04:52
Answer 3: 0, 2014-07-23 15:26:14
Answer 4: 0, 2014-07-23 15:43:48