How to speed up this already-vectorized MATLAB code

Question

I am trying to speed up steps 1-4 in the code below (the rest of the setup will be predetermined for my actual problem).

% Given sizes:
m = 200;
n = 1e8;

% Given vectors:
value_vector = rand(m, 1);
index_vector = randi([0 200], n, 1);

% Objective: Determine the entries of values from the indices in index_vector,
%            which are indices into value_vector

% 0. Preallocate
values = zeros(n, 1);

% 1. Remove "0" indices since these won't have values assigned
nonzero_inds = (index_vector ~= 0);

% 2. Examine only nonzero indices
value_inds = index_vector(nonzero_inds);

% 3. Get the values for these indices
nonzero_values = value_vector(value_inds);

% 4. Assign values to output (0 for those with 0 index)
values(nonzero_inds) = nonzero_values;

Here is my analysis of these parts of the code:

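1. Necessary, because index_vector will contain zeros that need to be filtered out. O(n), since it is just a matter of going through the vector one element at a time and checking (value ~= 0).
2. Should be O(n), to go through index_vector and keep the entries that were nonzero in the previous step.
3. Should be O(n), since we have to look at every nonzero index_vector element, and for each one the access into value_vector is O(1).
4. Should be O(n), to go through each element of nonzero_inds, access the corresponding index of values, access the corresponding element of nonzero_values, and assign it to the values vector.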
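The per-step analysis above can be checked empirically by timing each step on its own. The following is a minimal sketch, not the asker's actual benchmark: it assumes a reduced n of 1e6 so that it finishes quickly, and the timing variables t1 to t4 are names introduced here for illustration.

```matlab
% Hypothetical timing harness for steps 1-4 of the question.
% n is reduced from 1e8 (an assumption made here) so the script runs quickly.
m = 200;
n = 1e6;
value_vector = rand(m, 1);
index_vector = randi([0 200], n, 1);
values = zeros(n, 1);

tic; nonzero_inds = (index_vector ~= 0);        t1 = toc;  % step 1: build mask
tic; value_inds = index_vector(nonzero_inds);   t2 = toc;  % step 2: keep nonzero indices
tic; nonzero_values = value_vector(value_inds); t3 = toc;  % step 3: gather values
tic; values(nonzero_inds) = nonzero_values;     t4 = toc;  % step 4: scatter into output

fprintf('step 1: %.4f s\nstep 2: %.4f s\nstep 3: %.4f s\nstep 4: %.4f s\n', ...
    t1, t2, t3, t4);
```

Single tic/toc measurements are noisy, so repeating the run a few times (or scaling n up and down) gives a better picture of which step dominates and whether each one grows roughly linearly with n.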

The code above takes about 5 seconds to run steps 1-4 on 4 cores at 3.8 GHz. Does anyone have any ideas on how to speed this up? Thanks.

Answer 1

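Wow, I found something really interesting. In the "Related" section I saw a link about indexing with vectors sometimes being inefficient in MATLAB, so I decided to try a for loop. This code ended up being an order of magnitude faster!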

for i = 1:n
    if index_vector(i) > 0
        values(i) = value_vector(index_vector(i));
    end
end

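Edit: Another interesting thing, which is unfortunately bad for my problem. The speed of this solution depends on the number of zeros in index_vector. With index_vector = randi([0 200]); only a small fraction of the values are zero, but if I try index_vector = randi([0 1]), about half of the values will be zero, and then the for loop above is actually an order of magnitude slower. However, using ~= instead of > speeds the loop back up, so that it is on a similar order of magnitude. Very interesting and odd behavior.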
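One way to reproduce this observation is to time both loop variants against the vectorized code for the two index ranges mentioned. This is only a sketch under stated assumptions: n is reduced to 1e6, the names upper_bounds, values_gt, values_ne and values_ref are introduced here, and the timings will depend on the MATLAB release and on whether the code runs as a script or inside a function.

```matlab
% Hypothetical comparison script; n is reduced (an assumption) so it finishes quickly.
m = 200;
n = 1e6;
value_vector = rand(m, 1);
upper_bounds = [200 1];   % randi([0 200]): few zeros; randi([0 1]): about half zeros

for k = 1:numel(upper_bounds)
    index_vector = randi([0 upper_bounds(k)], n, 1);

    % Loop using the > test, as in the answer above
    values_gt = zeros(n, 1);
    tic;
    for i = 1:n
        if index_vector(i) > 0
            values_gt(i) = value_vector(index_vector(i));
        end
    end
    t_gt = toc;

    % Same loop using the ~= test
    values_ne = zeros(n, 1);
    tic;
    for i = 1:n
        if index_vector(i) ~= 0
            values_ne(i) = value_vector(index_vector(i));
        end
    end
    t_ne = toc;

    % Vectorized reference, following steps 1-4 of the question
    values_ref = zeros(n, 1);
    tic;
    nonzero_inds = (index_vector ~= 0);
    values_ref(nonzero_inds) = value_vector(index_vector(nonzero_inds));
    t_vec = toc;

    % All three variants must produce the same output
    assert(isequal(values_gt, values_ref) && isequal(values_ne, values_ref));
    fprintf('max index %3d: loop(>) %.3f s, loop(~=) %.3f s, vectorized %.3f s\n', ...
        upper_bounds(k), t_gt, t_ne, t_vec);
end
```

The assert confirms that the three variants agree, so any difference that shows up is purely one of speed.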

Answer 2

If you insist on using MATLAB and the algorithm flow you want, rather than doing this in Fortran or C, here is a small start:

Change randi to rand, round by casting to uint8, and use the > logical operation, which for some reason is faster on my end.

To sum up:

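value_vector = rand(m, 1);
% uint8() rounds to the nearest integer and saturates negatives to 0, giving 0..200
index_vector = uint8(-0.5 + 201*rand(n, 1));
values = zeros(n, 1);
% Note: as posted, this line replaces the preallocated values with only the
% entries for nonzero indices, so the zero positions are dropped, not kept as 0
values = value_vector(index_vector(index_vector > 0));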

This gave a 1.6x speedup on my end.
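A different way to keep the zero positions in the output, not taken from either answer, is to prepend a 0 to value_vector and shift every index by one, so that steps 1-4 collapse into a single lookup. The sketch below assumes index_vector is stored as double (as randi returns it) and uses a reduced n; padded_values and expected are names introduced here. Whether this is actually faster than the masked version would need to be measured.

```matlab
% Hypothetical alternative: a padded lookup table instead of a logical mask.
m = 200;
n = 1e6;                       % reduced from 1e8 (an assumption) for a quick run
value_vector = rand(m, 1);
index_vector = randi([0 200], n, 1);

% Prepend a 0 so that index 0 maps to padded_values(1), which is 0
padded_values = [0; value_vector];
values = padded_values(index_vector + 1);

% Check against the masked assignment from the question
expected = zeros(n, 1);
nonzero_inds = (index_vector ~= 0);
expected(nonzero_inds) = value_vector(index_vector(nonzero_inds));
assert(isequal(values, expected));
```

The trade-off is one extra element in the lookup table and an addition over the whole index vector, in exchange for skipping the mask construction and the masked gather and scatter.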

Solution 1
1 2020-10-14 13:58:53

Solution 2
0 2020-10-13 23:48:15