How to speed up this already vectorized MATLAB code

Question

I'm trying to speed up steps 1-4 in the following code (the rest is setup that will be predetermined for my actual problem).

% Given sizes:
m = 200;
n = 1e8;

% Given vectors:
value_vector = rand(m, 1);
index_vector = randi([0 200], n, 1);

% Objective: Fill the values vector using the indices in index_vector, which
%            refer to entries of value_vector (index 0 means "no value")

% 0. Preallocate
values = zeros(n, 1);

% 1. Remove "0" indices since these won't have values assigned
nonzero_inds = (index_vector ~= 0);

% 2. Examine only nonzero indices
value_inds = index_vector(nonzero_inds);

% 3. Get the values for these indices
nonzero_values = value_vector(value_inds);

% 4. Assign values to output (0 for those with 0 index)
values(nonzero_inds) = nonzero_values;

Here's my analysis of these portions of the code:

1. Necessary, since index_vector will contain zeros which need to be ferreted out. O(n), since it's just a matter of going through the vector one element at a time and checking (value ~= 0).
2. Should be O(n) to go through index_vector and retain the elements flagged as nonzero in the previous step.
3. Should be O(n), since we have to visit each nonzero index_vector element, and each access into value_vector is O(1).
4. Should be O(n) to go through each element of nonzero_inds, access the corresponding values index, access the corresponding nonzero_values element, and assign it to the values vector.
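To check where the time actually goes, each of the four steps can be timed separately. The following is only a sketch: it assumes a reduced n of 1e6 (instead of 1e8) so that it runs quickly, and the step_times variable is introduced here for illustration.

```matlab
% Per-step timing sketch. n is reduced from 1e8 to 1e6 (an assumption)
% so that the script finishes quickly.
m = 200;
n = 1e6;
value_vector = rand(m, 1);
index_vector = randi([0 200], n, 1);
values = zeros(n, 1);

step_times = zeros(4, 1);   % hypothetical helper: elapsed seconds per step

tic; nonzero_inds = (index_vector ~= 0);         step_times(1) = toc;  % step 1
tic; value_inds = index_vector(nonzero_inds);    step_times(2) = toc;  % step 2
tic; nonzero_values = value_vector(value_inds);  step_times(3) = toc;  % step 3
tic; values(nonzero_inds) = nonzero_values;      step_times(4) = toc;  % step 4

% Print one line per step: the step number followed by its elapsed time
fprintf('step %d: %.4f s\n', [1:4; step_times']);
```

All four steps are linear in n, so any large differences between them would come from memory traffic and the cost of logical indexing rather than from the asymptotic complexity.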

The code above takes about 5 seconds to run through steps 1-4 on 4 cores at 3.8 GHz. Do you all have any ideas on how this could be sped up? Thanks.

Answer 1

Wow, I found something really interesting. I saw a link in the "related" section about indexing vectors sometimes being inefficient in MATLAB, so I decided to try a for loop. This code ended up being an order of magnitude faster!

for i = 1:n
    if index_vector(i) > 0
        values(i) = value_vector(index_vector(i));
    end
end

EDIT: Another interesting thing, although unfortunately detrimental to my problem. The speed of this solution depends on the number of zeros in index_vector. With index_vector = randi([0 200]), a small proportion of the values are zeros, but if I try index_vector = randi([0 1]), approximately half of the values will be zero, and then the above for loop is actually an order of magnitude slower. However, using ~= instead of > speeds the loop back up so that it's on a similar order of magnitude. Very interesting and odd behavior.
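The comparison described in the edit can be reproduced with a small timing script. This is a sketch under stated assumptions: n is reduced to 1e6, and the names upper_bounds, values_gt, values_ne, t_gt and t_ne are introduced here for illustration. The timings will depend on the machine and the MATLAB release, so no particular result is implied.

```matlab
% Time the loop with ">" and with "~=" for few zeros and for about half zeros.
n = 1e6;                      % assumption: smaller than the 1e8 in the question
value_vector = rand(200, 1);
upper_bounds = [200 1];       % randi([0 200]): few zeros, randi([0 1]): about half zeros

for k = 1:numel(upper_bounds)
    ub = upper_bounds(k);
    index_vector = randi([0 ub], n, 1);

    % Loop using the ">" comparison
    values_gt = zeros(n, 1);
    tic;
    for i = 1:n
        if index_vector(i) > 0
            values_gt(i) = value_vector(index_vector(i));
        end
    end
    t_gt = toc;

    % Same loop using the "~=" comparison
    values_ne = zeros(n, 1);
    tic;
    for i = 1:n
        if index_vector(i) ~= 0
            values_ne(i) = value_vector(index_vector(i));
        end
    end
    t_ne = toc;

    fprintf('randi([0 %d]): ">" took %.4f s, "~=" took %.4f s\n', ub, t_gt, t_ne);
end
```

Both loops produce the same output, so any difference in run time comes from how the comparison and the branch are executed, not from the amount of work done.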

Answer 2

If you stick with MATLAB and the flow of the algorithm you want, rather than doing this in Fortran or C, here's a small start:

Change randi to rand, round by casting to uint8, and use the > logical operation, which for some reason is faster on my machine.

To sum up:

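value_vector = rand(m, 1);
% Casting to uint8 rounds to the nearest integer, giving indices in 0..200
index_vector = uint8(-0.5 + 201*rand(n, 1));
values = zeros(n, 1);
% Assign through the mask so that values keeps length n (0 where the index is 0)
nonzero_inds = index_vector > 0;
values(nonzero_inds) = value_vector(index_vector(nonzero_inds));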

This improved the run time on my machine by a factor of 1.6.
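Another way to avoid the logical mask, not taken from either answer, is to prepend a zero to value_vector and shift every index by one, so that index 0 looks up the prepended zero. The following is a minimal sketch assuming n = 1e6 and a hypothetical padded_values variable; whether it is faster than the masked version has not been measured here.

```matlab
% Lookup with a prepended zero entry instead of a logical mask.
m = 200;
n = 1e6;                                  % assumption: smaller than 1e8
value_vector = rand(m, 1);
index_vector = randi([0 m], n, 1);

% padded_values(1) is 0, and padded_values(k + 1) is value_vector(k),
% so index 0 maps to 0 and index k maps to value_vector(k).
padded_values = [0; value_vector];
values = padded_values(index_vector + 1);

% Reference result from the masked approach in the question
expected = zeros(n, 1);
nonzero_inds = (index_vector ~= 0);
expected(nonzero_inds) = value_vector(index_vector(nonzero_inds));
assert(isequal(values, expected));
```

If index_vector is stored as uint8, the + 1 shift stays within range here because the largest index is 200, but a larger m would need a wider integer type.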
