Making a FOR loop faster with numpy array multiplication and "append"

Question

This is my current code.

It works fine when the number of elements in the second "for loop" is low (around 10k), taking only a few seconds, but when the number of elements in the second "for loop" is high (around 40k) it takes around 60 seconds or more: why?

For example: sometimes when the 2nd for loop has 28k elements it takes less time to execute than when it has 7k elements. I don't understand why the time isn't linearly dependent on the number of operations.

Also, as a general rule, the longer the code runs, the bigger the loop times become.

To recap, the execution times usually follow these rules:

when operations < 10k, time < 5 seconds
when 10k < operations < 40k, 10s < time < 30s (seems random)
when operations > 40k, time ~ 60 seconds

import numpy as np
import time
import gc
from collections import deque
import random

center_freq = 20000000 
smpl_time = 0.03749995312*pow(10,-6) 

mat_contents = []

for i in range(10):
    mat_contents.append(np.ones((3600, random.randint(3000,30000))))

tempo = np.empty([0,0])

for i in range(3600):
    tempo = np.append(tempo, center_freq*smpl_time*i)

ILO = np.cos(tempo) 

check = 0

for element in mat_contents:
    start_time = time.time()
    Icolumn = np.empty([0,0])
    Imatrix = deque([])
    gc.disable()

    for colonna in element.T:
        Imatrix.append(np.multiply(ILO, colonna))

    gc.enable()
    varI = np.var(Imatrix)
    tempImean = np.mean(Imatrix)

    print('\nSize: ', len(element.T))
    print("--- %s seconds ---" % (time.time() - start_time))
    print("---  ", check, ' ---')
    check += 1

Answer 1

Your code, as is, killed my session, presumably because the arrays were too big for memory. Trying again with smaller numbers:

In [1]: from collections import deque
   ...: import random
   ...: 
   ...: center_freq = 20000000
   ...: smpl_time = 0.03749995312*pow(10,-6)
   ...: 
   ...: mat_contents = []
   ...: 
   ...: for i in range(10):
   ...:     mat_contents.append(np.ones((36, random.randint(30,300))))
   ...: 
   ...: tempo = np.empty([0,0])
   ...: 
   ...: for i in range(3600):
   ...:     tempo = np.append(tempo, center_freq*smpl_time*i)
   ...: 
   ...: ILO = np.cos(tempo)
   ...: 
   ...: check = 0
In [2]: tempo.shape
Out[2]: (3600,)
In [3]: ILO
Out[3]: 
array([ 1.        ,  0.73168951,  0.07073907, ..., -0.63602366,
       -0.99137119, -0.81472814])
In [4]: len(mat_contents)
Out[4]: 10
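
As a rough check on the memory guess above, here is a back-of-the-envelope estimate of the array sizes in the original code. It assumes np.ones produces float64 (its default dtype); the variable names are made up for this sketch and do not appear in the question.

```python
import numpy as np

rows = 3600
cols_min, cols_max = 3000, 30000           # bounds passed to random.randint in the question
itemsize = np.dtype(np.float64).itemsize   # 8 bytes per element for the default dtype

bytes_min = rows * cols_min * itemsize     # smallest possible array
bytes_max = rows * cols_max * itemsize     # largest possible array

print(f"one array: {bytes_min / 1e6:.1f} MB to {bytes_max / 1e6:.1f} MB")
print(f"ten arrays, worst case: {10 * bytes_max / 1e9:.2f} GB")
```

On top of the input arrays, the loop collects a same-sized set of products in Imatrix, and np.var and np.mean each have to convert that collection into one more full array. If that exceeds the available RAM, swapping could plausibly contribute to the erratic timings, but that is only a guess.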

A quick note: while list append in your mat_contents loop is relatively fast, the np.append in the next loop is slow, because it copies the whole array on every call. Don't use it that way.
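
One possible way to build tempo without np.append is a single np.arange expression. This is a sketch under the assumption that tempo is only ever meant to be the sequence center_freq*smpl_time*i for i = 0..3599; the name tempo_loop is introduced here just to compare against the loop from the question.

```python
import numpy as np

center_freq = 20000000
smpl_time = 0.03749995312 * pow(10, -6)
n = 3600

# Loop version from the question, kept only for comparison.
tempo_loop = np.empty([0, 0])
for i in range(n):
    tempo_loop = np.append(tempo_loop, center_freq * smpl_time * i)

# Vectorized version: one allocation, no Python-level loop.
tempo = center_freq * smpl_time * np.arange(n)
ILO = np.cos(tempo)

print(tempo.shape)
print(np.allclose(tempo, tempo_loop))
```

If values really do have to be collected one at a time, appending to a plain Python list and calling np.array once at the end avoids the repeated copying.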

I don't know if I have time now to look at the next loop, BUT, why are you using a deque there? Why not a list? I haven't worked with a deque much, but I don't think it has any advantages over a list when you only append on the right side.

Correcting ILO for the reduced shape:

In [16]: tempo = np.empty([0,0])
    ...: 
    ...: for i in range(36):
    ...:     tempo = np.append(tempo, center_freq*smpl_time*i)
    ...: 
In [17]: tempo.shape
Out[17]: (36,)
In [18]: ILO = np.cos(tempo)

I suspect that loop can be replaced by one numpy expression, but I won't dig into that yet.

In [19]: %%timeit
    ...: for element in mat_contents:
    ...:     Icolumn = np.empty([0,0])
    ...:     Imatrix = deque([])
    ...:     for colonna in element.T:
    ...:         Imatrix.append(np.multiply(ILO, colonna))
    ...:     varI = np.var(Imatrix)
    ...:     tempImean = np.mean(Imatrix)
    ...: 
5.82 ms ± 6 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

Time with Imatrix = [] is the same, so using deque doesn't hurt.

mat_contents contains arrays with varying shape (2nd dimension):

In [22]: [x.shape for x in mat_contents]
Out[22]: 
[(36, 203),
 (36, 82),
 (36, 194),
 (36, 174),
 (36, 119),
 (36, 190),
 (36, 272),
 (36, 68),
 (36, 293),
 (36, 248)]

Now let's examine the slow loop that you are worried about:

In [23]: ILO.shape
Out[23]: (36,)
In [24]: element=mat_contents[0]
In [25]: element.T.shape
Out[25]: (203, 36)
In [26]: Imatrix =[]
    ...: for colonna in element.T:
    ...:         Imatrix.append(np.multiply(ILO, colonna))
    ...: arr = np.array(Imatrix)
In [27]: arr.shape
Out[27]: (203, 36)

To multiply a (36,) array and a (203,36) array we don't need to loop; broadcasting handles it.

In [28]: np.allclose(ILO*element.T, arr)
Out[28]: True
In [29]: %%timeit
    ...: Imatrix =[]
    ...: for colonna in element.T:
    ...:         Imatrix.append(np.multiply(ILO, colonna))
    ...: arr = np.array(Imatrix)
    ...: 
    ...: 
405 µs ± 2.72 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

The whole-array operation is much faster:

In [30]: timeit ILO*element.T
19.4 µs ± 14.3 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
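
Putting the pieces together, the outer loop from the question could be sketched roughly as follows. The helper names loop_stats and broadcast_stats are hypothetical, and the test data uses random values at the reduced sizes rather than np.ones, an assumption made so that the comparison is not trivially true.

```python
import random
import numpy as np

def loop_stats(ILO, element):
    """Column-by-column version, as in the question (with a plain list)."""
    Imatrix = []
    for colonna in element.T:
        Imatrix.append(np.multiply(ILO, colonna))
    arr = np.array(Imatrix)
    return arr.var(), arr.mean()

def broadcast_stats(ILO, element):
    """Whole-array version: (36,) * (n, 36) broadcasts along the last axis."""
    product = ILO * element.T
    return product.var(), product.mean()

rng = np.random.default_rng(0)
ILO = np.cos(20000000 * 0.03749995312 * pow(10, -6) * np.arange(36))
mat_contents = [rng.random((36, random.randint(30, 300))) for _ in range(10)]

for element in mat_contents:
    varI, tempImean = broadcast_stats(ILO, element)
    # Same numbers as the explicit loop, without the per-column Python overhead.
    assert np.allclose((varI, tempImean), loop_stats(ILO, element))

print("loop and broadcast versions agree")
```

The broadcast version still allocates one full-size product array per element, so at the original sizes it reduces the Python overhead but not the peak memory of that intermediate result.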

In your original question you loaded a .mat file. Are you trying to translate MATLAB code? numpy is like old-time MATLAB, where whole-matrix operations are much faster. numpy does not do jit compilation of loops.

Answer 1: accepted, 2022-05-08 19:46:54