
Appending value to empty list or allocating value to pre-defined array?

In Python, for several applications I normally have to store values in an array, like this:

results = []
for i in range(num_simulations):
    ...<calculate results_new>...
    results.append(results_new)

Yet most of the sample code I have seen declares a zero-valued array first:

results = np.zeros(num_simulations)
for i in range(num_simulations):
    ...<calculate results_new>...
    results[i] = results_new

Which one is better as common practice? And if you compare performance, is there really a significant difference in time and memory between the two methods?

DISCLAIMER: I more or less only use Python for simulations, and hence just want to pick up better practices as I go along.

There are a few things you should know about using numpy arrays:

  1. If you will be working with matrices, they are much faster. In your application, where you are just storing the data, they do not provide any particular benefit.
  2. HOWEVER, initializing them comes with both a cost and a benefit. The benefit of using preallocated space is that you need not worry about running into memory issues partway through. The con is that the overhead of allocating that much memory up front is more significant (see the summary at the end).
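On the memory side of the question, a rough sketch like the following can help. It compares the footprint of a list of Python floats with a `float64` array of the same length. The size `n` is a made-up value, and the list figure assumes CPython, where every float is a separate object that the list points to.

```python
import sys
import numpy as np

n = 100_000  # hypothetical number of simulations

# A list holds pointers to separately allocated Python float objects,
# so count the list itself plus every float it refers to.
as_list = [float(i) for i in range(n)]
list_bytes = sys.getsizeof(as_list) + sum(sys.getsizeof(x) for x in as_list)

# A float64 array stores the raw 8-byte values in one contiguous buffer.
as_array = np.zeros(n)
array_bytes = as_array.nbytes

print(f"list : {list_bytes} bytes")
print(f"array: {array_bytes} bytes")
```

So the array is the more compact container, even though filling it element by element from a Python loop is not the faster option.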

So in your application, if you are just storing the results in a list and not performing any numerical methods, it is fine not to use numpy. In fact, it is more efficient, as seen below:

In [29]: %%timeit
    ...: results=[]
    ...: num_simulations=10000
    ...: for i in range(num_simulations):
    ...:     results.append(i)
    ...: 
1000 loops, best of 3: 984 µs per loop

In [30]: %%timeit
    ...: num_simulations = 10000
    ...: results=np.zeros(num_simulations)
    ...: for i in range(num_simulations):
    ...:     results[i]=i
    ...: 
1000 loops, best of 3: 1.44 ms per loop

In [31]: %%timeit
    ...: results=[]
    ...: num_simulations=100000
    ...: for i in range(num_simulations):
    ...:     results.append(i)
    ...: 
100 loops, best of 3: 10.1 ms per loop

In [32]: %%timeit
    ...: num_simulations = 100000
    ...: results=np.zeros(num_simulations)
    ...: for i in range(num_simulations):
    ...:     results[i]=i
    ...: 
100 loops, best of 3: 15.4 ms per loop

In [33]: %%timeit
    ...: results=[]
    ...: num_simulations=1000000
    ...: for i in range(num_simulations):
    ...:     results.append(i)
    ...: 
10 loops, best of 3: 103 ms per loop

In [34]: %%timeit
    ...: num_simulations = 1000000
    ...: results=np.zeros(num_simulations)
    ...: for i in range(num_simulations):
    ...:     results[i]=i
    ...: 
10 loops, best of 3: 156 ms per loop

Just to sum up the results:

Normal list vs     Numpy
984         vs     1440 microsecond     for 10000 simulations
10.1        vs     15.4 millisecond     for 100000 simulations
103         vs     156 millisecond      for 1000000 simulations

Evidently, using plain lists purely for storage is faster here, as it avoids the overhead of allocating the whole array up front (the per-element cost of assigning into a numpy array from a Python loop probably plays a part as well).
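If you want to rerun the comparison outside IPython, something along these lines should work with the standard `timeit` module. The helper names `fill_list`, `fill_array` and `best_time` are made up for this sketch, and the absolute numbers will depend on your machine and Python version.

```python
import timeit
import numpy as np

def fill_list(n):
    """Append n values to an initially empty list."""
    results = []
    for i in range(n):
        results.append(i)
    return results

def fill_array(n):
    """Write n values into a preallocated float64 array."""
    results = np.zeros(n)
    for i in range(n):
        results[i] = i
    return results

def best_time(func, n, number=5, repeat=3):
    """Best per-call time in seconds over `repeat` runs of `number` calls."""
    runs = timeit.repeat(lambda: func(n), number=number, repeat=repeat)
    return min(runs) / number

for n in (10_000, 100_000):
    t_list = best_time(fill_list, n)
    t_array = best_time(fill_array, n)
    print(f"n={n}: list {t_list * 1e3:.2f} ms, array {t_array * 1e3:.2f} ms")
```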

However, for pretty much any numerical method you will want to perform on a matrix, numpy offers a benefit that far outweighs this cost.
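One way to get both, sketched here under assumptions: collect the results in a list inside the loop, convert once at the end, and then do the numerical work on the array. The call to `rng.normal()` is only a placeholder for whatever computes `results_new`, and the loop count is arbitrary.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # seeded only so the run is repeatable

results = []
for i in range(1000):
    results_new = rng.normal()  # hypothetical stand-in for one simulation
    results.append(results_new)

# Convert once, after the loop, then use vectorized numpy operations.
results = np.asarray(results)

mean = results.mean()
std = results.std(ddof=1)             # sample standard deviation
above_mean = results[results > mean]  # boolean-mask selection, no Python loop

print(mean, std, above_mean.size)
```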
