Python tuple assignment vs. list append

Question

In terms of runtime (Big O) and memory usage, which of the following two snippets is more efficient?

Code 1:

a = []

for item in some_data:
   a.append(item.id)
   # some other code

print(a)

Code 2:

a = tuple()

for item in some_data:
   a += (item.id,)
   # some other code

print(a)

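Here, some_data can contain 1 or n items.

My guess is that Code 2 is more efficient, because it uses less memory and the assignment may just push and pop data on stack memory.

I think Code 1 is less efficient, because a list usually allocates more memory than it needs, and when an append exceeds the allocated memory it has to find a new memory address.

By the way, I am only a beginner in data structures and algorithms, and I don't know how Python manages variables in memory.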

Answer 1

Considering memory usage, I would say the list is better.

In the line

a += (item.id,)

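what you are basically doing is a = a + (item.id,) (I'm taking a shortcut here; there are some small differences).

That involves 4 operations:

Create a tuple => (item.id,)
Merge the 2 tuples => a + (item.id,)
- create a bigger tuple
- insert a into it
- insert (item.id,) into it

Creating a new object (here, a tuple) is the most time-consuming part, and it happens twice per iteration.

On the other hand, appending to a list != creating a new list, so in the list example no new object is created (except for a = []).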
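The difference between mutating a list in place and rebinding a name to a new tuple can be checked directly. The following is a minimal sketch, not part of the original answer; the `ids` list is a hypothetical stand-in for the `item.id` values from the question.

```python
# Hypothetical stand-in for the item.id values in the question.
ids = [101, 102, 103]

# list.append mutates the same list object on every iteration.
a_list = []
list_identity = id(a_list)
for value in ids:
    a_list.append(value)
    assert id(a_list) == list_identity  # still the same object

# += on a tuple rebinds the name to a different tuple object each time.
a_tuple = ()
distinct_tuples = 0
previous = a_tuple
for value in ids:
    a_tuple += (value,)
    if a_tuple is not previous:  # the old tuple is kept alive by `previous`
        distinct_tuples += 1
    previous = a_tuple

print(a_list, a_tuple, distinct_tuples)
```

The list keeps one identity throughout the loop, while the tuple name ends up pointing at a different object after every iteration.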

Considering execution time:

In [1]: some_data = list(range(10000))

In [2]: %%timeit
        a = tuple()

        for item in some_data:
            a += (item,)
Out[2]: 151 ms ± 1.49 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

In [3]: %%timeit
        a = []

        for item in some_data:
            a.append(item)
Out[3]: 406 µs ± 3.39 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

In [4]: %%timeit
        a = [item for item in some_data]
Out[4]: 154 µs ± 392 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

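So the list comprehension is about 1000 times faster than the tuple version (154 µs vs. 151 ms), and even the plain append loop is roughly 370 times faster (406 µs).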
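The question also asked about Big O, and the gap in these timings is consistent with the tuple loop doing quadratic work: every `+=` produces a tuple one element longer than the last, while `list.append` is amortized O(1). The sketch below is not from the original answer. The helper names are made up for illustration, and the count is a simple model of the work (the length of each tuple produced) rather than a measurement of CPython internals.

```python
def tuple_concat_work(n):
    """Build an n-tuple with += and total the sizes of all tuples produced."""
    a = ()
    slots = 0
    for i in range(n):
        a += (i,)        # produces a tuple of length i + 1
        slots += len(a)  # model: work proportional to the new tuple's length
    return a, slots


def list_then_tuple(n):
    """Append to a list, then convert once if a tuple is really needed."""
    a = []
    for i in range(n):
        a.append(i)      # amortized O(1) per append
    return tuple(a)      # a single O(n) copy at the end


result, slots = tuple_concat_work(1000)
print(slots)  # 1 + 2 + ... + 1000 = n * (n + 1) / 2
```

If the final result has to be a tuple, building a list first and calling `tuple()` once keeps the total work linear.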

Answer 2

I wrote a simple script to benchmark time and memory usage.

import time
import functools
from memory_profiler import profile


def timer(func):
    @functools.wraps(func)
    def wrapper_timer(*args, **kwargs):
        start_time = time.perf_counter()

        value = func(*args, **kwargs)

        end_time = time.perf_counter()

        run_time = end_time - start_time

        print(f"Finished {func.__name__!r} in {run_time:.4f} seconds")

        return value

    return wrapper_timer


LOOPS = 100000


@timer
def test_append():
    sample = []
    for i in range(LOOPS):
        sample.append(i)


@timer
def test_tuple():
    sample = tuple()
    for i in range(LOOPS):
        sample += (i, )


@profile(precision=2)
def main():
    test_append()
    test_tuple()


if __name__ == '__main__':
    main()

When LOOPS is 100000:

Finished 'test_append' in 0.0745 seconds
Finished 'test_tuple' in 22.3031 seconds

Line #    Mem usage    Increment   Line Contents
================================================
73    38.00 MiB    38.00 MiB   @profile(precision=2)
74                             def main():
75    38.96 MiB     0.97 MiB       test_append()
76    39.10 MiB     0.13 MiB       test_tuple()

When LOOPS is 1000:

Finished 'test_append' in 0.0007 seconds
Finished 'test_tuple' in 0.0019 seconds

Line #    Mem usage    Increment   Line Contents
================================================
73    38.04 MiB    38.04 MiB   @profile(precision=2)
74                             def main():
75    38.04 MiB     0.00 MiB       test_append()
76    38.04 MiB     0.00 MiB       test_tuple()

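So append is faster than the tuple version, but in this measurement it showed a larger memory increment (0.97 MiB vs. 0.13 MiB when LOOPS is 100000).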
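The memory difference can also be looked at per container with `sys.getsizeof`. This is a minimal sketch, not part of the original answer, assuming CPython: `getsizeof` reports only the container itself (not the integers it refers to), and the value of `n` is arbitrary.

```python
import sys

n = 1000  # arbitrary size chosen for illustration

as_list = []
for i in range(n):
    as_list.append(i)      # the list may reserve spare slots as it grows

as_tuple = tuple(as_list)  # a tuple is sized for exactly n elements

list_bytes = sys.getsizeof(as_list)
tuple_bytes = sys.getsizeof(as_tuple)
print(f"list: {list_bytes} bytes, tuple: {tuple_bytes} bytes")
```

This compares only the finished containers. It does not capture the short-lived intermediate tuples that the `+=` loop creates and discards along the way.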
