Which is faster? Checking whether something is in a Python list: membership vs. non-membership

Question

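This may be a newbie question, or obvious to people who know more computer science than I do. Maybe that is why I couldn't find anything on Google or SO after searching. Maybe I'm not using the right vocabulary.

The title says it all. If I know that x is in my_list most of the time, which of the following is faster?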

if x in my_list:
    func1(x)
else:
    func2(x)

or

if x not in my_list:
    func2(x)
else:
    func1(x)

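Does the size of the list matter? For example, 10 elements versus 10,000 elements? In my particular case, my_list consists of strings and integers, but does anyone know whether other considerations apply to more complex types such as dicts?

Thanks.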

Answer 1

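Checking whether an element is in a list (x in my_list) and checking whether it is not in the list (x not in my_list) invoke the same underlying operation, so there should be no difference.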
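One way to see that both forms go through the same membership test is to count calls to `__contains__`. This is only a sketch: `CountingList` is a hypothetical helper written for this illustration, not something from the standard library.

```python
class CountingList(list):
    """A list that records how many times its membership test runs."""

    def __init__(self, *args):
        super().__init__(*args)
        self.contains_calls = 0

    def __contains__(self, item):
        self.contains_calls += 1
        return super().__contains__(item)


my_list = CountingList(["a", 1, "b", 2])

found = "b" in my_list         # runs the membership test
missing = "b" not in my_list   # runs the same test, then negates the result

print(found, missing, my_list.contains_calls)
```

Each expression triggers exactly one call, so the only extra work in `not in` is negating a boolean.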

Does the size of the list matter?

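Checking whether an element is in a list is an O(N) operation, which means the size does matter: the time grows roughly in proportion to the length of the list.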

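If you need to do a lot of checks, you may want to look at set. Checking whether an element is in a set is O(1) on average, which means the check time does not change much as the set grows.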
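A rough way to compare the two is to time a lookup for a value that is absent, which is the worst case for the list. The size `n` and the repetition count below are arbitrary choices for this sketch, and the absolute numbers will vary by machine.

```python
import timeit

n = 100_000
as_list = list(range(n))
as_set = set(as_list)
missing = -1  # absent value: the list has to compare against every element

list_time = timeit.timeit(lambda: missing in as_list, number=100)
set_time = timeit.timeit(lambda: missing in as_set, number=100)

print(f"list: {list_time:.6f} s, set: {set_time:.6f} s for 100 lookups")
```

Building the set costs O(N) once, so it pays off only when the same collection is checked many times.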

Answer 2

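There should be no noticeable performance difference. You are better off writing whichever one makes your code more readable. Either one is O(n), and the actual time depends mainly on where the element sits in the list. You should also avoid premature optimization: it does not matter for most use cases, and when it does, you are usually better off using a different data structure.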
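The dependence on position can be made visible by counting equality comparisons. In this sketch, `CountingNeedle` and `comparisons_needed` are hypothetical helpers introduced only for the illustration.

```python
class CountingNeedle:
    """Wraps a value and counts the equality comparisons it takes part in."""

    def __init__(self, value):
        self.value = value
        self.comparisons = 0

    def __eq__(self, other):
        self.comparisons += 1
        return self.value == other


def comparisons_needed(value, items):
    """Return (found, number of comparisons) for a list membership test."""
    needle = CountingNeedle(value)
    found = needle in items
    return found, needle.comparisons


items = list(range(1000))
print(comparisons_needed(0, items))    # match at the front
print(comparisons_needed(999, items))  # match at the end
print(comparisons_needed(-1, items))   # no match at all
```

The scan stops at the first match, so an element near the front is found after very few comparisons, while a late or missing element costs a full pass.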

If you want faster lookups, use dicts, which have O(1) average-case lookup complexity. For details, see https://wiki.python.org/moin/TimeComplexity .
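As a small illustration of a dict-based lookup, the sketch below maps each value to the index of its first occurrence, so one lookup answers both "is it there?" and "where?". The sample data and the name `first_index` are made up for the example.

```python
my_list = ["spam", 42, "eggs", 7, "spam"]

# Map each value to the index of its first occurrence.
first_index = {}
for i, value in enumerate(my_list):
    first_index.setdefault(value, i)

print("eggs" in first_index)       # membership test on the keys
print(first_index.get("spam"))     # index of the first "spam"
print(first_index.get("missing"))  # None when the key is absent
```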

Answer 3

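Python includes a module and function, timeit, which can tell you how long a snippet of code takes to execute. It is most convenient with a single statement, so rather than timing a compound statement such as an if/else directly, we can wrap the statements in a function and time the function call.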

Even easier than calling timeit.timeit() is to use a Jupyter notebook and put the %timeit magic at the start of a line.

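The session below shows that, whether the list is long or short and whether the lookup succeeds or fails, the two forms you ask about (checking in alist or not in alist) give the same timings, within the variability of the measurement.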

import random

# set a seed so results will be repeatable
random.seed(456789)

# a 10K long list of junk with no value greater than 100
my_list = [random.randint(-100, 100) for i in range(10000)]

def func1(x):
    # included just so we get a function call
    return True

def func2(x):
    # included just so we get a function call
    return False

def way1(x):
    if x in my_list:
        result = func1(x)
    else:
        result = func2(x)
    return result

def way2(x):
    if x not in my_list:
        result = func2(x)
    else:
        result = func1(x)
    return result

%timeit way1(101) # failure with large list

The slowest run took 8.29 times longer than the fastest. This could mean that an intermediate result is being cached.
1000 loops, best of 3: 207 µs per loop

%timeit way1(0) # success with large list

The slowest run took 7.34 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 4.04 µs per loop

my_list.index(0)

186

%timeit way2(101) # failure with large list

The slowest run took 12.44 times longer than the fastest. This could mean that an intermediate result is being cached.
1000 loops, best of 3: 208 µs per loop

%timeit way2(0) # success with large list

The slowest run took 7.39 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 4.01 µs per loop

my_list = my_list[:10] # now make it a short list
print(my_list[-1]) # what is the last value

-37

# Run the same stuff again against the smaller list, showing that it is
# much faster but still way1 and way2 have no significant differences
%timeit way1(101) # failure with small list
%timeit way1(-37) # success with small list
%timeit way2(101) # failure with small list
%timeit way2(-37) # success with small list

The slowest run took 18.75 times longer than the fastest. This could mean that an intermediate result is being cached.
1000000 loops, best of 3: 417 ns per loop
The slowest run took 13.00 times longer than the fastest. This could mean that an intermediate result is being cached.
1000000 loops, best of 3: 403 ns per loop
The slowest run took 5.08 times longer than the fastest. This could mean that an intermediate result is being cached.
1000000 loops, best of 3: 427 ns per loop
The slowest run took 4.86 times longer than the fastest. This could mean that an intermediate result is being cached.
1000000 loops, best of 3: 386 ns per loop

# run the same again to get an idea of variability between runs so we can
# be sure that way1 and way2 have no significant differences
%timeit way1(101) # failure with small list
%timeit way1(-37) # success with small list
%timeit way2(101) # failure with small list
%timeit way2(-37) # success with small list

The slowest run took 8.57 times longer than the fastest. This could mean that an intermediate result is being cached.
1000000 loops, best of 3: 406 ns per loop
The slowest run took 4.79 times longer than the fastest. This could mean that an intermediate result is being cached.
1000000 loops, best of 3: 412 ns per loop
The slowest run took 4.90 times longer than the fastest. This could mean that an intermediate result is being cached.
1000000 loops, best of 3: 412 ns per loop
The slowest run took 4.56 times longer than the fastest. This could mean that an intermediate result is being cached.
1000000 loops, best of 3: 398 ns per loop
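Outside a notebook, a similar comparison can be sketched with `timeit.repeat()`. The helper `best_time` and the repetition counts are assumptions of this sketch, and `way1` and `way2` are simplified to return the boolean directly instead of calling `func1` and `func2`.

```python
import random
import timeit

random.seed(456789)
my_list = [random.randint(-100, 100) for _ in range(10000)]


def way1(x):
    return True if x in my_list else False


def way2(x):
    return False if x not in my_list else True


def best_time(func, arg, number=1000, repeat=3):
    """Return the best per-call time in seconds over several runs."""
    runs = timeit.repeat(lambda: func(arg), number=number, repeat=repeat)
    return min(runs) / number


for label, arg in [("failure", 101), ("success", 0)]:
    t1 = best_time(way1, arg)
    t2 = best_time(way2, arg)
    print(f"{label}: way1 {t1 * 1e6:.2f} µs, way2 {t2 * 1e6:.2f} µs")
```

Taking the minimum over several runs reduces the influence of other activity on the machine, which is the same idea as the "best of 3" figure that %timeit reports above.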

Answer 4

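A desirable property in a software implementation is low coupling. Your implementation should not be shaped by the way the Python interpreter happens to test list membership, because that is a high degree of coupling. The interpreter's implementation may change, and the faster form may then no longer be the faster one.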

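What we should care about in this case is that testing membership in a list is linear in the size of the list. If you need a faster membership test, you can use a set.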
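For a list of strings and integers like the one in the question, switching to a set is a one-line change. The sketch below also touches on the question about more complex types: dicts are unhashable, so they cannot be set elements, and a list of dicts still has to be searched linearly. The sample data is made up.

```python
my_list = ["spam", 42, "eggs", 7]

# Build the set once, then reuse it for every membership test.
members = set(my_list)
print("eggs" in members)
print(99 in members)

# dicts are mutable and therefore unhashable, so they cannot be set elements.
try:
    set([{"a": 1}])
except TypeError as exc:
    print("cannot build the set:", exc)

# A list still works for dicts; membership falls back to == comparisons.
records = [{"a": 1}, {"b": 2}]
print({"a": 1} in records)
```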

Answer 1: score 4, accepted, 2017-10-02 05:16:02
Answer 2: score 2, 2017-10-02 05:20:23
Answer 3: score 1, 2017-10-02 06:06:46
Answer 4: score 0, 2017-10-02 05:23:43
