Searching a list of dictionaries for a key value

Question

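I have a list l = [d1, d2, ..., d100] of dictionaries, where each dictionary has the keys 'id', 'address' and 'price'. I now want to get from l all dictionaries d whose 'price' value equals 50. Is there a faster way than a for loop? This processing is already wrapped inside another for loop, so if possible I would like to avoid having two for loops. The skeleton of the function currently looks like this: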

for ... (external for loop):
    results = []
    for d in l:
        if d['price'] == 50:
            results.append(d)

Answer 1

You can use a list comprehension:

results = [d for d in l if d['price'] == 50]

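Algorithmically, this is no different from the loop (it still has to iterate over the entire list), but the comprehension is optimized at the C level in CPython, so it is typically faster. Another option is to make results a lazy iterator: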

# generator expression
results = (d for d in l if d['price'] == 50)

# filter (not the most elegant/readable with lambda)
results = filter(lambda d: d['price'] == 50, l)

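Neither of these iterates over the whole list at the point where results is defined. The iteration happens only when you iterate over results, which you can do only once. This helps if you do not always need results, or only need to iterate over part of it. (In Python 2, filter returns a list rather than a lazy iterator; itertools.ifilter is the lazy equivalent there.)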
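
As a rough sketch of why laziness can help, suppose only the first match or two is needed. The sample list l and its values below are made up for illustration; they are not from the question.

```python
from itertools import islice

# Hypothetical sample data, not from the question.
l = [
    {'id': 1, 'address': 'A St', 'price': 40},
    {'id': 2, 'address': 'B St', 'price': 50},
    {'id': 3, 'address': 'C St', 'price': 50},
    {'id': 4, 'address': 'D St', 'price': 60},
]

# Nothing is filtered yet; the generator only records what to do.
results = (d for d in l if d['price'] == 50)

# next() advances just far enough to find the first match
# (the default None is returned if there is no match at all).
first = next(results, None)

# islice() takes at most one more match from the same iterator.
more = list(islice(results, 1))

# Only the last dictionary is left to check and it does not match,
# so this is empty; the generator is exhausted afterwards.
leftover = list(results)
```

Because the generator remembers its position, each call continues where the previous one stopped instead of rescanning the list.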

Answer 2

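Unless you know something about the structure of the list (for example, that it is sorted by price, or that only three items can have that price), no algorithm can do better than O(n) (linear time). So we have to loop.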
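
To illustrate the "sorted by price" case: if the list does happen to be sorted, a binary search can locate the matching range in O(log n) comparisons. This is only a sketch under that assumption, using the standard-library bisect module; the sample data and the separate prices list are illustrative and not part of the question.

```python
from bisect import bisect_left, bisect_right

# Hypothetical data, assumed to be sorted by 'price' already.
l = [
    {'id': 3, 'price': 40},
    {'id': 1, 'price': 50},
    {'id': 4, 'price': 50},
    {'id': 2, 'price': 60},
]

# Building this key list is itself O(n), so it only pays off when the
# same list is searched many times (e.g. once per outer-loop pass).
prices = [d['price'] for d in l]

lo = bisect_left(prices, 50)   # first index with price >= 50
hi = bisect_right(prices, 50)  # first index with price > 50
results = l[lo:hi]             # all dictionaries with price == 50
```

Without such knowledge about the list, this shortcut is not available and a full scan is needed.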

List comprehension

We can use a list comprehension like this:

[d for d in l if d.get('price') == 50]

(This also filters out dictionaries that have no 'price' key.)
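
A small sketch of the difference between d.get('price') and d['price'], using made-up data in which one dictionary lacks the key:

```python
# Hypothetical data: the second dictionary has no 'price' key.
l = [
    {'id': 1, 'price': 50},
    {'id': 2},
    {'id': 3, 'price': 70},
]

# d.get('price') returns None for the missing key, and None != 50,
# so that dictionary is simply skipped.
with_get = [d for d in l if d.get('price') == 50]

# d['price'] raises KeyError as soon as it reaches the dictionary
# without the key, so no result list is produced at all.
try:
    with_index = [d for d in l if d['price'] == 50]
except KeyError:
    with_index = None
```

If every dictionary is guaranteed to have the key, as in the question, both forms return the same list.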

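Pandas

We can also use pandas. Pandas is an efficient library for working with DataFrames, and for large amounts of data it tends to outperform Python loops. In this case we can load the dictionaries into a DataFrame, filter it, and then retrieve a list of dictionaries. Note that these will be different dictionaries (that is, other objects containing the same data), so the data is "copied".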

import pandas as pd
df = pd.DataFrame(l)
result = list(df[df.price == 50].T.to_dict().values())

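So here we filter with df.price == 50. Note that behind the scenes some looping is still needed to do the filtering.

This is also a more declarative approach: the code says what it does rather than how. How pandas performs the filtering is not your concern, and the syntax makes it plain that you are filtering data.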

Answer 3

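tl;dr: you can't go wrong with a list comprehension

I explored the following methods:

1. Basic for loop
2. List comprehension with an if condition
3. Built-in filter function with a lambda expression
4. List comprehension over a generator expression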

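These methods were tested on Python 2.7.12 and Python 3.5.2 (not the latest versions). It appears that in Python 2 the fastest is method 4, while in Python 3 the fastest is method 2 (at least for my versions, which, again, are not the latest).

Here are the results for Python 2.7.12: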

# 2.7.12
# [GCC 5.4.0 20160609]
# Method 1 found 496 item in 0.382161 seconds. (basic for-loop)
# Method 2 found 496 item in 0.365456 seconds. (list comprehension)
# Method 3 found 496 item in 0.565614 seconds. (built in filter function)
# Method 4 found 496 item in 0.273335 seconds. (list comprehension over a generator expression)

Here are the results for Python 3.5.2:

# 3.5.2 
# [GCC 5.4.0 20160609]
# Method 1 found 493 item in 0.500266 seconds. (basic for-loop)
# Method 2 found 493 item in 0.338361 seconds. (list comprehension)
# Method 3 found 493 item in 0.796027 seconds. (built in filter function)
# Method 4 found 493 item in 0.351668 seconds. (list comprehension over a generator expression)

Here is the code used to obtain the results:

import time
import random
import sys

print(sys.version)

l = []
for i in range(10000):
    d = {'price': random.randint(40, 60), 'id': i}
    l.append(d)

#METHOD 1 - basic for-loop
start = time.time()
for _ in range(1000):
    results = []
    for d in l:
        if d['price'] == 50:
            results.append(d)
end = time.time()
print("Method 1 found {} item in {:f} seconds. (basic for-loop)".format(len(results), (end - start)))

#METHOD 2 - list comp with if statement
start = time.time()
results = []
for _ in range(1000):
    results = []
    results = [d for d in l if d['price'] == 50]
end = time.time()
print("Method 2 found {} item in {:f} seconds. (list comprehension)".format(len(results), (end - start)))

#METHOD 3 - using filter and a lambda expression
start = time.time()
results = []
for _ in range(1000):
    results = []
    results = list(filter(lambda d: d['price'] == 50, l))
end = time.time()
print("Method 3 found {} item in {:f} seconds. (built in filter function)".format(len(results), (end - start)))

#METHOD 4 - list comp over generator expression
start = time.time()
results = []
for _ in range(1000):
    results = []
    genResults = (d for d in l if d['price'] == 50)
    results = [it for it in genResults]
end = time.time()
print("Method 4 found {} item in {:f} seconds. (list comprehension over a generator expression)".format(len(results), (end - start)))
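
The script above times each method with time.time() around a manual loop. As an alternative sketch, the same four methods can be compared with the standard-library timeit module. The function names, the fixed seed, the list size and the repeat count below are arbitrary choices for illustration and are smaller than in the script above, so the numbers will not be comparable to the reported timings.

```python
import random
import timeit

random.seed(0)  # fixed seed so every run sees the same data
l = [{'price': random.randint(40, 60), 'id': i} for i in range(1000)]

def for_loop():
    # Method 1: basic for loop
    results = []
    for d in l:
        if d['price'] == 50:
            results.append(d)
    return results

def list_comp():
    # Method 2: list comprehension
    return [d for d in l if d['price'] == 50]

def filter_lambda():
    # Method 3: built-in filter with a lambda
    return list(filter(lambda d: d['price'] == 50, l))

def over_generator():
    # Method 4: list comprehension over a generator expression
    gen = (d for d in l if d['price'] == 50)
    return [it for it in gen]

timings = {}
for method in (for_loop, list_comp, filter_lambda, over_generator):
    # timeit calls the function `number` times and returns total seconds.
    timings[method.__name__] = timeit.timeit(method, number=100)
    print("{}: {:f} seconds".format(method.__name__, timings[method.__name__]))
```

Timings depend on the interpreter version and the machine, so the ranking may differ from the results listed above.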

Answer 1: 3, 2018-01-07 16:08:20
Answer 2: 3, accepted, 2018-01-07 16:15:36
Answer 3: 2, 2018-01-07 18:50:40
