Fastest way to compare a list to a dict of lists
So I have two lists:
list1 = ['abc', 'efg', 'hijk'] #list of strings
list2 = ['lmno', 'pqrs'] #also a list of strings
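Then I have a dictionary that is usually very large: it has only about 100 keys, but its lists are populated with hundreds of thousands of string values.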
d = {'abc': ['lmno'], 'efg': ['lmno', 'pqrs']}
So I need to iterate over every item of list1 together with every item of list2:
Example:
for i1 in list1:
    for i2 in list2:
        print(i1, i2)
Then I compare the data against the dictionary:
for i1 in list1:
    for i2 in list2:
        if i1.lower() in d:
            if i2 in d[i1.lower()]:
                continue  # ignore
            else:
                pass  # process data
Currently my code looks like the above, but it is very slow when the dict is large. Is there a faster way?
for i1 in list1:
    for i2 in list2:
        if i1.lower() in d:
            if i2 in d[i1.lower()]:
                continue  # ignore
            else:
                pass  # process data
Swap the second and third lines; then list2 is not iterated at all when i1.lower() is not in d.
for i1 in list1:
    if i1.lower() in d:
        for i2 in list2:
            if i2 in d[i1.lower()]:
                continue  # ignore
            else:
                pass  # process data
Also, as @aran-fey mentioned, first convert d into a dict of sets:
d = {k: set(v) for k, v in d.items()}
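The reason this helps is that `in` on a list scans the elements one by one, while `in` on a set is a hash lookup. A rough way to see the difference is the timing sketch below; the data (`values_list`, `needle`) is made up for illustration, and the actual numbers will depend on your machine and data.

```python
import timeit

# Hypothetical data: one large value list, similar in size to those in the question.
values_list = [f"item{n}" for n in range(200_000)]
values_set = set(values_list)
needle = "item199999"  # worst case for the list: the last element

# Time 20 membership tests against each container.
list_time = timeit.timeit(lambda: needle in values_list, number=20)
set_time = timeit.timeit(lambda: needle in values_set, number=20)

print(f"list: {list_time:.6f}s, set: {set_time:.6f}s")
```

Keep in mind that building the sets is itself a full pass over every value, so the conversion probably pays off mainly when d is reused for many lookups or list2 is long.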
Going a step further (thanks to @AlexHall):
d = {k: set(v) for k, v in d.items()}
set2 = {i2.lower() for i2 in list2}
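for i1 in list1:
    # Note: a key missing from d yields every item of set2 here,
    # and the items of list2 are compared in lowercase.
    for i2 in set2 - d.get(i1.lower(), set()):
        pass  # process data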
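For reference, here is a self-contained sketch that combines the suggestions in this answer while keeping the behaviour of the nested-if loops: keys missing from d are skipped, and the items of list2 are compared as they are, in their original order. The function name `pairs_to_process` is made up for this example.

```python
def pairs_to_process(list1, list2, d):
    """Yield (i1, i2) pairs whose i2 is not already listed under i1.lower() in d."""
    # Convert each value list to a set once, so membership tests are hash lookups.
    lookup = {key: set(values) for key, values in d.items()}
    for i1 in list1:
        seen = lookup.get(i1.lower())
        if seen is None:
            # Key missing from d: skip it, as the nested-if loops do.
            continue
        for i2 in list2:
            if i2 not in seen:
                yield i1, i2


list1 = ['abc', 'efg', 'hijk']
list2 = ['lmno', 'pqrs']
d = {'abc': ['lmno'], 'efg': ['lmno', 'pqrs']}

result = list(pairs_to_process(list1, list2, d))
print(result)  # [('abc', 'pqrs')]
```

Writing it as a generator keeps the filtering separate from whatever "process data" ends up doing; the caller just loops over the pairs.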
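I guess you have two lists, one holding keys and the other holding values. You need to check whether the key is in the dictionary before iterating over the values, which makes this more efficient.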
for i1 in list1:
    if i1.lower() in d:
        for i2 in list2:
            if i2 in d[i1.lower()]:
                continue  # ignore
            else:
                pass  # process data
This may not be the fastest; you would have to check. It is neater, though.
keys_to_check = [
    'abc', 'efg', 'hijk'
]
strings_to_check = [
    'lmno', 'pqrs'
]
d = {
    'abc': ['lmno'],
    'efg': ['lmno', 'pqrs']
}
# Collect the value lists for the keys that are present in the dictionary
values = [d[key.lower()] for key in keys_to_check if key.lower() in d]
for value in values:
    # Check whether any of the strings within value is in strings_to_check;
    # if so, ignore that value
    if any(strng in strings_to_check for strng in value):
        continue
    else:
        pass  # process data