Fastest way to check if a string can be created with a list of characters in Python
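I need to check whether a string can be created from a list of characters and return True or False.
I have been trying different solutions using list.count or collections.Counter.
I am also using this solution, which does not need to read through the whole list of characters: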
def check(message, characters):
    try:
        [list(characters).remove(m) for m in message]
        return True
    except:
        return False
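Is there a faster way? I need it for a very, very large list of characters. Counter and list.count seem to be slower. I don't know whether there is a fast, Pythonic way to do this.
Example: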
message = "hello"
characters = "hheellooasdadsfgfdgfdhgfdlkgkfd"
check(message, characters) # this should return True or False
# characters can be a veeeeery long string
Duplicates matter: for example, characters = "hheloo" would not work for message = "hello".
You can use collections.Counter(). Just build two counters and use the subtract() method to check whether any count goes negative:
>>> from collections import Counter
>>> c1 = Counter(characters)
>>> c2 = Counter(message)
>>> c1.subtract(c2)
>>> all(v >= 0 for v in c1.values())
False
This should work in linear time.
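The same idea can be wrapped in a single function. This is a minimal sketch, not the answerer's code: the name can_build is hypothetical, and it relies on the fact that Counter subtraction with the - operator keeps only positive counts.

```python
from collections import Counter


def can_build(message, characters):
    """Return True if message can be assembled from characters.

    can_build is a hypothetical helper name used for illustration.
    """
    # Counter.__sub__ drops zero and negative counts, so whatever is left
    # is exactly the set of characters that message needs but lacks.
    missing = Counter(message) - Counter(characters)
    return not missing


if __name__ == "__main__":
    print(can_build("hello", "hheellooasdadsfgfdgfdhgfdlkgkfd"))
    print(can_build("hello", "hheloo"))
```

Both counters are built in one pass over their input, so the total work grows with len(message) + len(characters).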
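This is not feasible in linear time, since the lengths of both strings matter and they need to be iterated for each character. Without having checked its actual implementation, I assume remove() is logarithmic. (For a built-in Python list, remove() is in fact a linear scan, so the loop below can take up to len(msg) * len(chars) steps.)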
def check(msg, chars):
    c = list(chars)  # Creates a copy
    try:
        for m in msg:
            c.remove(m)
    except ValueError:
        return False
    return True

if __name__ == '__main__':
    print(check('hello', 'ehlo'))
    print(check('hello', 'ehlol'))
    print(check('hello', 'ehloijin2oinscubnosinal'))
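When characters is very long and message is short, one possible variation is to count only the message and then walk through characters once, stopping as soon as every needed character has been found. This is a sketch under that assumption, not part of the answer above; the name can_build_streaming is hypothetical.

```python
from collections import Counter


def can_build_streaming(message, characters):
    """Scan characters once and stop early (hypothetical helper name)."""
    needed = Counter(message)   # how many of each character are still missing
    remaining = len(message)    # total number of characters still missing
    if remaining == 0:
        return True
    for ch in characters:
        # Counter returns 0 for unknown keys without inserting them.
        if needed[ch] > 0:
            needed[ch] -= 1
            remaining -= 1
            if remaining == 0:
                return True     # everything found, skip the rest of the input
    return False


if __name__ == "__main__":
    print(can_build_streaming("hello", "ehlo"))
    print(can_build_streaming("hello", "ehlol"))
    print(can_build_streaming("hello", "ehloijin2oinscubnosinal"))
```

Because it only iterates, characters can be any iterable, including a generator that yields characters from a large file.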
Here is another solution, compared against eugene's solution and jbndlr's solution.
import collections
import time

def test1(input_word, alphabet):
    # Set-based check: ignores how many times each character occurs
    alp_set = set(list(alphabet))
    in_set = set(list(input_word))
    return in_set.issubset(alp_set)

def test2(input_word, alphabet):
    # eugene's Counter-based check
    c1 = collections.Counter(alphabet)
    c2 = collections.Counter(input_word)
    c1.subtract(c2)
    return all(v >= 0 for v in c1.values())

def check(msg, chars):
    # jbndlr's list.remove-based check
    c = list(chars)  # Creates a copy
    try:
        for m in msg:
            c.remove(m)
    except ValueError:
        return False
    return True

input_word = "hello"
alphabet = "hheellooasdadsfgfdgfdhgfdlkgkfd"

start_time = time.time()
for i in range(10000):
    test1(input_word, alphabet)
print("--- %s seconds ---" % (time.time() - start_time))

start_time = time.time()
for i in range(10000):
    test2(input_word, alphabet)
print("--- %s seconds ---" % (time.time() - start_time))

start_time = time.time()
for i in range(10000):
    check(input_word, alphabet)
print("--- %s seconds ---" % (time.time() - start_time))
>> --- 0.03100299835205078 seconds ---
>> --- 0.24402451515197754 seconds ---
>> --- 0.022002220153808594 seconds ---
⇒ jbndlr's solution is the fastest, for this test case.
Another test case:
input_word = "hellohellohellohellohellohellohellohellohellohellohellohellohello"
alphabet =
“”
>> --- 0.21964788436889648 seconds ---
>> --- 0.518169641494751 seconds ---
>> --- 1.3148927688598633 seconds ---
⇒ test1 is the fastest
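One caveat when reading these timings: test1 compares sets, so it does not count how often each character occurs, while the question states that duplicates matter. The sketch below illustrates the difference using the "hheloo" example from the question; subset_check and count_check are hypothetical names that mirror test1 and test2.

```python
from collections import Counter


def subset_check(input_word, alphabet):
    # Mirrors test1: only asks whether each distinct character is present.
    return set(input_word).issubset(set(alphabet))


def count_check(input_word, alphabet):
    # Mirrors test2: also requires enough copies of each character.
    counts = Counter(alphabet)
    counts.subtract(Counter(input_word))
    return all(v >= 0 for v in counts.values())


if __name__ == "__main__":
    # "hheloo" contains only one "l", but "hello" needs two.
    print(subset_check("hello", "hheloo"))  # True
    print(count_check("hello", "hheloo"))   # False
```

So the set-based version answers a weaker question, which is part of why it can be faster.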
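There is a faster way to do this, apparently because of the cost of creating the generator for all() (see "Why is Python's 'all' function so slow?"). A plain for loop may be faster. Extending @eugene y's answer: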
from collections import Counter
import time

message = "hello"
characters = "hheeooasdadsfgfdgfdhgfdlkgkfd"

def check1(message, characters):
    c1 = Counter(characters)
    c2 = Counter(message)
    c1.subtract(c2)
    return all(v > -1 for v in c1.values())

def check2(message, characters):
    c1 = Counter(characters)
    c2 = Counter(message)
    c1.subtract(c2)
    for v in c1.values():
        if v < 0:
            return False
    return True

st = time.time()
for i in range(350000):
    check1(message, characters)
end = time.time()
print("all(): " + str(end - st))

st = time.time()
for i in range(350000):
    check2(message, characters)
end = time.time()
print("for loop: " + str(end - st))
Results:
all(): 5.201688051223755
for loop: 4.864434719085693
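Differences this small can be affected by noise from a single time.time() measurement. As a possible cross-check, the standard timeit module can repeat each measurement and keep the best run. This is only a sketch: check_all, check_loop and best_time are hypothetical names, and the repeat counts are arbitrary.

```python
import timeit
from collections import Counter


def check_all(message, characters):
    c = Counter(characters)
    c.subtract(Counter(message))
    return all(v >= 0 for v in c.values())


def check_loop(message, characters):
    c = Counter(characters)
    c.subtract(Counter(message))
    for v in c.values():
        if v < 0:
            return False
    return True


def best_time(func, message, characters, number=2000, repeat=5):
    """Return the fastest of `repeat` runs, each making `number` calls."""
    timer = timeit.Timer(lambda: func(message, characters))
    return min(timer.repeat(repeat=repeat, number=number))


if __name__ == "__main__":
    message = "hello"
    characters = "hheeooasdadsfgfdgfdhgfdlkgkfd"
    print("all():   ", best_time(check_all, message, characters))
    print("for loop:", best_time(check_loop, message, characters))
```

Taking the minimum of several repeats is the usual way to reduce the influence of other processes on the measurement.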