Python: given a query string, find the set of strings with the same beginning
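Edit: I appreciate all the answers, but can anyone tell me why my solution does not work? I would like to try doing this without .startswith(), thanks!
I am trying to complete this exercise:
Implement an autocomplete system. That is, given a query string s and a set of all possible query strings, return all strings in the set that have s as a prefix. For example, given the query string de and the set of strings [dog, deer, deal], return [deer, deal]. Hint: Try preprocessing the dictionary into a more efficient data structure to speed up queries.
But I get an empty list. What could I be doing wrong? I thought this would give me [deer, deal].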
def autocomplete(string,set):
    string_letters = []
    letter_counter = 0
    list_to_return = []
    for letter in string:
        string_letters.append(letter)
    for words in set:
        for letter in words:
            if letter_counter == len(string):
                list_to_return.append(words)
            if letter == string_letters[letter_counter]:
                letter_counter += 1
            else:
                break
    return list_to_return
print(autocomplete("de", ["dog","deer","deal"]))
output:
[]
Here is how I would accomplish what you are trying to do:
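import re
strings = ['dog', 'deer', 'deal']
search = 'de'
# re.escape keeps regex metacharacters in the search string literal;
# pattern.match() already anchors at the start of each string.
pattern = re.compile(re.escape(search))
print([x for x in strings if pattern.match(x)])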
Result: ['deer', 'deal']
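One caveat with building a pattern from the search string: if the query can contain regex metacharacters (for example a dot or a parenthesis), they would be interpreted as pattern syntax unless escaped. The following is a minimal sketch of that idea, assuming a hypothetical helper named prefix_matches that is not part of the original answer:

```python
import re

def prefix_matches(search, strings):
    # Hypothetical helper for illustration only.
    # re.escape makes characters such as '.' or '(' match literally.
    # pattern.match() only matches at the start of the string,
    # so no explicit '^' anchor is needed.
    pattern = re.compile(re.escape(search))
    return [s for s in strings if pattern.match(s)]

print(prefix_matches("de", ["dog", "deer", "deal"]))
print(prefix_matches("d.", ["dog", "d.o", "deal"]))
```

Without the escaping, the query "d." would match every three-letter-or-longer word starting with d, because the dot matches any character.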
However, for a use case like this you will most likely want to ignore the case of both the search string and the strings being searched.
import re
strings = ['dog', 'Deer', 'deal']
search = 'De'
# re.IGNORECASE makes 'De' match 'Deer' as well as 'deal'.
pattern = re.compile(re.escape(search), re.IGNORECASE)
print([x for x in strings if pattern.match(x)])
Result: ['Deer', 'deal']
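The same case-insensitive comparison can probably be done without a regex and without .startswith(). This is only a sketch, assuming a hypothetical function name autocomplete_ignore_case; it folds the case of both sides with str.casefold() and compares a slice of each word against the query:

```python
def autocomplete_ignore_case(query, words):
    # Hypothetical helper for illustration only.
    folded_query = query.casefold()
    n = len(folded_query)
    # Compare the first n characters of each case-folded word to the query.
    return [w for w in words if w.casefold()[:n] == folded_query]

print(autocomplete_ignore_case("De", ["dog", "Deer", "deal"]))
```

The original spelling of each word is preserved in the result; only the comparison ignores case.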
To answer the part about why your code does not work, it helps to add some verbose output to the code:
def autocomplete(string,set):
    string_letters = []
    letter_counter = 0
    list_to_return = []
    for letter in string:
        string_letters.append(letter)
    for word in set:
        print(word)
        for letter in word:
            print(letter, letter_counter, len(string))
            if letter_counter == len(string):
                list_to_return.append(word)
            if letter == string_letters[letter_counter]:
                letter_counter += 1
            else:
                print('hit break')
                break
    return list_to_return
print(autocomplete("de", ["dog","deer","deal"]))
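Output (as printed by Python 2; Python 3 prints each traced line without the tuple parentheses, e.g. d 0 2):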
dog
('d', 0, 2)
('o', 1, 2)
hit break
deer
('d', 1, 2)
hit break
deal
('d', 1, 2)
hit break
[]
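As the output shows, for dog the d matches but the o does not, which leaves letter_counter at 1. Then for deer the first letter d is compared against e ('d' != 'e') and the loop breaks, and the same thing happens for every following word, because letter_counter is never reset. Note also that once letter_counter reaches len(string), the lookup string_letters[letter_counter] is out of range: calling the function with only ['deer'] appends the word and then raises IndexError. To fix this, reset letter_counter inside the for loop over the words, and add an extra break so the index is never advanced past the end of the query.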
def autocomplete(string,set):
    string_letters = []
    list_to_return = []
    for letter in string:
        string_letters.append(letter)
    for word in set:
        # Reset letter_counter, as it is only relevant to this word.
        letter_counter = 0
        print(word)
        for letter in word:
            print(letter, letter_counter, len(string))
            if letter == string_letters[letter_counter]:
                letter_counter += 1
            else:
                # We did not match, so break early.
                break
            if letter_counter == len(string):
                # We matched all letters of the query: append and break.
                list_to_return.append(word)
                break
    return list_to_return
print(autocomplete("de", ["dog","deer","deal"]))
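For comparison, the same letter-by-letter idea can be written without a shared counter at all, which avoids the reset problem by construction. This is a sketch only; the names has_prefix and autocomplete_no_startswith are hypothetical and do not come from the question:

```python
def has_prefix(word, prefix):
    # A word shorter than the prefix can never match.
    if len(prefix) > len(word):
        return False
    # Compare position by position; the index is local to this call,
    # so nothing carries over from one word to the next.
    for i, ch in enumerate(prefix):
        if word[i] != ch:
            return False
    return True

def autocomplete_no_startswith(prefix, words):
    return [word for word in words if has_prefix(word, prefix)]

print(autocomplete_no_startswith("de", ["dog", "deer", "deal"]))
```

A shorter equivalent would be the slice comparison word[:len(prefix)] == prefix, which also avoids .startswith().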
I noticed the hint, but it is not stated as a requirement, so:
def autocomplete(string,set):
    return [s for s in set if s.startswith(string)]
print(autocomplete("de", ["dog","deer","deal"]))
str.startswith(n) returns a boolean value: True if str starts with n, and False otherwise.
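The hint in the exercise suggests preprocessing the dictionary into a more efficient data structure. One common choice for prefix queries is a trie (prefix tree). The following is a minimal sketch under that assumption, not something taken from the answers here; the names END, build_trie and autocomplete_trie are hypothetical:

```python
END = "<end>"  # multi-character key, so it cannot collide with a single letter

def build_trie(words):
    # Each node is a dict mapping a character to a child node.
    root = {}
    for word in words:
        node = root
        for ch in word:
            node = node.setdefault(ch, {})
        node[END] = word  # mark the end of a complete word
    return root

def autocomplete_trie(root, prefix):
    # Walk down the trie along the prefix; give up if a letter is missing.
    node = root
    for ch in prefix:
        if ch not in node:
            return []
        node = node[ch]
    # Collect every complete word stored below the node we reached.
    results = []
    stack = [node]
    while stack:
        current = stack.pop()
        for key, child in current.items():
            if key == END:
                results.append(child)
            else:
                stack.append(child)
    return sorted(results)

trie = build_trie(["dog", "deer", "deal"])
print(autocomplete_trie(trie, "de"))
```

The trie is built once, and each query then only visits the nodes along the prefix plus the matching subtree instead of scanning every word. The results are returned sorted here, so the order can differ from the input order.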
You can just use the startswith string method and avoid all of those counters, like this:
def autocomplete(string, set):
    list_to_return = []
    for word in set:
        if word.startswith(string):
            list_to_return.append(word)
    return list_to_return
print(autocomplete("de", ["dog","deer","deal"]))
Simplified:
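def autocomplete(string, set):
    back = []
    for elem in set:
        # Compare against the whole query; checking only string[0]
        # would also return "dog" and "dasd" for the query "de".
        if elem.startswith(string):
            back.append(elem)
    return back
print(autocomplete("de", ["dog","deer","deal","not","this","one","dasd"]))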
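Another way to act on the preprocessing hint, sketched here as an assumption rather than as part of any answer above, is to sort the words once and use a binary search to jump to the first candidate. In sorted order, all words sharing a prefix sit next to each other. The names build_index and autocomplete_sorted are hypothetical:

```python
import bisect

def build_index(words):
    # Preprocessing step: sort once, query many times.
    return sorted(words)

def autocomplete_sorted(index, prefix):
    # bisect_left finds the first position whose word is >= prefix.
    start = bisect.bisect_left(index, prefix)
    results = []
    for word in index[start:]:
        # Matching words are contiguous, so stop at the first mismatch.
        if word[:len(prefix)] != prefix:
            break
        results.append(word)
    return results

index = build_index(["dog", "deer", "deal", "not", "this", "one", "dasd"])
print(autocomplete_sorted(index, "de"))
```

Results come back in sorted order rather than input order.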